Big data project for Insurance
About the Client
Our Client strategically deploys capital generated by its €900 million asset portfolio, which consists primarily of insurance services. Its objective is to efficiently analyze historical and current data from a wide range of sources so that it can help its own clients achieve their investment goals. The company allocates investments across both traditional and alternative asset classes, using a wide array of exposures to maximize returns and spread risk.
Complication
The Client's existing system struggled to execute the many jobs required to generate reports, both for regulatory purposes and for the analytical teams. The Client also had trouble managing the extensive computations needed to keep up with rapidly evolving requirements and specifications. As a result, they saw room to make decision-making around their €900 million asset portfolio more efficient. By partnering with Datumo, the Client benefits from faster report generation, cost savings, simplified platform management, and improved stability. These advantages contribute to a more efficient decision-making process and to the optimization of their asset portfolio.
The Value We Delivered
- Reduced Computation Time: By collaborating with Datumo, the Client substantially reduced the time required to acquire financial data from external providers, from as much as 6 hours to just a few minutes. Some core processing tasks were also optimized, cutting the time and resources they consume by 95%. This improvement enables a more flexible and efficient decision-making process.
- Cost Reduction in Cloud Services: The collaboration resulted in significant savings on cloud services, helping the Client reduce their expenses.
- Simplified Platform Management: The enhanced platform makes it simpler to introduce new features and is easier to manage. This allows for smoother integration of updates and enhancements.
- Decreased Platform Errors: The number of platform errors fell by 87% during the collaboration with Datumo, making the platform more reliable and stable.
Innovative solutions and advanced technologies
The Client's process for turning data from various sources into valuable investment insights involves multiple steps. Many API fetchers periodically acquire data about issuers and securities from a diverse range of providers. Datumo developers significantly improved those fetchers by introducing proven solutions such as asynchronous HTTP requests and better error and retry handling, and by making the code more generic so that new integrations can be added easily.
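The fetcher improvements described above can be illustrated with a minimal sketch. It assumes Python and its standard `asyncio` library, and it replaces real HTTP calls with simulated providers so that it runs without network access. All names here (`with_retry`, `make_flaky_provider`, `fetch_all`, `TransientError`, the provider names and security ids) are hypothetical and are not taken from the Client's code base.

```python
import asyncio


class TransientError(Exception):
    """Stands in for a timeout or HTTP 5xx response (assumption)."""


async def with_retry(call, *, attempts=3, base_delay=0.01):
    """Await call(); on TransientError, retry with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except TransientError:
            if attempt == attempts:
                raise  # give up after the last allowed attempt
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_provider(name, failures):
    """Simulated provider: fails `failures` times per security id, then succeeds."""
    seen = {}

    async def fetch(security_id):
        seen[security_id] = seen.get(security_id, 0) + 1
        await asyncio.sleep(0)  # yield control, as real network I/O would
        if seen[security_id] <= failures:
            raise TransientError(f"{name}: temporary failure")
        return {"provider": name, "id": security_id}

    return fetch


async def fetch_all(providers, security_ids):
    """Run every (provider, security id) request concurrently."""
    tasks = [
        with_retry(lambda f=fetch, s=sid: f(s))
        for fetch in providers.values()
        for sid in security_ids
    ]
    # gather returns results in the same order as the tasks list
    return await asyncio.gather(*tasks)


def run_demo():
    # Adding a new integration only means registering another fetcher here.
    providers = {
        "alpha": make_flaky_provider("alpha", failures=0),
        "beta": make_flaky_provider("beta", failures=2),
    }
    return asyncio.run(fetch_all(providers, ["SEC1", "SEC2"]))
```

Keeping the retry logic in one wrapper and the providers in a plain dictionary is one way to get the generic structure the text mentions: each new data source supplies a single coroutine and inherits the same concurrency and error handling.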
Once data is loaded into the data lake, Spark jobs perform multiple computations. Because Databricks is the main service of the platform and Delta Lake is the storage layer, the medallion architecture is used as the design pattern for organizing data. Many periodic ETL jobs transform the data, while other applications produce the reports required for business intelligence and regulatory compliance. All of these processes are implemented as Spark jobs written in Scala. As Datumo experts are Databricks aficionados, we optimized both computation and storage performance.
We started by identifying bottlenecks and computationally heavy operations in the Spark jobs we maintained. With our domain knowledge, we were able to redesign the implementation in accordance with Spark and Databricks best practices. Our Spark and Scala experience also allowed us to refactor the code into a more generic and accessible form.
Through improvements such as removing unnecessary Spark actions, replacing full table scans with smart use of Delta Lake table metadata and auxiliary tables, and taking advantage of Spark caching, we reduced the resource consumption of some ETL jobs by 98% and cut their execution time by 95%.
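To show why metadata can replace a full table scan, here is a toy model in plain Python. It is not the Delta Lake or Spark API and not the Client's Scala code. It only mirrors the general idea that Delta Lake records per-file statistics, such as minimum and maximum column values, which lets a query skip files that cannot contain matching rows. The `FileStats` class, the file paths, and the sample rows are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileStats:
    """Hypothetical stand-in for one data file plus its min/max metadata."""
    path: str
    min_date: str  # ISO dates compare correctly as plain strings
    max_date: str
    rows: List[dict]


def full_scan(files, date):
    """Read every file and filter afterwards."""
    scanned, hits = 0, []
    for f in files:
        scanned += 1
        hits.extend(r for r in f.rows if r["date"] == date)
    return hits, scanned


def pruned_scan(files, date):
    """Use min/max metadata to skip files that cannot contain the date."""
    scanned, hits = 0, []
    for f in files:
        if not (f.min_date <= date <= f.max_date):
            continue  # file skipped without reading its rows
        scanned += 1
        hits.extend(r for r in f.rows if r["date"] == date)
    return hits, scanned


def demo_files():
    return [
        FileStats("part-jan", "2024-01-01", "2024-01-31",
                  [{"date": "2024-01-10", "security": "A"}]),
        FileStats("part-feb", "2024-02-01", "2024-02-29",
                  [{"date": "2024-02-15", "security": "B"},
                   {"date": "2024-02-20", "security": "C"}]),
        FileStats("part-mar", "2024-03-01", "2024-03-31",
                  [{"date": "2024-03-05", "security": "D"}]),
    ]
```

Both functions return the same rows for a given date, but `pruned_scan` opens only the files whose date range covers it. On real tables with thousands of files, that difference is where savings of the kind described above can come from.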