Case studies

Building a Scalable ETL Solution with Serverless Framework and Data Ingestion to Reduce Asset Onboarding Time and Increase Efficiency

About the client

Our client is a leading artificial intelligence company that offers innovative AI solutions to various industries, including aerospace, defense, energy, and finance.

Their solutions are designed to help businesses improve efficiency, reduce costs, and enhance their overall performance.


The client has an AI-powered platform that provides predictive maintenance solutions for industrial assets. These assets are sections within an Oil & Gas plant that manage input, flow, and processing. The data coming from these assets had to be integrated into the client's ETL software suite. However, the process of adding data assets to the ETL suite was time-consuming and not scalable.

Business Challenge

The client's Asset ETL Suite was composed of redundant code and ineffective pipelines built on a serverless architecture. Onboarding any new asset to the ETL suite required coding its dependencies from scratch. In addition, there was no way to import time-series data, either from CSV files or from sensor feeds, and no ability to import and expose data through various data sources.

Our Approach and Solution

The proposed solution was to decouple the code by separating the common ETL code from the Asset System code. This would enable the common ETL code to be applied to all Asset Systems while still allowing for differentiation between individual Asset Systems.

The ETL code was split, and a common utility layer was created from which the Lambda functions could inherit. This approach simplified the creation of new ETL functions for different Asset Systems.
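The inheritance pattern can be sketched as follows. This is an illustrative example only; the class and function names (`BaseEtlHandler`, `PumpSystemHandler`, `lambda_handler`) are assumptions, not the client's actual code.

```python
"""Sketch of a shared ETL utility layer inherited by Lambda handlers."""


class BaseEtlHandler:
    """Common ETL steps shared by every Asset System."""

    def extract(self, event):
        # Pull the raw payload out of the triggering event.
        return event.get("records", [])

    def transform(self, records):
        # Hook overridden by asset-specific handlers.
        return records

    def load(self, records):
        # Placeholder sink; a real handler would write to the time-series DB.
        return {"loaded": len(records)}

    def handle(self, event, context=None):
        return self.load(self.transform(self.extract(event)))


class PumpSystemHandler(BaseEtlHandler):
    """Per-asset code shrinks to just the asset-specific transform."""

    def transform(self, records):
        return [{**r, "asset": "pump"} for r in records]


def lambda_handler(event, context):
    # Entry point that would be wired up for this Asset System's function.
    return PumpSystemHandler().handle(event, context)
```

With this shape, onboarding a new asset means subclassing the base handler and overriding one method, rather than duplicating the whole pipeline.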

The team deployed the solution using the Serverless Framework, successfully deploying both the pre-processing and post-processing stages of the ETL pipeline.
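A deployment of this shape might look like the following `serverless.yml` fragment. All names here (service, buckets, handler paths, layer name) are hypothetical, not the client's actual configuration.

```yaml
# Illustrative Serverless Framework configuration (not the client's actual file)
service: asset-etl

provider:
  name: aws
  runtime: python3.9

layers:
  commonEtl:
    path: layers/common_etl   # the shared ETL utility layer

functions:
  preProcess:
    handler: handlers.pre_process.lambda_handler
    events:
      - s3:
          bucket: raw-asset-data        # hypothetical source bucket
          event: s3:ObjectCreated:*
  postProcess:
    handler: handlers.post_process.lambda_handler
```

Packaging the common code as a layer keeps each Asset System's function small and avoids redeploying shared logic with every function.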

To ensure efficient interaction with InfluxDB, a custom Python package in the form of a TSDS client was created. This client also provided features to connect to Amazon Timestream, an important component for storing and managing large volumes of time-series data. The custom package enabled other ETL services to interact with InfluxDB more easily and efficiently.
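A minimal sketch of what such a client wrapper could look like, assuming the Timestream record format used by boto3's `timestream-write` `write_records` call. The class and method names (`TsdsClient`, `to_record`) are illustrative, not the actual package's API.

```python
"""Sketch of a TSDS-style client helper for time-series writes."""
import time


class TsdsClient:
    """Formats sensor readings into Amazon Timestream record dicts.

    A real client would pass these records to boto3's timestream-write
    ``write_records`` call, and could expose similar helpers for InfluxDB.
    """

    def __init__(self, database, table):
        self.database = database
        self.table = table

    @staticmethod
    def to_record(measure, value, dimensions, ts_ms=None):
        # Shape follows the Records entries accepted by
        # timestream-write write_records.
        return {
            "MeasureName": measure,
            "MeasureValue": str(value),
            "MeasureValueType": "DOUBLE",
            "Time": str(ts_ms or int(time.time() * 1000)),
            "Dimensions": [
                {"Name": k, "Value": v} for k, v in dimensions.items()
            ],
        }


# Example: one flow-rate reading from a hypothetical pump asset
record = TsdsClient.to_record(
    "flow_rate", 42.5, {"asset": "pump-7"}, ts_ms=1700000000000
)
```

Centralising this formatting in one package means every ETL service writes records the same way instead of re-implementing the conversion.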

Overall, the solution provided an efficient and scalable way of decoupling the code and ensuring that the common ETL code could be applied across all Asset Systems. The custom Python package added a further layer of functionality for efficient interaction with Amazon Timestream and InfluxDB.

Tech Stack

Tech stack used: Python, Django, InfluxDB, AWS RDS, AWS Lambda, AWS S3, AWS IoT SiteWise, Serverless Framework.


Business Impact

We reduced code repetition and significantly cut asset onboarding time.
We provided custom configurable data sources, allowing historical time-series data to be tested.
Predictive maintenance reduced asset downtime.