Leveraging the Full Potential of the Data Lake

Samuel Jaideep

Associate Product Architect

Hari Prasad

Associate Software Developer

Data Science teams at a large biopharma organization face multiple challenges while creating data packs, which are a prerequisite for their models. Data pack creation takes a long time because there is no intuitive way to build data packs and the work depends on advanced skill sets.

Data Analysts spend a lot of time on code-driven feature engineering and exploratory data analysis because data profiles are rarely available. They also face additional challenges, such as:

  1. Pulling data from varied data sources
  2. Keeping up with the technical skills needed to build a model or generate the desired insights
  3. Maintaining and tracking end-to-end processes, from building pipelines to scheduling them
  4. Collaborating with fellow data scientists on the developed pipelines
  5. Keeping infrastructure costs low throughout the process

To address these challenges, D Cube Analytics has built DDS IRIS™, a customized product whose additional features also cover the governance and security of the data.

The DDS IRIS™ solution components are based on the following major pillars:

  1. Infrastructure & Security
  2. Data Management
  3. Collaboration

The outcomes and impact that DDS IRIS™ has delivered are listed below:

  1. Building data pipelines through an intuitive UI makes data onboarding, pipeline creation, and pipeline scheduling quicker and more efficient. The heavy lifting and processing of the data is handled by Apache Spark running on Amazon EMR; Databricks can be used as well. With DDS IRIS™, pipeline creation was 3x faster, which was a big boost for the customer

  2. Processed data was persisted in Amazon Simple Storage Service (Amazon S3) and Databricks, and data can be written to Amazon Redshift as well. DDS IRIS™ also lets users apply optimization techniques such as partitioning and file compression with just a few clicks. This meant that even users who are not tech-savvy could build efficient data wrangling pipelines. This was a big win for the client, which already had a large team of analysts specializing in the pharma domain

  3. To avoid wait time on reporting while the published layer was still under construction, we used DDS IRIS™ to export curated data sets directly to Tableau Server. This capability bypasses multiple hops and intermediate storage, making data available on the reporting server in a short time

  4. DDS IRIS™ comes bundled with a wide range of data transformation and wrangling functions to slice and dice the data as the user wants. These functions perform at scale because they push the computation down to Amazon EMR; here again, Databricks could be used too. The key differentiator was that we could build key domain-specific and client-specific computations into DDS IRIS™ for use by the broader team

  5. Users can bring in their own SQL workflows, onboard them on DDS IRIS™ in a few clicks, and share them with the desired users with ease
  6. DDS IRIS™ gives you an end-to-end picture of the entire workflow, from creation through publishing results to scheduling, all in one place
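
For readers unfamiliar with the partitioning and file compression mentioned in item 2, the idea can be illustrated with a minimal sketch in plain Python. This is not DDS IRIS™ code or its API: the function names, the `region` column, and the sample records are all hypothetical, and a real pipeline would run on Spark rather than the standard library. The sketch writes one gzip-compressed file per partition value, using the `key=value` directory naming commonly seen in data lakes.

```python
import csv
import gzip
import tempfile
from collections import defaultdict
from pathlib import Path


def write_partitioned(rows, out_dir, partition_key):
    """Write rows as gzip-compressed CSV, one directory per partition value.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)

    written = []
    for value, group in sorted(groups.items()):
        part_dir = Path(out_dir) / f"{partition_key}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "part-0000.csv.gz"
        # The partition value is encoded in the directory name,
        # so the column itself is left out of the file.
        fields = [name for name in group[0] if name != partition_key]
        with gzip.open(path, "wt", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fields, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(group)
        written.append(path)
    return written


def read_partition(out_dir, partition_key, value):
    """Read back a single partition without touching the others."""
    path = Path(out_dir) / f"{partition_key}={value}" / "part-0000.csv.gz"
    with gzip.open(path, "rt", newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


# Made-up sample records, for illustration only.
SAMPLE_ROWS = [
    {"region": "EU", "product": "A", "units": "10"},
    {"region": "US", "product": "A", "units": "7"},
    {"region": "EU", "product": "B", "units": "3"},
]

if __name__ == "__main__":
    target = tempfile.mkdtemp()
    for written_path in write_partitioned(SAMPLE_ROWS, target, "region"):
        print(written_path)
```

A query that only needs one region can then open just that directory, which is why partitioning tends to reduce the amount of data scanned, while compression reduces the storage footprint.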


D Cube Analytics is an Integrated Data Sciences company focused on extracting transformational insights from syndicated, real-world, and digital data to increase revenue realization, avert revenue loss, enhance internal productivity, and improve end-user experience for global pharmaceutical organizations.

D Cube is pioneering a Digital Transformation wave within BioPharma by leveraging new-age tools and methodologies like Artificial Intelligence, Machine Learning, and Robotic Process Automation to greatly improve workforce productivity and significantly enhance speed to insight. Through this new-age, product-based approach to delivering analytics, we greatly reduce the cost and complexity of deployments and provide measurable value across multiple business functions.

Find out how D Cube can help you to elevate your market access intelligence and develop rigorous strategies that enable success in the market, throughout the product life cycle.


Reach out to us at info@dcubeanalytics.com for information and questions.

Visit Us

D Cube Analytics
IndiQube Alpha Building
# 19/4 & 27, Ground Floor,
B2 & B3 Wings, Outer Ring Rd, Kadubeesanahalli, Panathur, Bengaluru,
Karnataka 560103

D Cube Analytics Inc.
1320 Tower Road,
Illinois 60173, USA

All Rights Reserved D Cube Analytics 2021