AI & Data
Communications & Media

Business Intelligence Solution: Data Lakehouse

Trility helped this client define enhanced Business Intelligence capabilities for a key customer and extend them to the entire organization. The team architected and built a data lakehouse that centralized and secured data storage, optimized compute power, reduced organizational silos, and ensured long-term cost savings and scalability.

Problem Statement

This client needed to enhance its process for Business Intelligence reporting for a key customer’s ground system operations. The existing method required manually pulling metrics and information from multiple systems to generate reports.

Initially, the client wanted only a specific architecture solution. Once Trility and the stakeholders explored the problem, however, they determined that publishing to a data lake and ensuring proper authentication were critical for the entire organization, not just one key customer.

Solution Approach

To streamline Business Intelligence capabilities, Trility collaborated with stakeholders to expand and better define the business and security requirements for the architecture and design of a data lakehouse and associated systems. A data lakehouse, which combines the functionality of a data lake and a data warehouse, was the chosen path.

Key elements of the data lakehouse architecture included data formatting for schema adherence and query optimization, metadata for compute engines, and compute engines for effective querying.
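
The three elements above can be illustrated with a minimal Python sketch. It is not the client's implementation: the schema, table name, bucket prefix, and helper functions are all hypothetical, chosen only to show how a declared data format, a catalog metadata entry, and a query for a compute engine relate to one another.

```python
from dataclasses import dataclass

# Hypothetical schema for a reporting table (column name -> Python type).
SCHEMA = {"station_id": str, "metric": str, "value": float, "event_date": str}


@dataclass(frozen=True)
class TableEntry:
    """Catalog metadata a compute engine needs to locate and read a table."""
    database: str
    name: str
    location: str           # hypothetical object-store prefix
    file_format: str
    partition_keys: tuple


def conforms(record: dict) -> bool:
    """Schema adherence: exact column set, each value of the declared type."""
    return record.keys() == SCHEMA.keys() and all(
        isinstance(record[col], typ) for col, typ in SCHEMA.items()
    )


def partition_path(entry: TableEntry, record: dict) -> str:
    """Hive-style partition layout, which lets an engine skip unrelated files."""
    parts = "/".join(f"{key}={record[key]}" for key in entry.partition_keys)
    return f"{entry.location}/{parts}/"


def report_query(entry: TableEntry, event_date: str) -> str:
    """SQL text that a compute engine such as Trino could run on the table."""
    return (
        "SELECT station_id, metric, avg(value) AS avg_value "
        f"FROM {entry.database}.{entry.name} "
        f"WHERE event_date = '{event_date}' "
        "GROUP BY station_id, metric"
    )


# Illustrative values only; none of these names come from the project.
entry = TableEntry(
    database="bi",
    name="ground_metrics",
    location="s3://example-lakehouse/ground_metrics",
    file_format="parquet",
    partition_keys=("event_date",),
)
record = {
    "station_id": "gs-01",
    "metric": "uptime",
    "value": 99.5,
    "event_date": "2024-01-15",
}

if conforms(record):
    print(partition_path(entry, record))
    print(report_query(entry, "2024-01-15"))
```

In a deployment like the one described, the catalog entry would typically live in a metadata service and the query would be submitted to the engine rather than printed. The sketch only shows how the pieces fit together.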

Challenges the team and stakeholders overcame included establishing data governance and security, leveraging and integrating multiple technologies, optimizing performance, and ensuring long-term scalability, cost control, identity and access management (IAM), and change management.

Because of the client’s government contracts, the team also accounted for additional security requirements and expectations for deploying to Secret Commercial Cloud Services (SC2S) in AWS GovCloud.

Outcomes

By centralizing data, the client gained a single, standardized data format and language, along with tools and solutions to leverage. Not only did the client save time on costly reporting for one customer, but other teams also benefited from the data lakehouse. Overall benefits include:

  • Cost savings due to compute-decoupled storage and the elimination of multiple separate data storage locations

  • Increased security due to no required data movement or data copying

  • Increased efficiencies with data scientists and BI teams sharing datasets

  • Optimized value of data through a shared platform and reduction of organizational silos

  • Long-term cost savings for reporting and BI across the enterprise

PROOF POINT: During the project, Trility observed that the intended approach to IAM would be hard to maintain and costly over the long term. Trility recommended an alternate solution, which was implemented and provided a scalable, cost-effective way to manage identity and access to the data lakehouse.

Project Attributes

  • Reduced Cost of Ownership (COO)
  • Reduced Technical Debt
  • Accelerated Delivery
  • Increased Automation
  • Increased Scalability
  • Reusable Patterns
  • Increased Capabilities
  • Increased Security

Technologies Used

  • Terraform
  • Trino
  • AWS Simple Storage Service (S3)
  • AWS Glue
  • KeyGen
  • Apache Airflow