What Lies Beneath AI Solutions?

Data powers Artificial Intelligence. When vetting AI solutions, services, or products for viability, include a conversation about the required data.

By Rhonda O'Connor
[Image: Iceberg demonstrating the data challenges for AI solutions]
April 24, 2024

There’s a huge roadblock ahead for enterprise-level Artificial Intelligence (AI) solutions. You may have hit it already. It’s the thing many companies have avoided, deferred, or been working toward for longer than they would like.

It’s the data problem

Finding the data. Cleaning it. Classifying it. Normalizing it for more than one purpose. Then, making the correct data secure and accessible only to those who should have access. What you are after is a Data Management & Sharing Plan. This is foundational for any innovative change your business needs to make, including AI.
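
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a real Data Management & Sharing Plan: the sample records, the `classify` rule, the sensitivity labels, and the `data_steward` role are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical raw records, as they might arrive from two different systems.
RAW_RECORDS = [
    {"client": "  Acme Corp ", "revenue": "1,200", "notes": "Renewal due"},
    {"client": "acme corp", "revenue": "1,200", "notes": "Renewal due"},
    {"client": "Globex", "revenue": "", "notes": "Contact: jane@example.com"},
]


@dataclass(frozen=True)
class Record:
    client: str
    revenue: Optional[int]
    notes: str
    sensitivity: str  # "internal" or "restricted" (assumed labels)


def clean(raw: dict) -> dict:
    """Trim whitespace and turn empty strings into None."""
    return {key: (value.strip() or None) for key, value in raw.items()}


def classify(row: dict) -> str:
    """Assumed rule: notes that appear to contain an email are restricted."""
    return "restricted" if "@" in (row["notes"] or "") else "internal"


def normalize(row: dict) -> Record:
    """Put every record into one shared shape so several teams can reuse it."""
    revenue = int(row["revenue"].replace(",", "")) if row["revenue"] else None
    return Record(
        client=row["client"].title(),
        revenue=revenue,
        notes=row["notes"] or "",
        sensitivity=classify(row),
    )


def visible_to(records: List[Record], role: str) -> List[Record]:
    """Assumed policy: only the 'data_steward' role sees restricted records."""
    allowed = {"internal", "restricted"} if role == "data_steward" else {"internal"}
    return [r for r in records if r.sensitivity in allowed]


# Clean, normalize, then de-duplicate while preserving the original order.
records = list(dict.fromkeys(normalize(clean(raw)) for raw in RAW_RECORDS))
```

In this sketch, the two spellings of the same client collapse into one record after normalization, and `visible_to(records, "analyst")` leaves out the record that carries contact details. Real data is far messier, which is why this work takes longer than the POC.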

Still, it’s often overlooked when discussing new services and applications, or when changing existing ones to align with a modernized technology infrastructure that reduces costs and operational overhead.

Data powers AI

You can easily find a SaaS company or consultant who can deliver a Proof of Concept (POC) or Minimum Viable Product (MVP) for an AI solution. You can even buy a Large Language Model (LLM) product.

My CIO, Brenton Rothchild, created a Project Experience chatbot in a few hours for the Marketing team to demonstrate that POCs are not hard to produce in less than 90 days. However, our chatbot hallucinated and provided incorrect information I could easily detect.

There is much more beneath the surface of AI chatbots and AI-powered solutions, so much that Rothchild created a diagram to show me what lies underneath and to explain the challenges companies can encounter. We have since turned it into an iceberg infographic to help companies and teams bring these conversations to the surface.

Surfacing the Conversation about LLMs and Data Fabric

The Iceberg infographic below denotes key stages a company must address for the long-term success of AI solutions. The stages just above or below the surface (LLMs and Data Fabric, denoted in green and yellow respectively) are often left out when discussing business use cases and POCs.

If every stage is addressed, your AI aspirations won’t stall out after the POC or MVP. Your LLM can access the correct data when it needs it to solve the business problem you identified.

Proofs of Concept

As mentioned, these are easy to demonstrate or achieve, but their quality and reliability depend on the Data Fabric and on the continuous refinement of the LLM you build or buy.

Large Language Models (LLMs)

These LLM stages can encompass the following activities:

Feedback & Iteration

  • Observation and monitoring
  • Iterative training dataset modification

Deployment

  • Model instantiation/rollout
  • Access Control
  • UI/UX
  • Monitoring and logging
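
As a rough illustration of how access control and logging can sit in front of a deployed model, here is a minimal Python sketch. The `PERMISSIONS` table, the role and topic names, and `stub_model` are all hypothetical; a real deployment would call an actual LLM and read permissions from an identity provider or policy store.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("chatbot")

# Assumed role-to-topic permissions, hard-coded only for this example.
PERMISSIONS = {"marketing": {"projects"}, "finance": {"projects", "billing"}}


def guarded_ask(model: Callable[[str], str], role: str, topic: str, question: str) -> str:
    """Check access, call the model, and log the outcome either way."""
    if topic not in PERMISSIONS.get(role, set()):
        logger.warning("denied role=%s topic=%s", role, topic)
        raise PermissionError(f"role '{role}' may not query '{topic}'")
    answer = model(question)
    logger.info("answered role=%s topic=%s chars=%d", role, topic, len(answer))
    return answer


def stub_model(question: str) -> str:
    """Stand-in for a real LLM call, so the sketch runs offline."""
    return f"(stub answer to: {question})"
```

Calling `guarded_ask(stub_model, "finance", "billing", "What is overdue?")` returns the stub answer and writes a log line, while the same question from the `marketing` role raises `PermissionError`. Both outcomes are logged, which is what makes later monitoring possible.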

Training & Fine-Tuning

  • Training process and supervision
  • Testing and validation
  • Reinforcement learning from human feedback (RLHF)

Training Data / Prompt Engineering

  • Business case mapping
  • Semantic tagging
  • Software integration
  • Dataset transformation

Data Fabric

Data fabric (not to be confused with Microsoft’s product, Fabric, which we’ve helped companies use to map out their data fabric implementation) is defined by Gartner as, “An emerging data management design for attaining flexible, reusable and augmented data integration pipelines, services and semantics. A data fabric supports both operational and analytics use cases delivered across multiple deployment and orchestration platforms and processes. Data fabrics support a combination of different data integration styles and leverage active metadata, knowledge graphs, semantics, and ML to augment data integration design and delivery.”

These Data Fabric stages can encompass the following activities:

Fabric Implementation

  • Normalization
  • Transformation
  • Governance
  • Lineage
  • Shared data management plan
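
To make normalization, governance, and lineage a little more concrete, here is a minimal Python sketch in which every transformation applied to a dataset is recorded in order. The `Dataset` class, the country-code mapping, and the "rows need an owner" rule are assumptions invented for illustration, not features of any particular data fabric product.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Dataset:
    name: str
    rows: List[dict]
    lineage: List[str] = field(default_factory=list)  # ordered history of steps


def apply_step(dataset: Dataset, step_name: str,
               transform: Callable[[List[dict]], List[dict]]) -> Dataset:
    """Run one transformation and append its name to the dataset's lineage."""
    return Dataset(
        name=dataset.name,
        rows=transform(dataset.rows),
        lineage=dataset.lineage + [step_name],
    )


def normalize_country(rows: List[dict]) -> List[dict]:
    """Assumed normalization rule: map country spellings to two-letter codes."""
    codes = {"usa": "US", "united states": "US", "u.k.": "GB"}
    return [{**row, "country": codes.get(row["country"].lower(), row["country"])}
            for row in rows]


def drop_incomplete(rows: List[dict]) -> List[dict]:
    """Assumed governance rule: rows without an owner are not published."""
    return [row for row in rows if row.get("owner")]


source = Dataset(
    "projects",
    [
        {"id": 1, "country": "USA", "owner": "marketing"},
        {"id": 2, "country": "U.K.", "owner": ""},
    ],
    lineage=["loaded from crm_export"],
)

published = apply_step(
    apply_step(source, "normalize_country", normalize_country),
    "drop_incomplete",
    drop_incomplete,
)
```

After these two steps, `published.lineage` lists where the data came from and what was done to it, so anyone questioning an AI answer can trace it back to its source.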

Data Warehouse

  • Consolidating structured and unstructured data
  • Cataloging
  • Metadata tagging
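
Cataloging and metadata tagging can be pictured with a toy example. The sketch below is a hypothetical in-memory catalog written for this article; the `Catalog` class, dataset names, and tags are assumptions, and a real warehouse would use a dedicated catalog service.

```python
from collections import defaultdict
from typing import Dict, List, Set


class Catalog:
    """A toy data catalog: dataset names mapped to descriptive tags."""

    def __init__(self) -> None:
        self._tags: Dict[str, Set[str]] = {}
        self._by_tag: Dict[str, Set[str]] = defaultdict(set)

    def register(self, dataset: str, tags: List[str]) -> None:
        """Add a dataset, storing its tags in lowercase for consistent lookup."""
        normalized = {t.strip().lower() for t in tags}
        self._tags.setdefault(dataset, set()).update(normalized)
        for tag in normalized:
            self._by_tag[tag].add(dataset)

    def find(self, *tags: str) -> List[str]:
        """Return datasets that carry every requested tag, sorted by name."""
        wanted = [t.strip().lower() for t in tags]
        if not wanted:
            return sorted(self._tags)
        matches = set.intersection(*(self._by_tag.get(t, set()) for t in wanted))
        return sorted(matches)


catalog = Catalog()
catalog.register("project_summaries.docx", ["Unstructured", "marketing"])
catalog.register("crm_accounts", ["structured", "marketing", "pii"])
```

Here `catalog.find("marketing")` returns both the document and the table, while `catalog.find("marketing", "pii")` narrows the result to `crm_accounts`. Consistent tags are what let structured and unstructured sources be found through the same query.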

Data Discovery

  • Assessment
  • Scoping

Tackling the Roadblock

Not every stage must be fully completed, and this work is never truly done. However, a company that maps out an initial framework for an enterprise-wide architecture is positioned to scale, expand, change, and maintain its AI solutions, along with any other solution that depends on data to make the right things easier for the business.

So, if you’ve hit this roadblock and have yet to find a way to remove it, use this infographic to build consensus and buy-in. The data problem has existed for quite some time. Starting your data fabric removes the roadblock to solving the business problems that need data.

Need Help Gaining Traction?

Our team welcomes a conversation to get your data fabric journey headed in the right direction. Read our series of AI articles or consider our complimentary AI + Data Workshop.