How Embracing Data Fabric Can Enhance the Value of AI Initiatives
Data fabrics and data mesh give end users easy access to information they need to support their work and decisions
To get the most business value from their tech and AI initiatives, data and analytics leaders are closely collaborating with their IT and CX counterparts to align on strategies for optimal results. Meanwhile, their biggest asset, their data, is increasingly unstructured and growing exponentially.
According to IDC, 80% of enterprise data will be unstructured by 2025, but most business decisions are made on the remaining 20%—structured data. Many organizations struggle to operationalize and make sense of data locked in unstructured formats, so the role of enterprise knowledge management becomes paramount.
Data management is a key foundation for a successful AI strategy. Within this field, several trends are emerging around data architecture patterns like data fabric and data mesh. Organizations are taking these approaches to satisfy the need for business insight and achieve a structure optimized for security and governance. Data fabric has gained traction mainly because it's flexible enough to adapt to complex data growth and can also scale with the needs of a business. Data leaders must make this a foundation of their data strategy to make the most of their AI investments.
The Need for Data Context
With the advent of AI, including LLMs and AI-enhanced applications, knowledge management systems are critical to supply the data needed to feed the AI. Tech and data leaders trying to make meaningful use of their data can share knowledge informally, but that approach doesn't scale well. The metadata-centric approach, including metadata management, ontologies and knowledge graphs, aims to formalize that knowledge. Unfortunately, these representations are frequently disconnected from the data that created the knowledge and, as a result, see limited use.
Today's business leaders aspire to solve complex data problems and achieve data agility. They want to connect, create, interpret and consume data, but not on its own: it must be linked with everything known about the data. This makes context essential, such as where the information comes from, how it will be used and who will consume it. As a result, there is an increasing call to evolve enterprise architectures to incorporate a semantic layer so that live data can be interpreted at pace. It must be unified, complete and accurate.
Limitations of Existing Data Management Strategies
Just a few years ago, organizations copied data from systems like their ERP or Salesforce into a dedicated repository for analysis. Then approaches moved toward using data lakes to harness unstructured data. Now that organizations have even more systems to unify, they need to make sure data is accurate, that it's not duplicated when stored in more than one place and that there are no inconsistencies. While the key advantage of a data warehouse is delivering a single data source, its structure can be quite constraining.
When new systems are brought in, the data warehouse must be updated to reflect them, but such changes can take a long time to implement, so the warehouse all too easily falls out of date. Meanwhile, growing volumes of data arrive daily, and some analytics must run in real time. The result is that the data warehouse or data lake collects information from silos but, in essence, acts as another silo.
What Data Fabric Is
Data fabric is one way to access organizational knowledge in a comprehensible fashion. By using a semantic layer, data fabrics weave data and metadata into a unified view that maps the information assets and enables on-demand access to reusable knowledge across the organization. This promotes better collaboration, less reliance on IT teams for data operations and greater self-service for data analytics.
The semantic layer minimizes complexity and unites data from many sources in a data catalog. Meanwhile, data schemas are automatically read to keep data up to date. This data is presented back as a knowledge graph, an information network of semantically linked facts that is intuitive to data users and readable to large language models.
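To make the idea of "semantically linked facts" concrete, here is a minimal, hand-rolled sketch of a knowledge graph as subject-predicate-object triples. The identifiers and predicates are invented for illustration and are not any vendor's API; real systems would use an RDF store or graph database.

```python
# A knowledge graph sketched as a set of (subject, predicate, object) triples.
# Facts from different silos link up through shared identifiers.
triples = set()

def add_fact(subject, predicate, obj):
    triples.add((subject, predicate, obj))

# Facts contributed by a structured source (e.g. an ERP table row)
add_fact("customer:42", "hasName", "Acme Corp")
add_fact("customer:42", "locatedIn", "region:EMEA")

# Facts contributed by an unstructured source (e.g. entities extracted from a contract)
add_fact("contract:7", "signedBy", "customer:42")
add_fact("contract:7", "expires", "2026-01-31")

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the (possibly wildcard) pattern."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]

# Everything known about customer:42, regardless of which silo it came from
print(query(subject="customer:42"))
```

The same pattern query serves a human analyst and, serialized as text, an LLM prompt, which is what makes a triple-based view "readable" to both audiences.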
Differences Between Data Fabrics and Data Mesh
Data fabrics and data mesh have a similar purpose: to give end users easy access to the information they need to support their work and decisions. A data mesh focuses on producing data products specific to a domain or business unit, allowing the people who best understand the data to manage and govern it independently; this distributes ownership and promotes accountability and transparency within the business. With data fabrics, governance is managed more centrally, which is the core difference between the two architectures. The reality is that both approaches come with downsides and are difficult to implement faithfully to their textbook definitions.
Most organizations will benefit from a composable, mix-and-match approach, depending on their metadata maturity and governance policies. It's imperative to keep in mind that the business objective of evolving an enterprise architecture is to establish a central point of access. In many cases, a hybrid approach works best: a centralized data repository, integration and security combined with federated governance.
Gartner predicted that, by 2027, 30% of enterprises will use data ecosystems enhanced with elements of data fabric supporting composable application architecture to achieve a significant competitive advantage.
Benefits of a Connected Data Management Platform
As AI evolves, tech experts and business teams are still learning the associated issues and limitations. While structured data management, necessary for analytics, dominated the field for decades, AI feeds on unstructured formats like documents, logs, video and more. This is now pushing organizations to prioritize qualitative data as a central pillar in their data and AI strategies. Few platforms can work with all those data types, however; most massage the data into a form that suits them, perhaps rows and columns.
An integrated, dedicated data management platform can work with the native data as-is. It can store data as graphs, rows and columns, or lists, and it can create a knowledge model, business glossary and active metadata to enable the AI to work with all those types of records.
Enterprise organizations can deliver results faster with less risk using a single platform packed with capabilities. A data platform for enabling data fabric brings together elements that otherwise could require many vendors and open-source components to replicate. There are three key benefits for data managers to use a data management platform:
Connect - Integrate all your data and metadata in a single enterprise-grade data platform.
Create - Model your enterprise knowledge graph and use semantic AI to create metadata.
Consume - Deliver unified, high-quality data in context to its use case and audience.
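The three capabilities above can be sketched as a tiny pipeline. The dataclass fields, function names and sample records below are assumptions for illustration, not a real platform's API; the point is that data travels with its metadata from ingestion to consumption.

```python
# Hypothetical connect/create/consume flow: data is registered with lineage,
# enriched with business meaning, and delivered together with that context.
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    rows: list                                    # the data itself
    metadata: dict = field(default_factory=dict)  # active metadata

catalog = {}

def connect(name, rows, source):
    """Register a dataset and record where it came from (lineage)."""
    catalog[name] = Dataset(name, rows, {"source": source})
    return catalog[name]

def create(name, glossary_term):
    """Enrich the dataset with business meaning from the glossary."""
    catalog[name].metadata["glossary"] = glossary_term

def consume(name):
    """Deliver the data in context: rows plus everything known about them."""
    ds = catalog[name]
    return {"data": ds.rows, "context": ds.metadata}

connect("orders", [{"id": 1, "total": 99.0}], source="erp.orders")
create("orders", "A confirmed customer purchase")
print(consume("orders"))
```

A consumer (human or AI) receives not just the rows but the source and glossary definition, which is what "data in context" means in practice.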
The Challenge of Keeping Data Current and Accurate for AI
While AI and machine learning present a terrific opportunity for every organization, the biggest question is around what will keep LLMs or small language models honest. The answer is data.
A sizeable challenge with enterprise deployments of machine learning projects is keeping the data current and accurate. Teams must validate the answers for hallucinations, particularly in the case of tools like ChatGPT, and provide lineage and traceability of those answers.
This requires linking elements of each answer back to the trusted, accurate, governed internal data that produced it. Data fabric doesn't just return a generic answer; it returns an answer grounded in your enterprise data plus its semantics.
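As a hedged sketch of that linkage, every fact handed to the model below carries the ID of the governed record it came from, so the final answer can be traced back. The record IDs, fields and naive keyword retrieval are invented for illustration; a production system would use semantic search over a governed store.

```python
# Illustrative answer grounding: responses carry lineage back to trusted records.
facts = [
    {"id": "doc-001", "text": "Return policy allows refunds within 30 days."},
    {"id": "doc-002", "text": "Warranty covers manufacturing defects for 2 years."},
]

def retrieve(question):
    """Naive keyword overlap; stands in for real semantic retrieval."""
    words = set(question.lower().split())
    return [f for f in facts if words & set(f["text"].lower().split())]

def answer_with_lineage(question):
    hits = retrieve(question)
    return {
        "answer": " ".join(h["text"] for h in hits),
        "sources": [h["id"] for h in hits],  # lineage back to governed data
    }

print(answer_with_lineage("What is the refund policy on returns?"))
```

Because the `sources` list survives alongside the answer, a reviewer can audit exactly which internal records the response rests on, which is the traceability the paragraph above calls for.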
Consider a retail use case like product traceability. Without connected data, it is a struggle to trace an off-the-shelf product back to a distribution warehouse, a manufacturing line or a raw-materials supplier that the company flagged for a quality control problem three months earlier. That supply chain visibility and transparency will only become more important, and knowledge management at each step of the journey is critical.
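Traceability of this kind is naturally a graph traversal. The edge data below (product, warehouse, line, supplier) is made up for illustration; the sketch simply walks provenance links back from a shelf product to its origins.

```python
# Supply-chain traceability sketched as a breadth-first walk over
# provenance edges (child -> the places it came from).
from collections import deque

provenance = {
    "product:SKU-9": ["warehouse:DC-3"],
    "warehouse:DC-3": ["line:M-12"],
    "line:M-12": ["supplier:raw-7"],
}

def trace_back(item):
    """Return every upstream node reachable from `item`, nearest first."""
    seen, queue, chain = set(), deque([item]), []
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        chain.append(node)
        queue.extend(provenance.get(node, []))
    return chain

# Walks product -> warehouse -> manufacturing line -> raw-materials supplier
print(trace_back("product:SKU-9"))
```

In a data fabric, these edges would come from the knowledge graph rather than a hard-coded dictionary, so a quality flag on `supplier:raw-7` is discoverable from any product it touched.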
Powering AI Potential with an Enterprise Data Service
The only way to achieve data agility and get optimum value from AI opportunities is by deeply integrating active data, active metadata and active meaning. Anyone who wants to connect their applications and deliver data value that spans multiple data sources should consider data fabric.
Once teams realize that their data is locked in the applications they use, architectures will invariably move from app-centric to data-centric. A single platform that keeps the meaning of data and metadata together with the facts they contain will improve an organization's data management and turn its data into a resource that can be fully harnessed.
Using this technology, CIOs and CDOs can expand their data universe with proper knowledge management and make more insightful AI-based business decisions with more of their data. Real-time pipelines will allow them to ingest, curate and consume data faster and achieve data agility so they can pivot and respond nimbly to change.