Using Knowledge Graphs to Solve Data Integration Issues
An opinion piece by a director analyst at Gartner
November 1, 2022
Three of the top six barriers to AI adoption are related to data complexity, quality, and accessibility, according to the 2021 Gartner AI in Organisations Survey. Knowledge graphs, which represent entities and the relationships between them as a linked network, have the potential to solve many of the data integration challenges that pose a significant barrier to the use of AI.
To successfully build knowledge graphs at an enterprise level, data and analytics (D&A) leaders must take an agile approach. Here are three key steps to follow to unleash the power of AI across the organization.
1. Start by focusing on knowledge graphs in targeted use cases
As with any data science, analytics, or AI initiative, find the right use cases before starting a project.
Knowledge graph initiatives frequently start with a value proposition: your organization will be able to join up enterprise-wide data that currently sits in silos, providing a platform for building applications that harness the linkages and context inherent in a graph. Be aware, however, that the most common challenge is a lack of business buy-in. Stakeholders are often unwilling to invest in knowledge graphs because the benefits remain unclear.
Three of the most popular applications for knowledge graphs are semantic search and question-answering; knowledge discovery; and recommendation engines.
Semantic search often surfaces as the familiar infobox that Google presents alongside search results, which is an output of its knowledge graph. Enhancing an organization's search capability with a knowledge graph makes it possible to execute complex queries that draw on knowledge held in multiple documents or sources, using the relationships defined in the graph.
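To make the idea of a query that spans several sources concrete, here is a minimal sketch in Python. It assumes a toy in-memory list of triples; the people, documents, source names, and helper functions are all hypothetical and do not come from any particular product. The question "which documents were written by people in a given department?" is answered by following two relationships that originate in different systems.

```python
# Hypothetical triples: (subject, predicate, object, source system).
# In practice these would be extracted from several documents or systems.
TRIPLES = [
    ("Alice", "works_in", "Finance", "hr_system"),
    ("Bob", "works_in", "Finance", "hr_system"),
    ("Carol", "works_in", "Legal", "hr_system"),
    ("Alice", "authored", "Q3 Forecast", "document_store"),
    ("Bob", "authored", "Expense Policy", "document_store"),
    ("Carol", "authored", "Vendor Contract", "document_store"),
]


def objects(subject, predicate):
    """All objects linked from `subject` by `predicate`."""
    return [o for s, p, o, _ in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """All subjects linked to `obj` by `predicate`."""
    return [s for s, p, o, _ in TRIPLES if p == predicate and o == obj]


def documents_by_department(department):
    """Two-hop query: department -> people (HR data) -> documents (document store)."""
    docs = []
    for person in subjects("works_in", department):
        docs.extend(objects(person, "authored"))
    return sorted(docs)


print(documents_by_department("Finance"))  # ['Expense Policy', 'Q3 Forecast']
```

Neither source could answer the question on its own; the answer comes from joining them through the shared "person" entity, which is the kind of linkage a graph makes explicit.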
Meanwhile, knowledge discovery applies a knowledge graph to uncover previously unknown or hidden information. For example, modeling entities and relationships in a graph structure with operational semantics can reveal potential treatments for patients, more cost-effective materials for manufacturers, and fraudulent companies that are committing tax evasion.
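One simple way such hidden information can surface is by following indirect paths in the graph. The sketch below is an illustration only, with made-up drug, protein, and disease names and hypothetical relation labels ("targets", "associated_with", "treats"). It proposes a drug as a candidate for a disease when the drug targets a protein associated with that disease and no "treats" link is already recorded.

```python
from collections import defaultdict

# Hypothetical edges: (subject, relation, object). All names are invented.
EDGES = [
    ("DrugA", "targets", "ProteinX"),
    ("DrugB", "targets", "ProteinY"),
    ("ProteinX", "associated_with", "Disease1"),
    ("ProteinX", "associated_with", "Disease2"),
    ("ProteinY", "associated_with", "Disease2"),
    ("DrugA", "treats", "Disease1"),
]


def candidate_treatments(edges):
    """Return (drug, disease, linking protein) tuples not already known as treatments."""
    targets = defaultdict(set)      # drug -> proteins it targets
    associated = defaultdict(set)   # protein -> diseases it is associated with
    known = set()                   # (drug, disease) pairs already recorded
    for subject, relation, obj in edges:
        if relation == "targets":
            targets[subject].add(obj)
        elif relation == "associated_with":
            associated[subject].add(obj)
        elif relation == "treats":
            known.add((subject, obj))

    candidates = set()
    for drug, proteins in targets.items():
        for protein in proteins:
            for disease in associated[protein]:
                if (drug, disease) not in known:
                    candidates.add((drug, disease, protein))
    return sorted(candidates)


for drug, disease, protein in candidate_treatments(EDGES):
    print(f"{drug} may be relevant to {disease} (via {protein})")
```

The output is a list of hypotheses for domain experts to review, not a set of conclusions. A real system would weigh evidence and provenance rather than rely on a single two-hop pattern.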
Finally, recommendation engines are now a familiar component of many online stores, personal assistants and digital platforms. They have been commoditized to a large extent, so many e-commerce platforms, insight engines and analytics tools include some form of them.
While the usage of graphs in these domains and subdomains brings benefits in itself, the real power of graphs is realized when they can be joined across domains to form an enterprise knowledge graph. This can then be used for applications such as a data fabric and digital twins, where business processes and decisions are replicated in a virtual environment.
2. Reduce time-to-value through agile practices
Many organizations seek to define an enterprise-wide schema, ontology or taxonomy first, but this is a mistake. Such endeavors are costly, time-consuming, filled with disagreement and, in many cases, stopped before any value can be shown or delivered. A knowledge graph continuously evolves, and for this reason, agile practices are particularly useful when developing knowledge graphs.
A combination of best practices for building knowledge graphs will lead to faster and more impactful results: using existing standards, schemas and ontologies as starting points; extracting a list of key terms that need to be modeled; and adding handcrafted rules, entity attributes and relationships from business glossaries and data dictionaries.
The concept of a minimum viable product can be transferred to knowledge graph development by thinking in terms of the minimum viable graph (MVG) and minimum viable ontology (MVO). This means that the ontology defines only as many concepts and relationships as are needed to deliver some defined capability, and the graph holds the corresponding instance data.
Composing an ontology in this modular fashion, while utilizing agile practices, provides a flexible and dynamic way to achieve enterprise standardization. Once this MVO is developed, it will be tested against the use case being delivered and any existing graphs. Instance data can then be populated against the MVO to create an MVG that can be expanded iteratively as more concepts are needed and defined. Using this method, it is possible to start small but scale quickly while delivering value.
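The MVO and MVG workflow described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not a prescribed implementation: the `Ontology` and `Graph` classes, the class names, and the relation names are all hypothetical, and a production system would more likely use an established ontology language and graph store. The sketch shows an ontology limited to one use case, instance data validated against it, and a second iteration that extends the ontology before new data is added.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical minimal ontology: classes plus typed relations."""
    classes: set = field(default_factory=set)
    # relation name -> (domain class, range class)
    relations: dict = field(default_factory=dict)


@dataclass
class Graph:
    """Instance data that must conform to the ontology it is built on."""
    ontology: Ontology
    types: dict = field(default_factory=dict)    # entity -> class
    triples: list = field(default_factory=list)  # (subject, relation, object)

    def add_entity(self, name, cls):
        if cls not in self.ontology.classes:
            raise ValueError(f"class {cls!r} is not in the ontology")
        self.types[name] = cls

    def add_triple(self, subject, relation, obj):
        if relation not in self.ontology.relations:
            raise ValueError(f"relation {relation!r} is not in the ontology")
        domain, range_ = self.ontology.relations[relation]
        if self.types.get(subject) != domain or self.types.get(obj) != range_:
            raise ValueError(
                f"{subject!r} {relation} {obj!r} violates {domain} -> {range_}"
            )
        self.triples.append((subject, relation, obj))


# Iteration 1: only the concepts one use case needs (the MVO) ...
mvo = Ontology(
    classes={"Customer", "Product"},
    relations={"purchased": ("Customer", "Product")},
)

# ... populated with instance data to form the MVG.
mvg = Graph(mvo)
mvg.add_entity("alice", "Customer")
mvg.add_entity("laptop", "Product")
mvg.add_triple("alice", "purchased", "laptop")

# Iteration 2: a new use case needs suppliers, so the ontology grows first.
mvo.classes.add("Supplier")
mvo.relations["supplies"] = ("Supplier", "Product")
mvg.add_entity("acme", "Supplier")
mvg.add_triple("acme", "supplies", "laptop")

print(mvg.triples)
```

Because the graph rejects any entity or relationship the ontology does not yet define, each expansion is a deliberate, small change to the ontology rather than an up-front enterprise-wide design.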
3. Support a Minimum Viable Graph (MVG) approach in multiple channels
Once an MVO has been developed, test and use it by populating a graph of instance data based on the ontology. A knowledge graph development project should adhere to the agile practices of being ‘iterative’ and ‘incremental.’
Performing an analysis of the data held within repositories — both structured and unstructured — has become substantially easier with the introduction of data catalog solutions that employ machine learning techniques. These solutions can automate the process of discovering, inventorying, profiling, tagging, and creating semantic relationships between distributed and siloed data assets.
Knowledge graph development must be a collaborative process between business units and IT. Domain experts will give their insights into the entities and relations that form an ontology. Data scientists will look at how the ontology can be realized with the data available, and IT will need to ensure that the platform on which a knowledge graph is built is robust and scalable. Software engineers will utilize the knowledge graph to fill the data needs of data-intensive applications.