by Paul Brunet
Artificial intelligence is no longer one of those future technologies promising to deliver “what ifs” beyond the business user’s and consumer’s wildest dreams. AI is already part of our everyday lives through smartphone voice-to-text capabilities, GPS features, and sharing-economy apps like Airbnb. These advances are the result of innovation made possible by the large amounts of data available today and, most importantly, by machine learning (ML) algorithms that learn from that data.
The foundation of AI, specifically ML for business advantage, comes down to well-understood, intensively curated, and trusted data. Ensuring trust in the data is best achieved through data governance.
Machine Learning as the Enabler of AI for Business Advantage
The terms AI and ML are often used interchangeably, but it is ML (the computational methods, or algorithms) that enables AI and makes machines smarter. As models are exposed to new data, they adapt independently, learning from previous computations to make decisions and produce results.
ML is quickly becoming more widespread in the business world, and it is an area of AI where we’re seeing some exciting things happen, with the potential to transform industries. In healthcare, for instance, ML offers the ability to ingest large volumes of data such as billions of medical records, comb through that data, and draw recommendations on potential diagnoses. This type of cognitive learning is revolutionary considering that physicians currently can’t stay on top of the volume of data being produced.
As ML gains more traction, there are challenges to overcome to make the machines more efficient and effective. These include the specialized training ML requires, data privacy assurances as ML reaches into more business applications, and, most critically, the accuracy and reliability of the data on which the models are based. Many business leaders apply ‘black box’ efforts to new product introductions or marketing campaigns, assuming the data is accurate. If the wrong data is applied, however, huge investments could be at risk. This challenge is so real that, in a recent KPMG CEO Survey, nearly 50% of CEOs said they were concerned about the integrity of the data on which they base their critical decisions.
Data Governance as a Key Component of Your AI Framework
For business users to apply the predictions of an analytics model in their decision-making, they must be able to trust the data along with the algorithms themselves.
A good place to start building this trust is data governance. With vast amounts of data coming from multiple disparate systems, an effective data governance strategy surfaces hidden or siloed data across the organization and empowers everyone to go beyond just producing and consuming data to trusting it and using it to optimize value through business analytics or AI applications.
Data governance not only offers a simple and direct way to ensure that you are using the right data, it also identifies data errors and quickly flags and resolves them to help maintain (or restore) the organization’s confidence.
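To make the idea of flagging data errors concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular governance product works: the Rule class, the flag_errors function, the two rules, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A hypothetical data quality rule: a name plus a pass/fail check."""
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes


def flag_errors(records, rules):
    """Return (record_index, rule_name) for every rule a record fails."""
    issues = []
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                issues.append((index, rule.name))
    return issues


# Illustrative rules only; real rules would come from the data owners.
RULES = [
    Rule("patient_id present", lambda r: bool(r.get("patient_id"))),
    Rule("age in 0-120",
         lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120),
]

# Made-up sample records.
records = [
    {"patient_id": "P001", "age": 42},
    {"patient_id": "", "age": 37},       # missing identifier
    {"patient_id": "P003", "age": 430},  # implausible age
]

issues = flag_errors(records, RULES)
for index, rule_name in issues:
    print(f"record {index} failed rule: {rule_name}")
```

The point of the sketch is that each flagged record carries the name of the rule it broke, so the error can be routed to whoever owns that data before it reaches a model.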
To take this confidence one step further, a data catalog integrated with data governance gives an organization quick and efficient data discovery, so data users spend less time searching for the trusted data they need to feed into AI applications or models, and devote more time to creating and refining the models. Much as Amazon does for products, a sophisticated data catalog allows business users to shop for and find trusted data in one central location, while also viewing the complete meaning, lineage, and relationships of the data.
Through ML functionality, the catalog serves up relevant data based on previous searches; it makes specific recommendations for ‘data purchases,’ much like Amazon does for frequent shoppers. The catalog provides a valuable service to business users and data scientists because it’s reliable, convenient, and fast, and it provides the trusted data they need for business analysis and decision making.
Additional catalog functionality links all sources of metadata (data sources, business applications, data lakes, data quality systems, data warehouses) into a responsive system. These connections enable changes to be detected and policies applied immediately, without manual steps. This ensures that reliable training data is fed into the AI model, which helps deliver the greater efficiency discussed earlier.
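As a rough illustration of detecting a change and applying a policy without manual steps, consider the Python sketch below. Everything in it is an assumption made for the example: the Catalog class, the source names, and the policy itself (a source whose schema changes is withheld from model training until someone reviews it). It is not a description of any vendor’s catalog.

```python
import hashlib
import json


def fingerprint(schema: dict) -> str:
    """Stable hash of a schema; sort_keys makes key order irrelevant."""
    payload = json.dumps(schema, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class Catalog:
    """Hypothetical catalog that tracks schemas and a trust flag per source."""

    def __init__(self):
        self._fingerprints = {}
        self.trusted = {}

    def register(self, source: str, schema: dict) -> None:
        """Record the approved schema for a source and mark it trusted."""
        self._fingerprints[source] = fingerprint(schema)
        self.trusted[source] = True

    def observe(self, source: str, schema: dict) -> bool:
        """Compare the observed schema with the approved one.

        Example policy (an assumption): a changed source is marked
        untrusted automatically. Returns True if a change was detected.
        """
        changed = fingerprint(schema) != self._fingerprints[source]
        if changed:
            self.trusted[source] = False
        return changed

    def training_sources(self) -> list:
        """Sources currently eligible to feed a model."""
        return sorted(s for s, ok in self.trusted.items() if ok)


catalog = Catalog()
catalog.register("warehouse.orders", {"order_id": "int", "total": "float"})
catalog.register("lake.clicks", {"user_id": "str", "ts": "timestamp"})

# The "total" column changed type upstream, so the policy fires.
changed = catalog.observe("warehouse.orders", {"order_id": "int", "total": "str"})
# Same schema with keys in a different order: no change detected.
unchanged = catalog.observe("lake.clicks", {"ts": "timestamp", "user_id": "str"})

eligible = catalog.training_sources()
print(eligible)
```

In this sketch only the unchanged source remains eligible for training, which is the behavior the paragraph above describes: the change is caught and the policy applied the moment it is observed.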
We still have a way to go to discover AI’s true capabilities for the enterprise. But data governance and a data catalog offer a strong foundation for ensuring the trust and integrity of the data for broader AI and ML efforts to come.
Data governance is no longer just about compliance; it is a discipline that can accelerate your AI efforts.
Paul Brunet is Vice President of Product Marketing with Collibra.