Taking the lead in the cognitive computing race: data management today

by Jelani Harper 22 August 2019

The use of various types of cognitive computing in analytics has been extremely well-documented, and deservedly so. Applications of machine learning, natural language processing, computer vision, and speech recognition abound, making artificial intelligence more approachable than ever.

With all the renown advanced analytics has rightfully earned, it may be easy to overlook the data management rigors required to make it accessible to the contemporary enterprise. The truth is, analytics is supported by less exciting fundamentals of data integration, data discovery, and semantic understanding of the multitude of data sources required for it to properly function.

Previously, there were barriers to data preparation that prevented analytics from becoming reliable, scalable, and flexible enough for big datasets. As Cambridge Semantics VP of marketing John Rueter observed, “In the software world, in the last 20-plus years, you’ve seen the emergence and growth of all these analytics vendors. And, what’s been trailing behind, and has been so hard to get to, is actually the work that’s required around modern data management and integration for analytics.”

Fortunately, that’s no longer the case. The essentials of data preparation have made significant strides in the past couple of years to consistently furnish the diverse data required, at scale.

By combining graph technologies with a semantic understanding of data in a standards-based environment, organizations can not only prepare their data to keep pace with advanced analytics, but also use data management mechanisms to set the pace of their analytics development.

Supporting cognitive statistical analytics

The influx of machine learning is perhaps the catalyst spurring data preparation specialists to keep pace with the demand for advanced analytics. The viability of machine learning, in turn, was one of the direct consequences of the big data age and the attendant technologies that made big data practically synonymous with data itself.

“That’s really been a missing part in all of this,” Rueter admitted. “And it’s become a much more pronounced issue given the way that big data, the attempts of the data lake, we might even say the failed attempts of the data lake to satisfactorily address this.” Graph technologies, buttressed by semantic standards that align any data format or structure for seamless integration, have gained ground in response to such failures. The result is a resurgence in data integration efforts that are “even more important when you move into the world of machine learning and analytics,” Rueter said.

Data blending

One reason machine learning galvanized attempts to improve integrations across the increasingly decentralized data landscape is the tremendous amount of training data this technology requires. When managing data for cognitive computing applications, “integration is even more difficult,” acknowledged Cambridge Semantics CTO Sean Martin. “You need much more data, and much more high-dimensionality data.” The data blending necessary for integrating at the scale of cognitive computing is helped by standards-based semantic environments that capture the meaning of data and the relationships within it.

Aided by standardized data models, uniform taxonomies, and nomenclature revolving around business (not IT) terminology, these technologies leverage semantic graph settings for a nuanced understanding of how datasets relate to each other. “Graph solves the flexibility [required],” Martin said. “One of the other things it does, it kind of blends the semantics, which is another element that’s very important: the business meaning of the data with the data to make it easier to discover and easier to use.”
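To make the idea of blending business meaning with data more concrete, here is a minimal sketch in plain Python. It is not Cambridge Semantics' implementation or any vendor's API. The source records, column names, business terms (customerName, invoiceAmount), and helper functions are all hypothetical, chosen only to show how two differently structured sources could be mapped to shared business vocabulary and merged into one graph of subject-predicate-object triples.

```python
from collections import defaultdict

# Hypothetical source records with IT-style column names (illustrative only).
crm_rows = [{"cust_id": "C1", "cust_nm": "Acme Corp"},
            {"cust_id": "C2", "cust_nm": "Globex"}]
billing_rows = [{"customer": "C1", "amt_usd": 1200},
                {"customer": "C1", "amt_usd": 300},
                {"customer": "C2", "amt_usd": 750}]

# Hypothetical mappings from each source's columns to shared business terms.
crm_map = {"key": "cust_id", "terms": {"cust_nm": "customerName"}}
billing_map = {"key": "customer", "terms": {"amt_usd": "invoiceAmount"}}

def to_triples(rows, mapping):
    """Turn rows into (subject, predicate, object) triples named by business terms."""
    triples = []
    for row in rows:
        subject = f"customer/{row[mapping['key']]}"
        for column, term in mapping["terms"].items():
            triples.append((subject, term, row[column]))
    return triples

def blend(*triple_lists):
    """Merge triples from all sources into one subject-indexed graph."""
    graph = defaultdict(lambda: defaultdict(list))
    for triples in triple_lists:
        for subject, predicate, obj in triples:
            graph[subject][predicate].append(obj)
    return graph

graph = blend(to_triples(crm_rows, crm_map), to_triples(billing_rows, billing_map))
total_c1 = sum(graph["customer/C1"]["invoiceAmount"])
print(graph["customer/C1"]["customerName"], total_c1)
```

Because both sources resolve to the same subject identifiers and business terms, a question such as "what has this customer been invoiced?" can be answered without knowing either source's original column names. A production system would typically rely on standards such as RDF and a graph database rather than in-memory dictionaries.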

Enhanced data discovery, cognitive insight

The business understanding of what different data means, how datasets correlate, and how they relate to defined objectives is an integral means of improving the data discovery process. Moreover, it helps to put this vital aspect of analytics into the hands of the actual data users, as opposed to data scientists with a limited comprehension of business objectives. The blending of business semantics with data makes the data discovery process a trigger for ad-hoc integrations resulting in more intelligent, meaningful analytics—especially when statistical cognitive computing techniques are involved.
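As a rough illustration of discovery triggering an ad-hoc integration, the sketch below assumes a small catalog in which each dataset is annotated with the business terms it covers. The catalog contents, dataset names, and both functions are hypothetical and stand in for whatever metadata layer an organization actually uses.

```python
# Hypothetical catalog: each dataset is annotated with the business terms it covers.
catalog = {
    "crm_accounts": {"customer", "customerName", "region"},
    "billing_invoices": {"customer", "invoiceAmount", "invoiceDate"},
    "web_sessions": {"visitor", "pageViews"},
}

def discover(term):
    """Return the datasets annotated with a business term, sorted by name."""
    return sorted(name for name, terms in catalog.items() if term in terms)

def integration_keys(left, right):
    """Return business terms shared by two datasets: candidate keys for an ad-hoc join."""
    return sorted(catalog[left] & catalog[right])

hits = discover("customer")
keys = integration_keys(*hits) if len(hits) == 2 else []
print(hits, keys)
```

Here a business user searching for "customer" finds two relevant datasets, and the shared term suggests how they could be joined. The search is expressed in business vocabulary rather than in table or column names, which is the point the paragraph above makes about putting discovery in the hands of data users.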

Such an approach “can discover or help you discover what data you’ve got, and then very quickly place integrations of that data that we call data products as sort of knowledge graphs that can be very quickly stood up in the cloud in a manner of a few minutes using Kubernetes cloud automation technology,” Martin said. “And then, that allows line of business to very quickly access that data in the tools that they’re used to, or to draw that data into machine learning systems.”


Jelani Harper is an editorial consultant servicing the information technology market, specializing in data-driven applications focused on semantic technologies, data governance and analytics.