AI's Trillion-Dollar Opportunity: Streamlining Enterprise Data Integration

The hardest part of B2B software work is data integration between systems, and resolving data connectivity issues paves the way for project success

Paul Barba, Jeff Catlin

September 23, 2024


AI is everywhere these days, and with good reason: the ability of large language models (LLMs) to augment and speed up business processes is transforming how companies operate. Many of the scenarios discussed are what I’d call “application” use cases. These are things like using ChatGPT to write marketing content, summarize a call center recording or even provide code for a bubble sort algorithm in Python. These are impressive capabilities, but if you look at each example, you’ll notice they are specific generative asks. We use the LLM to provide a specific and relatively narrow business function. As AI improves, we’ll start seeing complex business processes leveraging AI, which will have a tsunami effect on business.

One example of going beyond these generative application use cases is using AI to connect all the content sources that run our businesses. AI runs on data, as do all the companies I’ve ever worked for. We collect and generate data via emails, Slack threads, Zoom calls, finance reports and myriad other business applications. This data is a treasure trove of information on how to operate and optimize a business, and there are analytics applications that excel at digesting and analyzing it.

At InMoment, we work on such applications and I can say with certainty that the most challenging part of our job is gaining access to all of these silos of information and figuring out how to normalize these disparate sources so we can look across the breadth of our customers' data. There is an unspoken understanding in the industry that the hardest part of B2B software work is the data integration between systems. In this field, resolving data connectivity issues often paves the way for project success.


Imagine B2B applications where the AIs negotiate the data interactions between the businesses. This interchange is often called “Intelligent Data Mapping and Transformations.” For large omni-source builds, where a wide variety of data sources are pushed into an analytics application, this data integration work often accounts for more than 50% of the project. If we could reduce a two-month data integration project to two weeks, the economic value could be billions of dollars. At this point, the AIs still need our help to connect all these disparate sources and marry them into a cohesive collection. However, as frameworks like LangChain make it easier to chain LLMs together and connect them to data sources, we are inching closer to a machine-to-machine data interchange.
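As a rough illustration of what that mapping step could look like, here is a minimal sketch in Python. It assumes a hypothetical call_llm helper standing in for whatever LLM API is in use, and the column names and target schema are invented for the example.

```python
import json

# Hypothetical stand-in for an LLM API call (e.g., a chat-completion endpoint).
# It takes a prompt string and returns the model's text reply.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM provider of choice")

def suggest_field_mapping(source_columns: list[str], target_schema: dict[str, str]) -> dict:
    """Ask an LLM to propose a source-to-target field mapping.
    The output is a suggestion; a human or validation step still confirms it."""
    prompt = (
        "Map each source column to the best-fitting target field, or null if none fits. "
        "Answer with JSON only (source column -> target field).\n"
        f"Source columns: {source_columns}\n"
        f"Target schema (field: description): {json.dumps(target_schema)}"
    )
    return json.loads(call_llm(prompt))

# Illustrative example: mapping a CRM export into an analytics schema.
mapping = suggest_field_mapping(
    ["cust_email", "nps_score", "ticket_body", "created_ts"],
    {
        "customer_id": "stable identifier for the customer",
        "score": "numeric satisfaction or NPS value",
        "verbatim": "free-text customer feedback",
        "timestamp": "when the feedback was collected",
    },
)
```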

It won’t all happen simultaneously, but we can now begin to deploy the first cut at such a system. “Text to SQL” applications use LLMs to translate natural language questions into database calls, but they require documentation of exactly how and where information is stored. Arriving at that understanding is exploratory and conversational: it begins by identifying the required data fields from vague user requirements.
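A text-to-SQL step, sketched minimally under the same assumption of a hypothetical call_llm wrapper and an invented survey schema, might look something like this:

```python
# Hypothetical call_llm(prompt) -> str wrapper around your LLM API, as above.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM provider of choice")

# The documentation the LLM needs: exactly how and where the data is stored.
SCHEMA_DOC = """
surveys(survey_id INTEGER, sent_at TEXT, channel TEXT)
responses(response_id INTEGER, survey_id INTEGER, nps INTEGER, verbatim TEXT)
"""

def text_to_sql(question: str) -> str:
    """Translate a natural-language question into a SQL statement,
    given explicit documentation of the schema."""
    prompt = (
        "You write SQLite queries. Use only the tables and columns below.\n"
        f"{SCHEMA_DOC}\n"
        f"Question: {question}\n"
        "Answer with a single SQL statement and nothing else."
    )
    return call_llm(prompt)

# The returned statement would then be reviewed and run against the warehouse.
sql = text_to_sql("What was the average NPS per channel last month?")
```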


Finding where that information is stored in vast enterprise storage systems takes poking around and asking the users of various systems. Deciding whether two tables can join on a particular key requires logical analysis and careful debugging of the results. While simple prompts won't lead to fully automated data engineering, more sophisticated LLM systems show promise. These advanced systems can explore data, ask questions, and eventually become expert sources for other LLMs to consult during data integration tasks. This capability enables the discovery and categorization of data sources, which can lead to the development of agent networks. These networks can enhance and reuse internal data sources, solving complex enterprise problems across data silos. Importantly, they can do this while maintaining appropriate permissions and incorporating revenue recognition.
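As one concrete example, a system like this might run a simple value-overlap check before trusting a candidate join key. The heuristic and sample values below are assumptions for illustration, not a prescribed method:

```python
def key_overlap(left_keys, right_keys) -> float:
    """Fraction of the left table's key values that also appear in the right table.
    A low score suggests the columns only look joinable by name."""
    left, right = set(left_keys), set(right_keys)
    return len(left & right) / len(left) if left else 0.0

# Hypothetical samples: do responses.survey_id values appear in surveys.survey_id?
overlap = key_overlap(
    [101, 102, 103, 999],        # responses.survey_id sample
    [100, 101, 102, 103, 104],   # surveys.survey_id sample
)
print(f"{overlap:.0%} of response keys match a survey")  # 75% here: investigate before joining
```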

The technology already exists to begin the process of machine-to-machine data integrations. Of course, there is more to the problem than just technology. Companies own these machines and have valid concerns about how data is exchanged. It’s not uncommon for the first few months of a project to be consumed by data security and compliance checklists and verifications. With machine-to-machine negotiations, companies will be even more insistent on maintaining security. Then there is the whole “it’s scary having machines negotiate with each other” concern, and I agree: it is a pretty daunting proposition.

However, the financial benefits will likely win the day because, while precise dollar amounts are difficult to come by, we are talking about billions, perhaps trillions, of dollars that companies spend on B2B data exchange each year. The last two multi-million-dollar projects we’ve managed were dominated by data exchange and data engineering. On a six-month project, at least three months were spent on data work that consumed our most seasoned engineering resources. Cutting that down by even 50% would save millions of dollars, and that’s just on our two most recent projects. The potential for significant cost savings is a compelling reason to embrace AI-driven data integrations.

Within two or three years, the massive economic value of AI-driven data interchange will drive the widespread deployment of these AI systems. This shift will accelerate and enhance the cross-enterprise analytics trend already underway and fundamentally reshape the business environment. As we move beyond today's specific, generative AI applications, we'll see AI transforming complex business processes and data interactions, ushering in a new era of efficiency and innovation across industries.

About the Authors

Paul Barba

chief scientist at InMoment

Paul Barba is the chief scientist at InMoment, where he is focused on applying machine learning (ML), natural language processing (NLP) and artificial intelligence (AI) technologies to solve the challenges related to analyzing mountains of unstructured feedback data in the customer experience (CX) market. Paul has spearheaded the integration of generative AI and large language models (LLMs) into InMoment’s award-winning NLP stack and continues to drive that development as new capabilities come to market. 

Paul has nearly two decades of experience in diverse areas of NLP and machine learning, from sentiment analysis and machine summarization to genetic programming and bootstrapping algorithms. Paul is continuing to bring cutting-edge research to solve everyday business problems while working on new “big ideas” to push the entire field forward.

Paul earned a degree in Computer Science and Mathematics from UMass Amherst.

Jeff Catlin

executive vice president of AI products at InMoment

Jeff Catlin is the executive vice president of AI products at InMoment, a leading provider of experience improvement solutions. With over 20 years of experience in AI, ML and NLP, Jeff drives innovation by integrating natural language processing with large language models to unlock enterprise unstructured data for global CX teams.

Before its acquisition by InMoment, Jeff was CEO of Lexalytics, a pioneering and global NLP leader. He also held roles at Thomson Financial, Sovereign Hill Software and LightSpeed Software.

Jeff graduated from UMass Amherst with a degree in Electrical Engineering. His blend of technical expertise and business acumen has made him a respected leader in the AI and NLP space, where he continues to enhance the capabilities of AI-driven solutions for unstructured data analysis.
