MIT Researchers Develop Generative AI Tool to Boost Database Searches

GenSQL can connect and analyze data to answer queries about products, and can be easily integrated

Ben Wodecki, Jr. Editor

July 15, 2024

MIT researchers have developed a generative AI tool for databases that lets users analyze data and make predictions about future data or fill in missing information.

GenSQL is an extension of Structured Query Language (SQL) that integrates probabilistic programming with traditional database searches.

It lets users analyze existing data, make predictions about future data and fill in missing information by combining SQL with probabilistic models of tabular data.

The generative AI-powered tool lets business users ask complex questions that combine actual data with probabilistic reasoning, providing them with more nuanced insights about a product or service.
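
To make the idea of mixing stored data with probabilistic reasoning concrete, here is a minimal Python sketch. It is not GenSQL syntax and does not use the MIT tool. The toy table, the column layout and the two helper functions are hypothetical, and simple counting stands in for the learned probabilistic model that a real system would use.

```python
from collections import Counter

# Hypothetical toy table: each row is (region, plan, churned).
rows = [
    ("north", "basic", "yes"),
    ("north", "basic", "no"),
    ("north", "premium", "no"),
    ("south", "basic", "yes"),
    ("south", "basic", "yes"),
    ("south", "premium", "no"),
]


def conditional_probability(rows, target_index, target_value, given):
    """Estimate P(column == target_value | given) by counting matching rows.

    `given` maps a column index to the value it must take. Returns None
    when no row satisfies the condition, because the estimate is undefined.
    """
    matching = [r for r in rows if all(r[i] == v for i, v in given.items())]
    if not matching:
        return None
    hits = sum(1 for r in matching if r[target_index] == target_value)
    return hits / len(matching)


def fill_missing(rows, partial_row, target_index):
    """Fill one missing cell (None) with its most frequent value among
    rows that agree with the known cells. Assumption: frequency counts
    stand in for a real probabilistic model."""
    given = {
        i: v
        for i, v in enumerate(partial_row)
        if i != target_index and v is not None
    }
    matching = [r for r in rows if all(r[i] == v for i, v in given.items())]
    if not matching:
        return None
    counts = Counter(r[target_index] for r in matching)
    return counts.most_common(1)[0][0]


# "How likely is churn among basic-plan customers?"
churn_given_basic = conditional_probability(rows, 2, "yes", {1: "basic"})

# "What is the most plausible churn value for this incomplete record?"
filled = fill_missing(rows, ("south", "premium", None), 2)
```

The first query filters actual rows and then reasons about likelihood, and the second uses the same estimate to fill in a blank cell. These are the two kinds of question the article describes, shown here at toy scale.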

The tool is designed to let developers employ probabilistic modeling in databases without requiring prior expertise in probabilistic programming.

“With GenSQL, both typical users and experts can more easily and interactively query generative models to test their validity, both qualitatively and quantitatively,” the paper reads. 

“This division of responsibility between users, generative modeler and probabilistic programming system developers could potentially help our society more safely and productively broaden the deployment of generative models for tabular data.”

Databases are becoming an increasingly important component in a business’s AI arsenal, providing a wealth of information vital for making decisions.

However, that data is often fragmented, as companies tend to keep it in separate silos spread across the business, containing diverse types of data including text, images and video.

Businesses also need staff with technical expertise to make sense of the vast silos of data.

The MIT team created GenSQL to simplify managing and analyzing data from various sources.

The researchers said existing probabilistic programming systems fail to support complex database queries and do not effectively combine tabular data with generative models.

They developed GenSQL to be easy to use. A user uploads their data and model to GenSQL, which automatically integrates them. The user can then issue queries for several tasks, including data cleaning and synthetic data generation.

A user can also develop custom models for harmonization across different data sources.
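
The two tasks named above, synthetic data generation and data cleaning, can be sketched in a few lines of Python. This is an illustration only, not GenSQL code: the table, the function names and the choice to treat every column as independent are assumptions made for brevity, and a real probabilistic model would capture dependencies between columns.

```python
import random
from collections import Counter

# Hypothetical toy table: each row is (region, plan).
rows = [
    ("north", "basic"),
    ("north", "basic"),
    ("north", "premium"),
    ("south", "basic"),
]


def fit_column_models(rows):
    """Fit one categorical distribution per column.

    Assumption: columns are treated as independent, which is a
    deliberate simplification."""
    n = len(rows)
    models = []
    for column in zip(*rows):
        counts = Counter(column)
        models.append({value: c / n for value, c in counts.items()})
    return models


def row_probability(models, row):
    """Probability of a whole row under the independent-column model.
    Values never seen during fitting get probability 0."""
    p = 1.0
    for dist, value in zip(models, row):
        p *= dist.get(value, 0.0)
    return p


def sample_rows(models, k, seed=0):
    """Generate k synthetic rows by sampling each column's distribution."""
    rng = random.Random(seed)
    return [
        tuple(
            rng.choices(list(dist), weights=list(dist.values()))[0]
            for dist in models
        )
        for _ in range(k)
    ]


def flag_unlikely(models, rows, threshold):
    """Data-cleaning step: return rows whose probability is below threshold."""
    return [r for r in rows if row_probability(models, r) < threshold]


models = fit_column_models(rows)
synthetic = sample_rows(models, 5)
suspects = flag_unlikely(
    models, rows + [("south", "premium"), ("east", "basic")], 0.1
)
```

Here `suspects` holds the two appended rows: one is a rare combination and the other contains a region the model has never seen. The same fitted model also produces the synthetic rows, which is the reuse of one model across several tasks that the workflow above relies on.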

In their evaluation, the researchers found the tool to be more concise and less error-prone than prior systems at detecting database anomalies.

The tool also completes tasks almost seven times faster, thanks to its reuse of optimizations.

“Looking at the data and trying to find some meaningful patterns by just using some simple statistical rules might miss important interactions,” Mathieu Huot, the lead author of the GenSQL project, told MIT News. “You really want to capture the correlations and the dependencies of the variables, which can be quite complicated, in a model. With GenSQL, we want to enable a large set of users to query their data and their model without having to know all the details.”

About the Author

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
