Startup Raises $46M to Revolutionize AI Dataset Curation

Founded by former Meta, DeepMind and Twitter staff, DatologyAI wants to create a unified platform for AI data curation

Ben Wodecki, Jr. Editor

May 10, 2024


DatologyAI has raised $46 million in a series A funding round to change the way companies build datasets for AI models.

Founded in September 2023, the California-based startup is building a platform that would allow businesses to automatically compile data for AI training.

Felicis Ventures led the funding round, joined by M12, Microsoft’s venture capital arm, the Amazon Alexa Fund and serial tech investor Elad Gil. Existing backers including Radical Ventures and Amplify Partners also participated.

The funds will be used to hire more staff, including researchers and engineers. DatologyAI will also expand its infrastructure to further power its platform.

“We are so grateful to our existing and new partners for their support and unwavering confidence in us as we pursue our audacious goal of democratizing data research for everyone,” said Ari Morcos, DatologyAI’s CEO and co-founder.

Language models, both large and small, require datasets from which to learn. The quality of a model’s output depends heavily on the quality of its input. For example, Sony research published last November found that computer vision models trained on images of people with lighter skin tones would produce outputs biased against those with darker skin tones.


Building sophisticated systems like Llama 3 and Google Gemini requires datasets with trillions of data points, which take considerable resources and time to compile.

“Models are what they eat, and the data models ingest determines everything about their capabilities,” Morcos said.

DatologyAI is attempting to build a platform where companies building generative AI solutions can have more control over the types of data going into their models.

The startup’s data curation platform lets users compile data, giving enterprise development teams a single, simplified site on which to consolidate their work.

The DatologyAI platform is designed to help customers train models faster while getting better performance from smaller systems, making them competitive with much larger models, the CEO said.

This latest round brings DatologyAI’s total capital to $57.5 million. The startup counts some of the biggest names in AI among its previous backers, including Meta’s chief AI scientist Yann LeCun, Google DeepMind chief scientist Jeff Dean and Geoffrey Hinton, the father of artificial neural networks.

The startup was founded by staff formerly of DeepMind, Meta and Twitter.


