Hugging Face Acquires AI Software Startup to Boost Datasets

Spanish startup Argilla joins Hugging Face to help businesses build better NLP applications by improving their data

Ben Wodecki, Jr. Editor

June 20, 2024

Open source code repository Hugging Face is acquiring AI software developer Argilla in a $10 million deal.

Founded in 2017, the Spanish startup built a collaboration platform that helps AI engineers improve their data and, in turn, their natural language processing (NLP) applications.

Clem Delangue, co-founder and CEO of Hugging Face, previously invested in the startup and stated that Argilla’s mission aligns with that of his company.

“We can’t wait to onboard the whole team to double down on datasets, which have been growing faster than models on Hugging Face and which in my opinion are the most impactful topic in AI these days,” Delangue said in a LinkedIn post.

Argilla’s engineers created open source datasets and models to assist AI developers in better labeling and curating data. The startup aims to help businesses enhance their NLP applications by providing tools to tailor pre-trained models to specific use cases with improved data inputs.

Before the acquisition, Argilla collaborated with Hugging Face on various projects. It was a launch partner for Docker Spaces, a virtual workspace where users can run and share machine learning models and other applications. It also released OpenHermes Preferences on Hugging Face, one of the largest open datasets designed for training preference models or aligning language models to follow instructions.

The close collaboration between the two companies made it feel like they were already part of the same team, Daniel Vila Suero, Argilla’s CEO and co-founder, wrote in a blog post.

“This acquisition means we’ll be doubling down on empowering the community to build and collaborate on high-quality datasets, we’ll bring full support for multimodal datasets and we’ll be in a better place to collaborate with the open source AI community,” said Vila Suero.

About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
