Data quality remains one of the biggest challenges facing AI projects

by Max Smolaks 14 August 2019

American data labeling startup Alegion has raised $12 million in a Series A funding round, led by RHS Investments.

The company uses a combination of clever algorithms and human expertise to prepare datasets for use in machine learning projects.

Alegion said it will spend the money on expanding its Active Learning capabilities – a type of semi-supervised machine learning in which an algorithm can interactively query a human teacher when it is unsure how to classify an example. This should help minimize the amount of human time spent on boring and repetitive data labeling tasks.
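
To make the idea concrete, here is a minimal sketch of one common active learning strategy, uncertainty sampling, in which the model sends only its least confident predictions to a person for labeling. It is not Alegion's implementation, which is proprietary; the function names, the `predict_proba` and `ask_human` callbacks, and the `budget` parameter are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def uncertainty(prob_positive: float) -> float:
    """Score in [0, 1]: 1 when the predicted probability is 0.5 (least sure),
    0 when it is exactly 0 or 1 (most sure)."""
    return 1.0 - abs(prob_positive - 0.5) * 2.0


def select_queries(probs: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` most uncertain unlabeled items."""
    ranked = sorted(
        range(len(probs)),
        key=lambda i: uncertainty(probs[i]),
        reverse=True,
    )
    return ranked[:budget]


def active_learning_round(
    pool: Sequence[float],
    predict_proba: Callable[[float], float],  # hypothetical model callback
    ask_human: Callable[[float], bool],       # hypothetical human annotator
    budget: int,
) -> Tuple[List[Tuple[float, bool]], List[float]]:
    """Run one round: score the pool, query a human on the hardest items,
    and return (newly labeled examples, items still unlabeled)."""
    probs = [predict_proba(x) for x in pool]
    chosen = select_queries(probs, budget)
    newly_labeled = [(pool[i], ask_human(pool[i])) for i in chosen]
    chosen_set = set(chosen)
    remaining = [x for i, x in enumerate(pool) if i not in chosen_set]
    return newly_labeled, remaining
```

In a full loop, the newly labeled examples would be added to the training set, the model retrained, and the round repeated until the labeling budget runs out. Items the model already classifies with confidence never reach a human, which is where the time saving comes from.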

“Artificial Intelligence’s insatiable demand for accurate training data can’t be provided through human power alone,” said Hank Seale, founder of RHS Investments. “Alegion’s ability to supplement human effort with machine learning is strongly differentiating.”

In order to create accurate machine learning models, data scientists need increasingly large datasets – and any inconsistencies or errors will have a direct impact on the quality of the models created. According to Alegion’s own research, 96 per cent of data scientists have encountered data quality and labeling challenges in their work.

Alegion annotates raw data so it can be understood by machines, with the bulk of the work handled by proprietary software, assisted by human experts. Its customers include Airbnb, Walmart and Microsoft, to name a few.

The company is headquartered in Austin, Texas, and has a development office in Kuala Lumpur, Malaysia.

“Just as assembly lines incorporate power tools and robotics to enable scale, ML model development will require machines training machines to achieve the highest levels of model confidence,” said Nathaniel Gates, CEO and founder of Alegion.

“Our customers can first leverage human judgement to train their model and then watch as newly trained machines are incorporated that allow unprecedented scaling.”