AI Business is part of the Informa Tech Division of Informa PLC

Speechmatics launches Autonomous Speech Recognition to rival tools from Google and AWS

Trained on 1.1m hours of unlabeled audio data

Cambridge-based speech recognition firm Speechmatics has launched the Autonomous Speech Recognition engine.

The platform can detect voices regardless of accent and dialect – with the firm claiming it outperformed similar models from the likes of AWS, Google, and Apple.

“Our focus in tackling AI bias has led to this monumental leap forward in the speech recognition industry and the ripple effect will lead to changes in a multitude of different scenarios,” Katy Wigdahl, CEO of Speechmatics, said.

“Think of the incorrect captions we see on social media, court hearings where words are mistranscribed and eLearning platforms that have struggled with children's voices throughout the pandemic. Errors people have had to accept until now can have a tangible impact on their daily lives.”

Understanding voices

Speechmatics was founded in 2006 by Dr. Tony Robinson – a pioneer in applying recurrent neural networks to speech recognition. The company launched its cloud-based speech recognition services in 2012.

It has raised a total of $8.2m in funding across two rounds, most recently bringing in £6.4m ($8.8m) in a Series A in late 2019. AlbionVC and IQ Capital led that round.

Now, Speechmatics has launched a new speech recognition engine that it says delivers improved accuracy.

The startup said that when using datasets from Stanford’s ‘Racial Disparities in Speech Recognition’ study, its software bested other systems for African American voices, with an accuracy score of 82.8 percent compared to Google (68.7 percent) and Amazon (68.6 percent).

Speechmatics said its software also outperformed competitors on children’s voices – recording 91.8 percent accuracy compared to Google (83.4 percent) and Deepgram (82.3 percent).

Such accuracy equates to a 45 percent reduction in speech recognition errors – the equivalent of three words in an average sentence, the company said.
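
As a rough check on that figure, the short Python sketch below computes a relative error reduction from two accuracy scores. It assumes that "accuracy" means 100 percent minus the error rate and that the comparison is against Google's reported score for African American voices; the article does not say which baseline the company used, and the function name is illustrative only.

```python
def relative_error_reduction(baseline_accuracy, new_accuracy):
    """Return the fraction of the baseline system's errors that the new system removes.

    Both arguments are accuracy scores in percent. The error rate is assumed
    to be 100 minus the accuracy.
    """
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


# Accuracy scores quoted in the article for African American voices.
# Using Google as the baseline is an assumption made for this check.
reduction_vs_google = relative_error_reduction(68.7, 82.8)
print(f"{reduction_vs_google:.0%}")  # prints "45%"
```

Under those assumptions, the error rate falls from 31.3 percent to 17.2 percent, which is consistent with the 45 percent reduction the company cites.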

“It's critical to study and improve fairness in speech-to-text systems given the potential for disparate harm to individuals through downstream sectors ranging from healthcare to criminal justice,” said Allison Zhu Koenecke, lead author of the Stanford study on speech recognition.

Speechmatics’ technology is trained on unlabeled data taken directly from the internet, such as social media content and podcasts. Using self-supervised learning, its software has now been trained on 1.1m hours of audio.

“This delivers a far more comprehensive representation of all voices and dramatically reduces AI bias and errors in speech recognition,” the company’s launch statement reads.
