Experts in AI: A deep dive into speech recognition

Plus, the first-ever AI Business interview available as a podcast!

by Max Smolaks 17 January 2020

Speech recognition remains one of the most popular applications of machine learning, but that doesn’t mean the technology is well understood by the average business user.

To find out about the latest trends in this rapidly developing field, AI Business sat down with Sam Ringer, machine learning engineer at Speechmatics, a British company that develops automatic speech recognition software based on recurrent neural networks and statistical language modeling.

We talked about unsupervised learning algorithms, the cost of training for transcription systems, and approaches to data labeling at Tesla and Facebook.

As promised, here’s the podcast version of the talk:

“We train transcription systems at Speechmatics. Typically, some of our English models will be trained using about 5,000-6,000 hours of labeled data. And getting a hold of that data is hard. It’s expensive. You’ve got to make sure it covers all the domains that you’re interested in. Also, it can be mislabeled,” Ringer told AI Business.

“I think it is underappreciated that humans really don’t need very much labeled data; we can just explore the world and pick up patterns in a very unsupervised way. So it is at least theoretically possible to come up with strong unsupervised solutions, or weakly supervised solutions, to these problems, because we can do it.”