AI Business is part of the Informa Tech Division of Informa PLC


Baidu's Deep Voice AI Can Talk Like a Human Being

by Ed Lauder

Baidu has developed what may be the world's most advanced speech synthesis AI, called Deep Voice, which can talk like a human being.

Before Deep Voice came along, Google's speech synthesis program, WaveNet, was the most advanced in the world. Baidu has now gone one better. WaveNet was developed by DeepMind, Google's AI research company, and generates speech from text as a raw audio waveform. Deep Voice uses deep learning techniques at every stage, starting by breaking text down into phonemes, the smallest units of sound needed to speak a language accurately.
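To make the idea of phonemes concrete, here is a minimal sketch of turning text into a phoneme sequence. It assumes a tiny hand-written dictionary with ARPAbet-style symbols; the names `TOY_LEXICON` and `text_to_phonemes` are hypothetical, and Deep Voice itself learns this mapping with a neural network rather than a lookup table.

```python
# Toy grapheme-to-phoneme lookup (illustrative only).
# TOY_LEXICON is a hypothetical hand-written table; a system like
# Deep Voice learns the text-to-phoneme mapping instead.
TOY_LEXICON = {
    "deep": ["D", "IY", "P"],
    "voice": ["V", "OY", "S"],
    "can": ["K", "AE", "N"],
    "talk": ["T", "AO", "K"],
}


def text_to_phonemes(text):
    """Return a flat list of phoneme symbols for the words in text."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        # Words missing from the table are marked rather than guessed.
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes


print(text_to_phonemes("Deep Voice can talk."))
```

A later stage of a text-to-speech pipeline would then turn each phoneme into audio; that part is not shown here.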

Deep Voice was developed in Baidu's Silicon Valley lab. It is a major breakthrough in speech synthesis because it does away with much of the hand-engineered processing that traditional systems depend on, which means it can learn to talk accurately in just a few hours without human help. This is thanks to the deep learning techniques the system uses: all the researchers needed to do was train it.

“For the audio synthesis model, we implement a variant of WaveNet that requires fewer parameters and trains faster than the original,” wrote the Baidu researchers in a study published online. “By using a neural network for each component, our system is simpler and more flexible than traditional text-to-speech systems, where each component requires laborious feature engineering and extensive domain expertise.”

Of course, speech synthesis of this sort isn't new. It is present in most of our mobile devices, and the simplest versions can be found in some modern alarm clocks and even in automated answering-machine messages. Where Deep Voice differs is that it can generate free-flowing human speech, as opposed to speech pieced together from large databases of human voice recordings.

Deep Voice therefore represents a big step towards Baidu's goal of creating a truly human-like personal assistant, as opposed to one that uses pre-recordings to mimic intelligence. Deep Voice will actually be able to talk to you like a real human being. “We optimize inference to faster-than-real-time speeds, showing that these techniques can be applied to generate audio in real-time in a streaming fashion,” said Baidu's researchers in an interview with MIT Technology Review.

They continued, “To perform inference at real-time, we must take great care to never recompute any results, store the entire model in the processor cache (as opposed to main memory), and optimally utilize the available computational units.”
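The point about never recomputing results can be illustrated with a small sketch. It assumes a toy stack of dilated causal layers, where each layer combines its current input with its input from a few steps earlier. The layer function, the dilations and the class name are hypothetical stand-ins, not Baidu's implementation. The naive version recomputes every layer over the whole history for each new sample, while the cached version keeps a short queue per layer and does a fixed amount of work per step.

```python
from collections import deque

# Hypothetical toy stack; not the layout used by Deep Voice or WaveNet.
DILATIONS = [1, 2, 4]


def layer(current, past):
    """Stand-in for a learned two-tap dilated causal convolution."""
    return 0.5 * current + 0.25 * past


def naive_output(samples, t):
    """Recompute every layer over the full history to get the output at step t."""
    acts = list(samples[: t + 1])
    for d in DILATIONS:
        acts = [
            layer(acts[i], acts[i - d] if i >= d else 0.0)
            for i in range(len(acts))
        ]
    return acts[t]


class CachedStack:
    """Keep only the last d inputs of each layer so nothing is recomputed."""

    def __init__(self):
        # Each queue starts with zeros, matching the zero padding in naive_output.
        self.queues = [deque([0.0] * d, maxlen=d) for d in DILATIONS]

    def step(self, sample):
        value = sample
        for queue in self.queues:
            past = queue[0]      # this layer's input from d steps ago
            queue.append(value)  # store current input; the oldest entry drops out
            value = layer(value, past)
        return value


samples = [0.1, -0.3, 0.7, 0.2, -0.5, 0.4]
stack = CachedStack()
streamed = [stack.step(s) for s in samples]
recomputed = [naive_output(samples, t) for t in range(len(samples))]
print(all(abs(a - b) < 1e-12 for a, b in zip(streamed, recomputed)))
```

Both versions produce the same outputs, but the cached one touches each layer once per new sample, which is the kind of saving that makes streaming generation practical. Keeping the model in processor cache and tuning for specific hardware, as the researchers describe, are separate optimizations not shown here.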

At the moment, Deep Voice is too demanding for our current devices, but in time our phones, watches and tablets should become capable of running it.
