Lip reading, an essential tool that helps hearing-impaired people better understand the world, can now be performed by artificial intelligence more accurately than by humans, according to research from the University of Oxford.
An article published by Quartz reports that a new paper from the University of Oxford, funded by Alphabet’s DeepMind, describes an artificial intelligence system called LipNet that can read lips with an accuracy of 93.4%.
The University of Oxford had previously released a system that operated word by word with an accuracy of 79.9%, but the new system takes a different approach to the problem.
“Instead of teaching the AI each mouth movement using a system of visual phonemes, they built it to process whole [sentences] at a time. That allowed the AI to teach itself what letter corresponds to each slight mouth movement,” Quartz writes.
The new system was trained on 29,000 three-second videos labelled with the correct text. Human lip-readers had an average error rate of 47.7%, whereas the AI’s error rate was only 6.6%.
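The figures above can be cross-checked with a little arithmetic. The sketch below assumes that accuracy is simply 100% minus the error rate, which is consistent with the 93.4% and 6.6% reported for LipNet. The function name and the derived human accuracy are illustrative and are not stated in the article.

```python
def accuracy_from_error(error_rate_pct: float) -> float:
    """Convert an error rate in percent to an accuracy in percent.

    Assumes accuracy = 100 - error rate (an assumption, not from the paper).
    """
    return round(100.0 - error_rate_pct, 1)


# Error rates quoted in the article.
LIPNET_ERROR_PCT = 6.6
HUMAN_ERROR_PCT = 47.7

lipnet_accuracy = accuracy_from_error(LIPNET_ERROR_PCT)
human_accuracy = accuracy_from_error(HUMAN_ERROR_PCT)

# How many times more errors the human lip-readers made than the AI.
error_ratio = HUMAN_ERROR_PCT / LIPNET_ERROR_PCT

print(f"LipNet accuracy: {lipnet_accuracy}%")
print(f"Human accuracy (derived): {human_accuracy}%")
print(f"Human errors per LipNet error: {error_ratio:.1f}")
```

Under this assumption, the human lip-readers were right a little more than half the time, and made roughly seven times as many errors as the AI.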
Despite the success of the research, the researchers also came face to face with the limits of modern AI research. The team trained the AI on footage of people who were facing forward, well lit, and speaking in a standardised sentence structure, yet issues appeared regardless.
One particular sentence they experienced issues with was “Place blue in m 1 soon”, which consists of a command, a color, a preposition, a letter, a number from 1 to 10, and an adverb. All the sentences followed this pattern, so the AI’s extraordinary accuracy could be related to the fact that it was trained and tested in extraordinary conditions, Quartz writes.
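To make the constraint concrete, the fixed sentence pattern can be sketched as a small checker. This is a minimal illustration, not the researchers' code: only "place", "blue", "in", and "soon" come from the example sentence, while the other vocabulary entries and the function name are assumptions.

```python
# Illustrative vocabularies. Only "place", "blue", "in", and "soon" appear in
# the article's example; the remaining words are assumptions for this sketch.
COMMANDS = {"place", "set"}
COLORS = {"blue", "red"}
PREPOSITIONS = {"in", "at"}
ADVERBS = {"soon", "now"}


def matches_pattern(sentence: str) -> bool:
    """Check for: command, color, preposition, letter, number 1-10, adverb."""
    words = sentence.lower().split()
    if len(words) != 6:
        return False
    command, color, preposition, letter, number, adverb = words
    return (
        command in COMMANDS
        and color in COLORS
        and preposition in PREPOSITIONS
        and len(letter) == 1
        and letter.isalpha()
        and number.isascii()
        and number.isdigit()
        and 1 <= int(number) <= 10
        and adverb in ADVERBS
    )


print(matches_pattern("Place blue in m 1 soon"))
print(matches_pattern("The weather is nice today"))
```

A system that only ever sees sentences passing a check like this has far fewer possibilities to choose between than one reading unconstrained speech, which is the concern raised about the reported accuracy.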
It is therefore an open question whether the system would work as well if it were to read the lips of someone in a YouTube clip, for instance. At present, a perfect data set does not exist, author Nando de Freitas says. OpenAI’s Jack Clark said that getting this to work in the real world will require three major improvements: a large amount of video of people speaking in real-world situations, the ability to read lips from multiple angles, and a greater variety of phrases that the AI can predict.
“The technology has such obvious utility, though, that it seems inevitable to be built,” Clark writes. The potential of AI to read lips and improve the lives of hearing-impaired people is considerable, and it is definitely a technology worth exploring and improving.
This article was first found at: http://qz.com/829041/oxford-lip-reading-artificial-intelligence/