New AI Avatars Can Blink, Have Realistic Facial Expressions

Using its underlying Express-1 model, Synthesia’s AI avatars generate realistic video content

Ben Wodecki, Jr. Editor

May 1, 2024

2 Min Read
Image: AI avatars displaying a range of emotions (Credit: Synthesia)

Avatar developer Synthesia has introduced its latest generation of synthetic people, now capable of more expressive speech.

Founded in 2017, Synthesia offers a platform that can turn text into business videos in minutes. The BBC, Nike and Google are among the companies that have used Synthesia to create custom avatars, which businesses can use to onboard staff or in marketing materials.

Synthesia’s new AI avatars are powered by its new Express-1 model, which enables them to deliver more realistic performances. Its avatars now blink and make facial expressions that align with what they’re saying.

The company touts its latest line of avatars as “digital actors” capable of reading text scripts in the way a human would, using appropriate tones and inflections.

The Express-1 model is built from a series of pre-trained models that together enable the avatar to understand the contents of a script.

Synthesia said its previous avatars were limited by pre-defined routines, which restricted their performances.

With its new underlying model, Synthesia’s platform can predict, in real time, the facial movements and expressions needed to speak a piece of written text, applying the right intonation and emphasis where needed.

If the model fails to produce the expected output, users can regenerate it until the desired result is achieved.
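Synthesia has not published how Express-1 works internally, but the workflow it describes — analyze a script, predict matching expressions and emphasis, then regenerate until the result looks right — can be illustrated with a toy sketch. Everything below (the `predict_performance` and `render_video` functions, the keyword lists, the seed-based regenerate loop) is hypothetical and only mirrors the general idea, not Synthesia’s actual model or API.

```python
# Toy illustration of the script -> performance -> regenerate workflow described above.
# All names and logic here are hypothetical; they do not reflect Synthesia's Express-1 model.
import random
from dataclasses import dataclass

CHEERFUL = {"great", "welcome", "congratulations", "excited"}
SOMBER = {"unfortunately", "regret", "sorry", "difficult"}


@dataclass
class Cue:
    sentence: str
    expression: str  # e.g. "smile", "concerned", "neutral"
    emphasis: bool   # whether to stress this sentence


def predict_performance(script: str) -> list[Cue]:
    """Assign a rough facial expression and emphasis flag to each sentence of a script."""
    cues = []
    for sentence in filter(None, (s.strip() for s in script.split("."))):
        words = {w.lower().strip(",!?") for w in sentence.split()}
        if words & CHEERFUL:
            expression = "smile"
        elif words & SOMBER:
            expression = "concerned"
        else:
            expression = "neutral"
        cues.append(Cue(sentence, expression, emphasis=bool(words & CHEERFUL)))
    return cues


def render_video(cues: list[Cue], seed: int) -> str:
    """Stand-in for the rendering step; a real system would synthesize video frames here."""
    random.seed(seed)
    return f"take_{seed}: " + " | ".join(
        f"{c.expression}{'*' if c.emphasis else ''}" for c in cues
    )


if __name__ == "__main__":
    script = (
        "Welcome to the team, we are excited to have you. "
        "Unfortunately, the office is closed on Friday."
    )
    cues = predict_performance(script)
    # Mirror the regenerate-until-satisfied loop: try new seeds until the output looks right.
    for attempt in range(3):
        print(render_video(cues, seed=attempt))
```

In a real text-to-video system, the prediction step would be learned rather than keyword-based and the rendering step would produce video, but the regenerate loop captures the user-facing behavior the article describes.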


“Whether the conversation is cheerful or somber, our avatars adjust their performance accordingly, displaying a level of empathy and understanding that was once the sole domain of human actors,” wrote Jon Starck, Synthesia’s CTO, in a blog post.

“The generative capabilities of these new avatars also extend beyond mere motion. Their facial expressions, blinking and even eye gaze are now perfectly attuned to their speech. Expressive Avatars synchronize flawlessly with audio inputs, ensuring that every gesture and expression aligns perfectly with the spoken word,” Starck said. “This harmony of motion and sound elevates the realism of our avatars and captures every nuance of human expression, bringing our avatars to life like never before.”

AWS, whose cloud infrastructure hosts Synthesia’s service, was among the companies given early access.

Tanuja Randery, AWS’s managing director for Europe, Middle East and Africa, said Synthesia’s technology has the potential to “deliver engaging business communications in many different languages and scenarios that simply wouldn’t be possible otherwise.”

Synthesia has measures in place to ensure its avatars are used responsibly, including restricting certain inputs and employing tools to check content credentials.

To date, Synthesia’s AI-powered platform has been used to generate more than 18 million video presentations across more than 130 languages.

About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
