Meta unveils open source AI model that translates 200 languages

NLLB aims to make multilingual communication more seamless in the metaverse and beyond.

Ben Wodecki

July 7, 2022

Meta has released a new AI model that can translate over 200 different languages. This breakthrough is designed to make online content accessible to people in their native language – so they can communicate more readily in the metaverse and elsewhere.

NLLB-200 (No Language Left Behind) is designed to improve machine translation capabilities. According to Meta’s AI researchers, the model achieved on average 44% higher translation quality than the previous state of the art.

The model was evaluated using FLORES-200, a dataset that enables researchers to assess its performance in 40,000 different language directions.
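The 40,000 figure is roughly what one would expect from counting every ordered source-to-target pair among about 200 languages. The short sketch below illustrates that arithmetic. It assumes each ordered pair counts as one direction; the article does not spell out how the figure was derived, and the helper name and three-language sample are illustrative only.

```python
from itertools import permutations


def count_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    return num_languages * (num_languages - 1)


# Sanity check on a tiny, hypothetical set of language codes:
# every ordered pair of distinct languages is one translation direction.
sample = ["cat", "eng", "spa"]
assert len(list(permutations(sample, 2))) == count_directions(len(sample))

# Assumption: exactly 200 languages, each translatable into every other.
print(count_directions(200))  # 39800, i.e. roughly 40,000
```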

Both NLLB and FLORES are available to developers via GitHub, along with the model training code and code for re-creating the training dataset. The model was posted under an MIT license, meaning users are required to preserve copyright and license notices.

Awarding grants

Meta is also planning to award up to $200,000 in grants for impactful uses of NLLB-200 to researchers and nonprofits with initiatives focused on sustainability, food security, gender-based violence and education.

Nonprofits interested in using the model to translate two or more African languages, as well as researchers working in linguistics, machine translation and language technology, are invited to apply.

“NLLB-200 makes current technologies accessible in a wider range of languages, and in the future will help make virtual experiences more accessible, as well,” according to a company blog post.

According to Meta, English, Mandarin, Spanish and Arabic “dominate the web,” and NLLB will allow users to interact with content in their preferred language.

Potential use cases the company refers to include building digital assistants and creating subtitles for movies and TV.

But the company also touched on another potential use case, one that aligns with its newfound focus: the metaverse.

Almost a year on from Meta’s rebrand and shift towards virtual spaces and experiences, the company says NLLB-200 could “help make the metaverse accessible to more people around the world” by allowing technologies and virtual worlds to be built with multiple languages in mind.

Figure 1: Translating a Catalan recipe into English (Image credit: Meta)

NLLB is the latest in a series of AI models unveiled by Meta. Showcased in early June, LegoNN is designed to allow developers to reuse modules when building machine learning architectures.

The same month, Meta’s AI team released OPT-66B – an open source, 66-billion-parameter version of its OPT language model. Releasing the model in different sizes lets researchers study the effect of language model scaling, the company said.

And in late June, Meta, along with researchers from the University of Texas, published three open source AI models for audio-visual understanding of human speech and sounds in videos. The models are designed to improve acoustics for augmented reality experiences.

About the Authors

Ben Wodecki

Assistant Editor
