Meta Opens Access to AI Projects, Makes Research More Readily Available

Meta opens up research access to FAIR projects like multimodal Chameleon AI and tools to detect synthetic speech deepfakes

Ben Wodecki, Jr. Editor

June 24, 2024

4 Min Read
Meta logo displayed at the Viva Technology show at Parc des Expositions Porte de Versailles
Chesnot/Getty Images

Meta is opening up access to several AI projects being developed by its Fundamental AI Research (FAIR) team, including the powerful multimodal model Chameleon and a method for detecting AI-generated speech.

The company said that by making some of FAIR’s research projects more widely available, it hopes to “inspire iterations and ultimately help advance AI in a responsible way.”

“These new AI model and dataset releases are part of our longstanding commitment to open science and I look forward to sharing even more work like this from the brilliant minds at FAIR,” Joëlle Pineau, vice president of Meta’s FAIR team, said in a post on X (formerly Twitter).

Among the new releases is Chameleon, a generative AI model that handles both text and images, as input and as output.

The model’s unified architecture enables it to generate high-quality visuals and text. Meta previewed the model in May, showing it is competitive with other multimodal models such as OpenAI’s GPT-4V, Mistral’s Mixtral 8x7B and Google’s Gemini Pro.
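Chameleon’s unified architecture comes from representing both modalities in a single token space: image content is quantized into discrete codes that share one sequence with text tokens. The toy Python sketch below illustrates that general idea of a shared vocabulary; the vocabulary sizes and helper names are invented for illustration and are not Meta’s actual implementation.

```python
# Toy sketch of a unified token sequence mixing text and image tokens,
# in the spirit of early-fusion multimodal models (not Meta's actual code).
TEXT_VOCAB = 32_000          # assumed text vocabulary size
IMAGE_VOCAB = 8_192          # assumed discrete image-codebook size

def text_token(tid):
    return tid                    # text ids occupy [0, TEXT_VOCAB)

def image_token(cid):
    return TEXT_VOCAB + cid       # image codes are offset into the shared space

# One sequence the transformer sees: caption tokens followed by image-patch codes
sequence = [text_token(t) for t in (17, 943, 2048)] + \
           [image_token(c) for c in (5, 1200, 8191)]
```

Because text and image tokens live in one vocabulary, the same model can consume and emit either modality anywhere in the sequence, which is what lets a single architecture handle both directions.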

Meta has made only text-focused versions of Chameleon available for research purposes “at this time,” including a smaller 7 billion-parameter version and a larger 34 billion-parameter one.

While its researchers believe the “possibilities are endless” for Chameleon, the company has opted not to release the image generation version as “risks remain.”


To access the models, researchers must agree to a model-specific license that bars commercial use. The license also bars users based in Illinois or Texas from accessing the models.

Meta is also making available AudioSeal, a watermarking tool capable of detecting synthetically generated speech.

AudioSeal can detect AI-generated speech segments within longer audio clips. Its localized detection approach runs up to 485 times faster than previous detection solutions.

According to Meta, the AI audio detection tool is suitable for large-scale and real-time applications.

Unlike Chameleon, it’s being made available under a commercial license, giving enterprises the ability to develop synthetic content detection tools.

“AudioSeal revamps classical audio watermarking by focusing on the detection of AI-generated content rather than steganography,” Meta said. “Unlike traditional methods that rely on complex decoding algorithms, AudioSeal’s localized detection approach allows for faster and more efficient detection.”
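Localized detection can be pictured as scoring each short frame of audio against a known watermark pattern, rather than decoding the whole clip at once. Below is a deliberately simplified numpy sketch of frame-level watermark detection by correlation; the frame size, amplitudes, and threshold are invented for illustration, and this is not AudioSeal’s actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
FRAME = 160                                 # 10 ms frames at 16 kHz (toy value)
watermark = rng.normal(size=FRAME) * 0.05   # fixed low-amplitude pattern (toy stand-in)

def embed(audio):
    """Add the watermark pattern to every frame (simulates marked AI speech)."""
    marked = audio.copy()
    for i in range(0, len(marked) - FRAME + 1, FRAME):
        marked[i:i + FRAME] += watermark
    return marked

def detect(audio, threshold=0.5):
    """Correlate each frame with the known pattern -> per-frame detection mask."""
    mask = []
    for i in range(0, len(audio) - FRAME + 1, FRAME):
        frame = audio[i:i + FRAME]
        score = frame @ watermark / (watermark @ watermark)
        mask.append(score > threshold)
    return np.array(mask)

clean = rng.normal(size=16_000) * 0.05           # 1 s of unmarked audio
fake = embed(rng.normal(size=16_000) * 0.05)     # 1 s of watermarked audio
mask = detect(np.concatenate([clean, fake]))     # frames in the marked half score high
```

Because each frame is scored independently with a cheap dot product, the mask pinpoints which portions of a clip carry the mark, which is the intuition behind the speed advantage over whole-clip decoding.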

Another piece of audio-related AI research being made available by Meta is JASCO, a music generation model that creates tunes from text.


JASCO, which stands for Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation, can take text prompts and turn them into music compositions.

For example, a user could ask JASCO to create an “80s driving pop song with electronic drums and synth pads in the background” and the model will output the desired audio.

While the JASCO model isn’t being made available yet, Meta has published a research paper outlining how the model works and a sample page containing a variety of JASCO-generated audio snippets.

The company says it plans to release the JASCO inference code as part of its AudioCraft generative AI music repository.

Away from audio, Meta’s FAIR team has also been working on multi-token prediction.

Traditional large language models are designed to predict the next word in a sequence. Meta’s researchers argue that while this approach is simple, it is inefficient.

Instead, Meta’s researchers trained a large language model to predict multiple future words at once.
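Conceptually, multi-token prediction attaches several output heads to one shared trunk, so a single forward pass scores the next several tokens instead of just one. The numpy sketch below illustrates that shape of computation with random weights; the dimensions and names are invented for illustration, and this is not Meta’s model.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, DIM, N_HEADS = 50, 16, 4  # N_HEADS = number of future tokens predicted at once

# Shared trunk: one hidden representation per position (stand-in for a transformer)
trunk = rng.normal(size=(DIM, DIM))
# One independent output head per future-token offset
heads = rng.normal(size=(N_HEADS, VOCAB, DIM))

def predict_future_tokens(hidden):
    """From one hidden state, score the next N_HEADS tokens in parallel."""
    h = trunk @ hidden            # shared computation, done once
    logits = heads @ h            # (N_HEADS, VOCAB): one distribution per offset
    return logits.argmax(axis=1)  # greedy pick for tokens t+1 .. t+N_HEADS

hidden_state = rng.normal(size=DIM)
tokens = predict_future_tokens(hidden_state)  # 4 tokens from a single forward pass
```

A conventional next-token model corresponds to N_HEADS = 1; the extra heads let each pass emit more tokens, which is where the efficiency gain comes from.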

The model, which is being made available for research purposes only, can produce outputs faster and more efficiently.

Meta hopes other research teams will build on its multi-token work.

Other FAIR work being made available includes DIG In, a tool that lets developers assess the diversity of their image generation models.

Meta’s researchers collected data on perceived geographic representation, including more than 65,000 annotations and more than 20 survey responses evaluating examples of AI-generated imagery. 

By making these evaluation tools publicly available, Meta aims to encourage more AI teams to prioritize geographic fairness and representation when training their image generation models.


About the Author

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
