Model is smaller in size and access is limited to researchers

Ben Wodecki, Jr. Editor

February 27, 2023

2 Min Read
Meta AI: LLaMA Language Model

At a Glance

  • Meta AI introduces LLaMA, a smaller language model that runs on less computing power.
  • LLaMA is designed for researchers to test language models for specific use cases.
  • LLaMA is trained on 20 languages that use Latin or Cyrillic scripts.

After OpenAI threw down the large language model gauntlet with ChatGPT, rivals have scrambled to catch up. Google has Bard. And now Meta has thrown its hat in the ring with the unveiling of a new model, LLaMA, although access is limited to researchers.

The model, whose name is an acronym for Large Language Model Meta AI, is smaller than its contemporaries, as it is built for research communities that do not have access to large amounts of infrastructure. LLaMA is available in various sizes, ranging from seven billion parameters up to 65 billion parameters.

Despite being 162 billion parameters smaller, LLaMA-13B outperforms OpenAI’s GPT-3 “on most benchmarks,” according to Meta’s paper outlining the models.

The largest model, LLaMA-65B, is reportedly “competitive” with models like DeepMind’s Chinchilla-70B and Google’s PaLM-540B.

LLaMA is a foundational model: It is trained on a large set of unlabeled data, which makes it easier for researchers to fine-tune the model for a specific task. And since the models are smaller, they are easier to retrain for particular use cases.

And LLaMA was not built solely on English text. Meta trained its model on 20 languages that use Latin or Cyrillic scripts. However, most of the training data is in English, so the model performs best in that language.


Smaller is better - for researchers

Meta’s researchers claim that access to current large language models is limited because of the size of the models.

“This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues, such as bias, toxicity, and the potential for generating misinformation,” Meta argues.

As well as making the models smaller, Meta’s attempt to make LLaMA more accessible includes releasing it under a non-commercial license.

Access to the various LLaMA models will be granted on a case-by-case basis to researchers, such as those affiliated with governments, civil society organizations and academia. To apply for access to LLaMA, head here.

Like ChatGPT and others, LLaMA shares the issues common to language models: generating toxic comments and erratic responses. Meta’s announcement of LLaMA acknowledges this, saying that by sharing the model, researchers can “more easily test new approaches to limiting or eliminating these problems in large language models.”

Meta’s research team also published a set of benchmark evaluations of model bias and toxicity to show the model’s limitations and to support further research in this crucial area.


LLaMA is Meta’s latest language model. Last May, the Facebook parent released OPT-175B, a large language model comparable in size to GPT-3. OPT can perform NLP tasks including generating poetry and writing code, use cases for which ChatGPT and others have been touted.


About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
