February 27, 2023
At a Glance
- Meta AI introduces LLaMA, a smaller language model that runs on less computing power.
- LLaMA is designed for researchers to test language models for specific use cases.
- LLaMA is trained on 20 languages that use Latin or Cyrillic scripts.
After OpenAI threw down the large language model gauntlet with ChatGPT, rivals have scrambled to catch up. Google has Bard. And now Meta has thrown its hat in the ring with the unveiling of a new model, LLaMA, although access is limited to researchers.
The model, whose name is an acronym for Large Language Model Meta AI, is smaller than its contemporaries, as it is built for research communities that do not have access to large amounts of infrastructure. LLaMA is available in several sizes, ranging from 7 billion parameters up to 65 billion parameters.
LLaMA is a foundational model: It is trained on a large set of unlabeled data, which makes it easier for researchers to fine-tune the model for a specific task. And since the models are smaller, they are easier to retrain for specific use cases.
And LLaMA was not built solely on English text. Meta trained its model on 20 languages that use Latin or Cyrillic scripts. However, most of the training data is in English, so the model performs best in that language.
Smaller is better - for researchers
Meta’s researchers claim that access to current large language models is limited because of the size of the models.
“This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues, such as bias, toxicity, and the potential for generating misinformation,” Meta argues.
As well as making the models smaller, Meta’s attempt to make LLaMA more accessible includes releasing it under a non-commercial license.
Access to the various LLaMA models will be granted on a case-by-case basis to researchers such as those affiliated with governments, civil society organizations, and academia. To apply for access to LLaMA, head here.
Like ChatGPT and others, LLaMA shares the issues other language models have of generating toxic comments and strange responses. Meta’s announcement of LLaMA acknowledges this, saying that by sharing the model, researchers can “more easily test new approaches to limiting or eliminating these problems in large language models.”
Meta’s research team also published a set of evaluations on benchmarks evaluating model biases and toxicity to show the model’s limitations and to support further research in this crucial area.
LLaMA is Meta’s latest language model. Last May, the Facebook parent released OPT-175B, a large language model on par in size with GPT-3. OPT can handle NLP tasks including generating poetry and writing code, use cases for which ChatGPT and others have been touted.