AI2: OLMo Large Language Model is ‘Truly Open Source’
AI2, Microsoft co-founder Paul Allen's nonprofit research group, reveals all of OLMo's training data and the test used to evaluate it
At a Glance
- AI2, Microsoft co-founder Paul Allen's nonprofit research group, unveiled its large language model OLMo.
- AI2 calls OLMo 'truly open source.' Users can access its model code and weights, as well as its training code, data and evaluation suite.
- OLMo comes in four variants, all around 7 billion parameters, putting it in competition with Llama 2 7B and Mixtral 8x7B.
The Allen Institute for AI (AI2) has released OLMo, which it describes as a “truly open source” large language model that companies can use to build applications – while knowing what went into building the model itself.
OLMo was published alongside its model code and weights, as well as its training code, data and evaluation suite, meaning users can see exactly how it was designed, trained and evaluated.
OLMo was built on Dolma, a three-trillion-token dataset created by AI2. The model itself comes in four variants, all around seven billion parameters, putting it in competition with the likes of Llama 2 7B from Meta and Mistral’s Mixtral 8x7B.
AI2 was founded by Paul Allen, the late Microsoft co-founder. The non-profit research group said the new open source model would “empower academics and researchers to study the science of language models collectively.”
The research lab said that by providing full access to how the model was trained, OLMo would mean less carbon is emitted when fine-tuning the model, as an open approach “radically reduces developmental redundancies, which is critical in the decarbonization of AI.”
Opening up such a system would also speed up researchers' work, as they “no longer need to depend on qualitative assumptions of model performance,” AI2 said.
Big names like Meta have pledged to open source their AI systems, but AI experts disagree over what truly constitutes an open AI model. Just this week, an Amazon machine learning strategist questioned what it means for a model to be open source during an industry conference in London.
Other models, such as Bloom and Falcon, have taken a similarly transparent approach. Eric Horvitz, Microsoft’s chief scientific officer and a founding member of the AI2 Scientific Advisory Board, expressed enthusiasm about this latest open model available to AI researchers.
Meta’s Chief AI Scientist Yann LeCun, in response to OLMo’s release, said that the community that comes from open source is “the fastest and most effective way to build the future of AI.”
Dolma and Paloma
Alongside the release of OLMo, AI2 also unveiled Paloma, a benchmark for evaluating open language models.
Paloma tests models on natural language processing tasks across multiple domains, ranging from niche artist communities to Reddit forums on mental health – making the benchmark applicable to a wide range of areas that model testers might want to cover.
Companies can also make use of Dolma, OLMo’s pretraining dataset. Comprised of three trillion tokens from web content, academic publications and books, the dataset is generally available for commercial applications and can be accessed via the Hugging Face Hub.