August 15, 2023
At a Glance
- Vicuna v1.5, the latest version of the popular language model from LMSYS Org, is now commercially available.
- Built on Meta's Llama 2 AI, Vicuna v1.5 has a larger context window and offers improved accuracy.
A new version of the Vicuna language model is now available for commercial use.
LMSYS Org unveiled Vicuna v1.5 earlier this month, revealing that it is built on Meta's new Llama 2 model.
The organization released the original version of Vicuna in April. It was a fine-tuned version of the first Meta LLaMA model and was not available for commercial use. Vicuna was largely fine-tuned on user-shared conversations collected from ShareGPT.
The team behind it said Vicuna has received over two million downloads on Hugging Face in July alone.
Vicuna v1.5 “follows the proven recipe and brings fresh enhancements,” LMSYS Org said when announcing the update on X (formerly Twitter).
You can try out Vicuna v1.5 via the LMSYS language model test area and even compare it with the likes of Alpaca and MPT-Chat.
Vicuna now understands more
The new version of Vicuna has an extended context window – the largest version has a 16K context length, meaning Vicuna can handle roughly 16,000 tokens, or around 20 pages of text, per prompt.
The team behind Vicuna uses ‘position interpolation’ to achieve the larger context window.
Devised by AI researchers at Meta, position interpolation scales the position indices of a pre-trained large language model like Llama 2 down so they fall between the integer positions the model was trained on, rather than extrapolating to positions beyond the trained range. It's a complex concept – the paper outlining Position Interpolation breaks it down: “Position Interpolation linearly down-scales the input position indices to match the original context window size, rather than extrapolating beyond the trained context length which may lead to catastrophically high attention scores that completely ruin the self-attention mechanism.”
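The down-scaling the paper describes can be sketched in a few lines. This is a minimal illustration, not LMSYS's actual implementation – the function name and the example lengths (a model trained on a 2K context, extended to 8K) are our own for demonstration:

```python
def interpolated_positions(seq_len: int, trained_len: int) -> list[float]:
    """Linearly down-scale position indices so a longer sequence fits
    inside the position range the model saw during training."""
    scale = trained_len / seq_len  # < 1 when extending the context
    return [i * scale for i in range(seq_len)]

# Extend a 2K-trained model to an 8K context window.
positions = interpolated_positions(seq_len=8192, trained_len=2048)

# Every index now lies within [0, 2048) – the trained range – so the
# model interpolates between familiar positions instead of extrapolating.
assert max(positions) < 2048
```

In practice these fractional positions feed into the model's rotary position embeddings, which accept non-integer inputs, so no architectural change is needed – only a short fine-tune at the longer length.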
Vicuna v1.5 is available with 4K and 16K context lengths; earlier versions were limited to a 2K context length.
Each version of Vicuna is released separately – meaning you can download your preferred parameter size with either a 4K or 16K context length. Users can access all weights and sizes via the LMSYS Org page on Hugging Face.
Vicuna v1.5 Performance
Vicuna v1.5 blows the original versions out of the water in terms of accuracy.
On the MT-Bench benchmark, which assesses a model's dialogue capabilities, the new Vicuna models scored better than their predecessors. The base 7B v1.5 scored slightly better than the old Vicuna 7B’s score of 6.00, while the new 13B version scored higher than the old 13B’s 6.39.
On the popular MMLU test, the new versions scored far higher than the initial version, with 13B-based v1.5 easily surpassing the old 13B's score of 52.1.
The newly unveiled versions with improved context lengths scored even higher on both tests. Only the 7B v1.5 16K scored slightly lower than the new base 7B model.
The full results are below.