A competitor to Meta’s Llama 2 that outperforms the original LLaMA
A new open source language model has emerged for companies and researchers to use, with restrictions, that can handle tasks in both English and Chinese.
Baichuan-13B comes from Baichuan Intelligence, a startup founded by Wang XiaoChuan, former CEO of Sogou, a technology subsidiary of Tencent. The startup is regarded in China as a promising creator of large language models.
Baichuan-13B has 13 billion parameters and is an expanded version of the startup’s earlier seven-billion-parameter model. A base version was published, as well as a version tuned for chat applications, which can be used to generate copy for ads or answer user queries.
The 13 billion-parameter model is commercially usable, though companies seeking to use it have to apply for and receive “official commercial permission” by emailing the Baichuan team.
The startup’s researchers contend that its models are comparable to the ever-growing number of open source large language models available.
Notably, the Baichuan team compared its models to LLaMA, Meta’s original open source language model that has formed the basis for other open source models, including Alpaca and Gorilla.
For example, they highlight that Baichuan-13B was trained on 1.4 trillion tokens, some 40% more than the same-size version of LLaMA.
To achieve better inference performance, Baichuan-13B uses the ALiBi (Attention with Linear Biases) method, which allows Transformer-based models to consume longer sequences than what they were trained on.
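The idea behind ALiBi is simple: instead of adding positional embeddings to the input, the model adds a distance-proportional penalty directly to each attention score, with a fixed per-head slope. A minimal NumPy sketch of that bias computation (function names are illustrative, not from the Baichuan codebase; slopes follow the geometric sequence from the ALiBi paper for power-of-two head counts):

```python
import numpy as np

def alibi_slopes(n_heads):
    # One slope per attention head, forming a geometric sequence
    # (valid as written when n_heads is a power of two).
    start = 2 ** (-8 / n_heads)
    return np.array([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads, seq_len):
    # Bias added to raw attention scores before softmax:
    # each query position i is penalized by slope * (i - j) for
    # attending to an earlier key position j, so nearer tokens
    # get higher scores. Future positions (j > i) are clipped to
    # zero here; a separate causal mask removes them entirely.
    slopes = alibi_slopes(n_heads)                        # (heads,)
    offsets = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    offsets = np.minimum(offsets, 0)                      # keep only past distances
    return slopes[:, None, None] * offsets[None, :, :]    # (heads, query, key)
```

Because the penalty grows linearly with distance rather than being tied to learned position embeddings, the same bias formula extends naturally to sequences longer than those seen in training, which is what lets ALiBi models extrapolate at inference time.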
Compared to the standard LLaMA-13B, Baichuan-13B shows an improvement in average inference speed, measured in tokens per second: in tests generating 2,000 tokens, Baichuan-13B was 31.6% faster. On the C-Eval benchmark, which measures a model’s capabilities in Chinese, Baichuan-13B’s base and chat versions also outperformed the Chinese versions of LLaMA and Alpaca, as well as the popular Vicuna AI model.
Similar results were achieved when the models were tested on the MMLU benchmark.
Full results are accessible via the Baichuan-13B’s Hugging Face page.
The Baichuan-13B model was not, however, compared with Llama 2, the latest version of Meta’s popular open source large language model, which was released in mid-July.
The Baichuan researchers “strongly urge” users not to deploy Baichuan-13B for uses that “harm national social security” nor should it be used for “internet services that have not undergone appropriate security review and filing.”
The stipulations come as Chinese authorities are looking to tighten controls on generative AI including enacting rules stipulating that content generated by AI cannot subvert state power, incite secession or disrupt social order.
The Baichuan team said it did "our utmost" to ensure the compliance of the data used to train the model, and that it will not take responsibility if the model is put to potentially problematic uses.