Alibaba Publishes Open Source AI Model for Commercial Use

New Qwen-7B AI model outpaces ChatGPT in Chinese tests

Ben Wodecki, Jr. Editor

August 8, 2023

At a Glance

  • Alibaba has open sourced Qwen-7B, a 7 billion-parameter AI model, free for commercial use.
  • A scaled-down version of its Tongyi Qianwen model, Qwen shows stronger performance than ChatGPT on Chinese evaluations.

Alibaba’s cloud division is open sourcing two large language models in a bid to expand its AI reach – and they are free for most commercial users.

Alibaba Cloud is opening up access to Qwen, a smaller version of its Tongyi Qianwen model released in July.

The Chinese company is publishing a seven-billion-parameter base version of Qwen (Qwen-7B) and a same-size version fine-tuned for conversational applications (Qwen-7B-Chat).

The base version of Qwen was pretrained on a corpus of both Chinese and English web texts, books and code totaling 2.2 trillion tokens.

Alibaba said the model was built to use a vocabulary that is “more friendly to multiple languages, enabling users to directly further enhance the capability for certain languages without expanding the vocabulary.”

The code and checkpoints are available for commercial use. Companies with fewer than 100 million monthly active users can use the models commercially for free. However, those with larger user bases need to request a license from Alibaba Cloud.

Researchers can also access the Qwen models via Alibaba Cloud’s AI model repository ModelScope, as well as Hugging Face.
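For readers who want to try the Hugging Face route, the following is a minimal sketch using the `transformers` library. The repository ID `Qwen/Qwen-7B` and the need for `trust_remote_code=True` are assumptions not stated in this article, and the helper name `generate_text` is hypothetical; check the model's Hugging Face page before relying on either.

```python
# Assumed repository ID; confirm it on the model's Hugging Face page.
MODEL_ID = "Qwen/Qwen-7B"


def generate_text(prompt, model_id=MODEL_ID, max_new_tokens=64):
    """Download the model (several GB) and return a completion for `prompt`."""
    # Imported lazily so this file can be read or imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code allows transformers to run model code shipped in the
    # repository; this is an assumption about how the checkpoint is packaged.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # Tokenize the prompt, generate new tokens, and decode back to text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Alibaba Cloud is")` would trigger the download on first use, so it is best run on a machine with enough disk space and memory for a 7 billion-parameter model.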

The Qwen models require Python 3.8, PyTorch 1.12 and CUDA 11.4, or later versions of each, to run.
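As an illustration, the stated minimums can be checked before downloading anything. This is a sketch only: the helper names are hypothetical, the version numbers are taken from this article, and the PyTorch and CUDA checks report `None` (unknown) when PyTorch is not installed or was built without CUDA.

```python
import re
import sys

# Minimum versions as reported in the article.
MINIMUMS = {"python": (3, 8), "torch": (1, 12), "cuda": (11, 4)}


def parse_version(text):
    """Turn a string such as '1.12.1+cu113' into (1, 12, 1)."""
    # Drop any local build suffix after '+', then collect the numeric parts.
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= minimum


def check_environment():
    """Map each component to True, False, or None when it cannot be checked."""
    report = {"python": sys.version_info[:2] >= MINIMUMS["python"]}
    try:
        import torch
    except ImportError:
        report["torch"] = None
        report["cuda"] = None
        return report
    report["torch"] = meets_minimum(torch.__version__, MINIMUMS["torch"])
    cuda = torch.version.cuda  # None on CPU-only builds of PyTorch
    report["cuda"] = meets_minimum(cuda, MINIMUMS["cuda"]) if cuda else None
    return report
```

Running `check_environment()` returns a small dictionary, so any `False` entry points to the component that needs upgrading.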

Alibaba’s move follows Meta’s open sourcing of Llama 2, which is also free for most commercial users except for hyperscalers.

How does Qwen-7B compare?

Alibaba Cloud measured the model's Chinese-language performance against other AI models using the C-Eval benchmark.

It found that Qwen-7B bested other AI models in terms of accuracy, including OpenAI’s ChatGPT, Vicuna from LMSYS Org and Baichuan-7B from the Chinese AI startup founded by former Sogou CEO Wang XiaoChuan.

Qwen-7B’s average performance on C-Eval was 59.6, the highest when compared with other Chinese AI models. By comparison, Baichuan-7B’s average score was just 42.8 and ChatGPT amassed a C-Eval performance of 54.4.

Alibaba Cloud also tested Qwen-7B using the MMLU benchmark for evaluating English comprehension abilities.

It scored 47.6 on STEM tests, 65.1 on social sciences and 51.5 on humanities. In comparison, Meta’s Llama 2 of the same size scored 36.4 on STEM, 51.2 on social sciences and 42.9 on humanities.

Qwen-7B also achieved competitive performance scores when compared against larger English-proficient AI models, with an average score of 56.7. The 13-billion-parameter version of Llama 2 scored 54.8 and the 12-billion-parameter ChatGLM2 scored 56.2.
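To put the MMLU figures side by side, the short sketch below computes Qwen-7B's margin over each rival. The scores are the ones reported above; the function and variable names are illustrative only.

```python
# MMLU category scores reported for the two 7 billion-parameter models.
QWEN_7B = {"STEM": 47.6, "social sciences": 65.1, "humanities": 51.5}
LLAMA2_7B = {"STEM": 36.4, "social sciences": 51.2, "humanities": 42.9}

# MMLU averages reported for Qwen-7B and the two larger models.
AVERAGES = {"Qwen-7B": 56.7, "Llama 2 13B": 54.8, "ChatGLM2 12B": 56.2}


def category_margins(model, rival):
    """Per-category lead of `model` over `rival`, rounded to one decimal."""
    return {name: round(model[name] - rival[name], 1) for name in model}


def average_lead(name, averages):
    """Lead of `name` over every other model's average, rounded to one decimal."""
    base = averages[name]
    return {
        other: round(base - score, 1)
        for other, score in averages.items()
        if other != name
    }


print(category_margins(QWEN_7B, LLAMA2_7B))
print(average_lead("Qwen-7B", AVERAGES))
```

On these numbers, Qwen-7B leads the same-size Llama 2 by 11.2 points on STEM, 13.9 on social sciences and 8.6 on humanities, while its average is 1.9 points ahead of the 13-billion-parameter Llama 2 and 0.5 ahead of ChatGLM2.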

The full results can be viewed on Qwen-7B’s Hugging Face page.

Chinese models galore

The Chinese e-commerce giant joins the growing number of companies looking to AI. Alibaba published a text-to-image model in June that can handle both Chinese and English inputs.

Alibaba Cloud also previously launched ModelScopeGPT, a framework designed to aid in AI tasks across language, vision, and speech domains through models available in its Model-as-a-Service (MaaS) platform, ModelScope.

Other Chinese companies looking to AI include Baidu, which released Ernie, its own conversational application akin to ChatGPT, to an underwhelming reception in March.

However, new AI services released in China may have to go through rigid rules proposed by the country’s infamous internet watchdog, including potentially forcing businesses to obtain a license to release generative AI models like Qwen.

About the Author(s)

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
