New Alibaba AI Model Excels at Math, Outperforms Competitors

Alibaba's Qwen2-Math model demonstrates superior performance in complex mathematical reasoning, challenging AI industry leaders

Ben Wodecki, Jr. Editor

August 21, 2024

Alibaba has developed a specialist language model for solving complex mathematical problems that outperforms flagship models from OpenAI and Anthropic.

Qwen2-Math is a math-specific version of Alibaba’s recently released Qwen2 model, built to solve arithmetic and mathematical problems.

The open-source model comes in three sizes, ranging from a small version with 1.5 billion parameters to a flagship with 72 billion.

All three models perform well on math-focused evaluations, with the flagship 72 billion version outperforming proprietary models such as GPT-4o and Claude 3.5 in math-related tasks.

“We hope that Qwen2-Math can contribute to the scientific community by solving advanced mathematical problems that require complex, multi-step logical reasoning,” Qwen2-Math’s GitHub repository reads.

Alibaba’s AI researchers wrote that they’ve spent the best part of a year “researching and enhancing the reasoning capabilities of large language models” to improve their ability to handle math problems.

AI researchers are increasingly focusing on mathematics as a key area of study to enhance model reasoning, believing that advancing a model’s ability to reason through queries methodically can significantly improve its cognitive capabilities.

For example, Mistral’s recently released Mathstral model solves mathematical problems using multi-step logical reasoning, while foundation-level systems like OpenAI’s GPT-4o and Meta’s Llama 3.1 405B boast improved math abilities.

Alibaba’s new specialist model outperforms advanced models, including foundation systems, on math-specific benchmarks such as MATH, MMLU-STEM and CMATH, as well as Chinese math benchmarks like GaoKao Math QA.

Alibaba also created Instruct versions of the new math model, which achieved state-of-the-art performance when compared with industry-leading models.

Currently, the Qwen2-Math line of models only supports English. However, Alibaba said it will develop bilingual versions to extend support to the Chinese language.

Alibaba said the math-specific models will also be enhanced over time to improve their ability to solve more challenging math problems.

The Qwen2-Math models can be accessed on GitHub and Hugging Face.

They use the same license applied to Alibaba’s general line of Qwen models, otherwise known as Tongyi Qianwen.

Users have a non-exclusive, worldwide, royalty-free license to use Qwen2-Math and can use it to power commercial applications, with the strict caveat that it cannot be applied to products or services with more than 100 million monthly active users.

Businesses that want to use the model but exceed that user threshold need to request a license from Alibaba Cloud.
