New Alibaba AI Model Excels at Math, Outperforms Competitors
Alibaba's Qwen2-Math model demonstrates superior performance in complex mathematical reasoning, challenging AI industry leaders
Alibaba has developed a specialist language model to solve complex mathematical problems, outperforming flagship models from OpenAI and Anthropic.
Qwen2-Math is a math-specific version of Alibaba’s recently released Qwen2 model, which is capable of solving arithmetic and mathematical problems.
The open-source model comes in three sizes, ranging from small with 1.5 billion parameters to standard language model sizes of 72 billion.
All three models perform well on math-focused evaluations, with the flagship 72 billion version outperforming proprietary models such as GPT-4o and Claude 3.5 in math-related tasks.
Credit: Alibaba
“We hope that Qwen2-Math can contribute to the scientific community by solving advanced mathematical problems that require complex, multi-step logical reasoning,” Qwen2-Math’s GitHub repository reads.
Alibaba’s AI researchers wrote that they’ve spent the best part of a year “researching and enhancing the reasoning capabilities of large language models” to improve their ability to handle math problems.
AI researchers are increasingly focusing on mathematics as a key area of study to enhance model reasoning, believing that by advancing a model’s ability to reason through queries methodically, it can significantly improve its cognitive capabilities.
For example, Mistral’s recently released Mathstral model solves mathematical problems using multi-step logical reasoning, while foundation-level systems like OpenAI’s GPT-4o and Meta’s Llama 3.1 405B boast improved math abilities.
Alibaba’s new specialist model outperforms advanced models, including foundational systems on math-specific benchmark tests, including Math, MMLU Stem and CMath as well as Chinese math benchmarks like GaoKao Math QA.
Alibaba even created Instruct versions of the new math model, which achieved state-of-the-art performance levels compared with industry-leading models.
Credit: Alibaba
Currently, the Qwen2-Math line of models only supports English. However, Alibaba said it will develop bilingual versions to extend support to the Chinese language.
Alibaba said the math-specific models will also be enhanced over time to improve their ability to solve more challenging math problems.
The Qwen2-Math models can be accessed on GitHub and Hugging Face.
It uses the same license applied to Alibaba’s general line of Qwen, otherwise known as Tongyi Qianwen, models.
Users have a non-exclusive, worldwide royalty-free license to use the Qwen2-Math and can use it to power commercial applications with the strict caveat that it cannot be applied to products or services with more than 100 million monthly active users.
Businesses that want to use the model but go above the user count need to request a license from Alibaba Cloud.
About the Author
You May Also Like