Mistral Launches AI Models for Localized Code Generation, Math Reasoning

Codestral Mamba can handle thousands of lines of code running on local devices, while Mathstral can solve complex math problems

Ben Wodecki, Jr. Editor

July 23, 2024

2 Min Read
Mistral AI logo on a smartphone in front of green binary code
SEBASTIEN BOZON/AFP via Getty Images

French AI startup Mistral has published two new specialist language models designed to improve code generation and math reasoning.

Codestral Mamba is a small model capable of generating code outputs quickly.

Despite being 7 billion parameters in size, the model can generate answers to code-related queries at speed, even when handling longer input texts.

Codestral Mamba can handle up to 256k tokens, equivalent to between 50,000 and 200,000 lines of code, though the exact figure depends on the programming language and coding style.
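As a rough back-of-the-envelope check (the tokens-per-line figures below are illustrative assumptions, not numbers from Mistral), the reported line-count range follows from dividing the 256k-token window by an assumed token density per line:

```python
# Rough estimate of how many lines of code fit in a 256k-token context window.
# Tokens-per-line densities are illustrative: terse code (short lines, common
# keywords) tokenizes to fewer tokens per line than verbose code.
CONTEXT_TOKENS = 256_000

def lines_that_fit(tokens_per_line: float, context: int = CONTEXT_TOKENS) -> int:
    """Approximate number of code lines that fit in the context window."""
    return int(context / tokens_per_line)

# Verbose code at ~5 tokens per line -> roughly 50,000 lines
# Terse code at ~1.3 tokens per line -> roughly 200,000 lines
print(lines_that_fit(5.12))   # 50000
print(lines_that_fit(1.28))   # 200000
```

The spread in the article's 50,000-200,000 estimate simply reflects how widely token density varies across languages and styles.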

“We expect it to be a great local code assistant,” Mistral said in its announcement. The model’s small size makes it well suited to local coding applications such as real-time code autocompletion, syntax error detection and personalized coding assistance.

In terms of performance, Codestral Mamba outperforms rival code-focused models such as Google’s CodeGemma, and even models almost five times its size, such as Meta’s CodeLlama.

It’s built on the Mamba architecture, which differs from the traditional Transformer architecture found in most language models.

Instead of using attention mechanisms, a Mamba-based model uses selective state space models (SSMs), enabling it to process sequences in linear time, meaning it can potentially handle much longer inputs.
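To illustrate the idea, here is a toy sketch of the linear recurrence at the heart of state space models. It is a simplification: real Mamba models make the parameters input-dependent (the “selective” part) and use hardware-aware scan algorithms, none of which is shown here.

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    """Minimal linear state-space recurrence:
        h_t = A @ h_{t-1} + B * x_t
        y_t = C @ h_t
    Each token triggers one fixed-size state update, so the whole pass is
    O(sequence_length). Attention, by contrast, compares every token with
    every other token, which costs O(sequence_length^2)."""
    d_state = A.shape[0]
    h = np.zeros(d_state)          # hidden state carried across the sequence
    ys = []
    for x in xs:                   # single linear pass over the input
        h = A @ h + B * x          # fold the new token into the state
        ys.append(float(C @ h))    # read out an output from the state
    return ys
```

Because the model only carries a fixed-size state forward, the memory and compute per token stay constant no matter how long the input grows, which is why such architectures scale to very long contexts.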


Codestral Mamba can be tested on Mistral’s la Plateforme alongside the larger Codestral 22B.

The code generation model is available under an Apache 2.0 license, so users can create their own proprietary software and offer the licensed code to customers. It can be downloaded from Hugging Face.

MathΣtral: Solving Complex Math

Mistral launched another AI model this week: MathΣtral, or Mathstral, which can handle advanced mathematical problems that require complex, multi-step logical reasoning.

The model, named in tribute to Archimedes, is designed to understand and solve complex math problems, making it a possible aid for academics and scientists.

Mathstral was developed in collaboration with Project Numina and achieves state-of-the-art reasoning across various benchmark tests, according to the company.

The model achieved scores of 56.6% on the MATH benchmark and 63.47% on the MMLU test. Mathstral’s scores increase further when it’s given more inference-time computation, the Microsoft-backed startup said.

“Mathstral is another example of the excellent performance/speed tradeoffs achieved when building models for specific purposes – a development philosophy we actively promote in la Plateforme, particularly with its new fine-tuning capabilities,” according to a Mistral announcement. 


The model can be fine-tuned to improve its performance for a specific area of math or science.

Mathstral’s model weights can be accessed on Hugging Face.

About the Author

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
