Japan's Fugaku Supercomputer Powers Local AI Model Training

Fujitsu leads Japanese research team exploring GPU-free ways to train large language models

Ben Wodecki, Jr. Editor

May 15, 2024


Fujitsu and a team of AI researchers have developed a large language model trained on the Fugaku supercomputer.

Recent Omdia research found that Asian companies want generative AI solutions that align with local languages and values. The new Fugaku-LLM model was designed as a counter to U.S.-dominated language model development.

The Japanese researchers contended that the country needs computational resources for AI research as big tech companies snap up GPUs. Instead, they turned to Japan's most powerful computing asset to assist their efforts: Fugaku.

Housed in Kobe, Fugaku was once the world’s most powerful supercomputer before it was superseded by Frontier.

Unlike most leading supercomputers, Fugaku does not run on GPUs. Instead, it uses Fujitsu’s A64FX microprocessor, an Arm-based CPU. In total, Fugaku has nearly 160,000 of those CPUs, totaling some 7.6 million cores.
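Those headline figures are consistent with the A64FX’s 48 compute cores per chip. A quick back-of-the-envelope check in Python, using Fugaku’s published node count (a figure not stated in the article):

```python
# Back-of-the-envelope check of Fugaku's core count.
# 158,976 is Fugaku's published A64FX node count; 48 is the
# A64FX's compute-core count per chip.
cpus = 158_976
cores_per_cpu = 48
total_cores = cpus * cores_per_cpu
print(total_cores)  # 7,630,848 -> the roughly 7.6 million cores cited
```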

The supercomputer was used to train Fugaku-LLM, an open source AI model capable of handling both English and Japanese text.

Fugaku-LLM was trained predominantly on Japanese text, with a smaller share of English data. The resulting model stands at 13 billion parameters, larger than the 7 billion parameter models that have seen wider deployment in Japan, such as Alibaba Cloud's Qwen-7B and RakutenAI-7B.

The new model achieved the highest performance on the Japanese MT-Bench among open source models trained on Japanese data.


The model is also touted as “highly transparent” as it was built from scratch using the research team’s own data, rather than fine-tuning an existing open source model like Meta’s Llama 3.

Fugaku-LLM can also conduct natural dialogue in keigo, the Japanese system of honorific speech used when addressing elders or social superiors.

The new model can be used to develop commercial and non-commercial applications. It can be found on Hugging Face.

Users must agree to the terms of use, which include stipulations around properly managing licenses when redistributing.

They are also fully responsible for managing all legal and ethical issues arising from their use of the model, with no warranties from the developers regarding performance or accuracy.

Fugaku-LLM is also available via Samba-1, SambaNova’s enterprise-optimized generative AI platform.

Satoshi Matsuoka, director of the RIKEN Center for Computational Science, said the model’s inclusion on SambaNova’s platform would “make the achievements of Fugaku available to many people.”

Using a Supercomputer to Train a Language Model

Fujitsu partnered with the RIKEN Center for Computational Science, which houses Fugaku, as well as the Tokyo Institute of Technology, Tohoku University and CyberAgent, among others, to develop the model.


To train a language model on the supercomputer, the researchers developed distributed training methods that optimize the performance of Transformer models on a large-scale computing system like Fugaku.

Training consumed 380 billion tokens of text, math and code data, all processed on Fugaku.
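As a rough sanity check on that budget, the article’s own figures imply about 29 training tokens per model parameter, in the neighborhood of the roughly 20-tokens-per-parameter “compute-optimal” heuristic from DeepMind’s Chinchilla work (the comparison is illustrative and not from the announcement):

```python
# Token-to-parameter ratio for Fugaku-LLM, from the figures
# cited in the article (illustrative arithmetic only).
tokens = 380e9   # training tokens
params = 13e9    # model parameters
ratio = tokens / params
print(round(ratio, 1))  # 29.2 tokens per parameter
```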

The techniques employed included Megatron-DeepSpeed, an open source framework developed by Microsoft for training large Transformer models efficiently at scale.
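Megatron-DeepSpeed runs are typically driven by a DeepSpeed JSON configuration. A minimal illustrative configuration, expressed here as a Python dict with hypothetical values (these are not the Fugaku-LLM team’s actual settings), might look like:

```python
# Illustrative DeepSpeed-style training configuration.
# All values are hypothetical placeholders, not the settings
# used for Fugaku-LLM.
ds_config = {
    "train_batch_size": 1024,           # global batch across all data-parallel workers
    "gradient_accumulation_steps": 8,   # micro-batches accumulated per optimizer step
    "fp16": {"enabled": True},          # mixed-precision training
    "zero_optimization": {"stage": 1},  # ZeRO stage 1: shard optimizer state
}
print(ds_config["zero_optimization"]["stage"])
```

ZeRO stage 1 shards optimizer state across data-parallel workers, one of the memory-saving techniques DeepSpeed provides; the batch size and precision settings above are placeholders only.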

The researchers found that using Fugaku increased computation speed during training sixfold compared with traditional large language model training.

“The knowledge gained from these efforts can be utilized in the design of the next-generation computing infrastructure after Fugaku and will greatly enhance Japan's future advantage in the field of AI,” a Fujitsu announcement read.

In related Fugaku news, the supercomputer retained its spot as the world’s fourth most powerful supercomputer on the latest TOP500 list, published this week.

Fugaku had held the No. 1 spot from June 2020 until November 2021 but has since slipped down the rankings as rival systems including Frontier, Aurora and Microsoft Azure's Eagle overtook it.

Even in fourth place, Fugaku remains the highest-ranked supercomputing system outside the U.S.


About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
