AI Business is part of the Informa Tech Division of Informa PLC


The 5 petaflops Nvidia DGX A100 hopes to run your AI workloads

by Sebastian Moss

While HGX will do it for the cloud

What do you get if you take eight of Nvidia’s new A100 GPUs, two 64-core AMD Rome CPUs, six NVSwitches, 15TB of Gen 4 NVMe SSD storage, and nine Mellanox 200Gbps network interfaces, and package them all together?

Well, a bill for $199,000 - but also a lot of AI performance. Nvidia’s latest DGX reference architecture is the company’s preferred approach to shipping its highest-performance chips.

The DGX A100, as the most recent iteration is named, is capable of five petaflops of FP16 performance, 2.5 petaflops at TF32, and 156 teraflops at FP64. With INT8 it reaches 10 petaops (operations, not flops).
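Those system-level figures line up with eight GPUs' worth of Nvidia's published per-A100 peaks. A quick sanity check - note the per-GPU numbers below are an assumption drawn from Nvidia's A100 datasheet (Tensor Core rates with structural sparsity, where applicable), not from this article:

```python
# Sketch: deriving the DGX A100's system figures from per-GPU peaks.
# Per-GPU values assumed from Nvidia's published A100 specs, not the article.
GPUS = 8

per_gpu_tflops = {
    "FP16 (Tensor Core, sparsity)": 624,
    "TF32 (Tensor Core, sparsity)": 312,
    "FP64": 19.5,
    "INT8 TOPS (Tensor Core, sparsity)": 1248,
}

for name, peak in per_gpu_tflops.items():
    # 1000 TFLOPS = 1 petaflop (or 1000 TOPS = 1 petaop)
    print(f"{name}: {GPUS * peak / 1000:.3g} peta")
```

Eight times 624 TFLOPS gives roughly 5 petaflops of FP16, eight times 312 gives 2.5 petaflops of TF32, eight times 19.5 gives 156 teraflops of FP64, and eight times 1,248 TOPS gives roughly 10 petaops of INT8.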

AI ready

“Nvidia DGX A100 is the ultimate instrument for advancing AI,” Jensen Huang, the ebullient company CEO, said as he unveiled the product during Nvidia’s now-virtual GTC.

“Nvidia DGX is the first AI system built for the end-to-end machine learning workflow - from data analytics to training to inference. And with the giant performance leap of the new DGX, machine learning engineers can stay ahead of the exponentially growing size of AI models and data.”

Among the first customers of the DGX, which has 320GB of memory for training large AI datasets, is the Argonne National Laboratory. Rick Stevens, associate laboratory director at the Department of Energy facility, said that the system would be used “in the fight against COVID-19.”

He added: “The compute power of the new DGX A100 systems coming to Argonne will help researchers explore treatments and vaccines and study the spread of the virus, enabling scientists to do years’ worth of AI-accelerated work in months or days.”

Nvidia has also released a version of the DGX on steroids: the DGX SuperPOD reference architecture. It's 140 DGX A100 systems all clustered together, capable of 700 petaflops of 'AI computing power.'

So far, the SuperPOD has just one customer: Nvidia. The company plans to install four of the pods as part of its internal Saturn V supercomputer, adding 2.8 exaflops of AI computing power, for a total of 4.6 exaflops. 
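Spelled out, the cluster arithmetic works as the article states, using only figures given above (5 petaflops per DGX, 140 systems per SuperPOD, four pods for Saturn V):

```python
# The SuperPOD arithmetic from the article, checked in Python.
PF_PER_DGX = 5      # petaflops of "AI computing power" per DGX A100
DGX_PER_POD = 140   # DGX A100 systems per SuperPOD

pod_pf = PF_PER_DGX * DGX_PER_POD    # petaflops per SuperPOD
added_ef = 4 * pod_pf / 1000         # exaflops added by four pods

print(pod_pf, added_ef)  # 700 2.8
```

Four pods thus add 2.8 exaflops; the stated 4.6 exaflop total implies Saturn V already had about 1.8 exaflops of AI capacity before the upgrade.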

For cloud computing companies like Amazon Web Services, Google, and Microsoft Azure, there’s a slightly smaller option: The HGX A100.

It will feature four A100s, instead of the DGX’s eight.

Moving further down the power scale is the EGX A100, with just one GPU and a Mellanox ConnectX-6 SmartNIC, targeting the edge market.
