August 3, 2020
As measured by the MLPerf benchmark consortium
Google has used 4,096 Tensor Processing Units (TPUs) to build a supercomputer that it claims outperforms any other AI training system in operation today.
A series of benchmarks run by the MLPerf consortium supported the claim, but also gave Nvidia top marks for chip performance.
Build it and they will program
Google's TPUs cannot be purchased, only rented via Google Cloud. The record-breaking machine is available only internally, and uses the fourth-generation TPU design, which has yet to be deployed in Google's cloud data centers. Due to the lack of wider availability, MLPerf ranks both as research projects.
Nvidia's GPUs, meanwhile, can be bought by anyone, so they are categorized as commercial. Each company dominated its respective category.
Google said that its system delivers over 430 petaflops of peak AI performance, and also features hundreds of CPU host machines, connected via an ultra-fast, ultra-large-scale custom interconnect.
"Training complex ML models using thousands of TPU chips required a combination of algorithmic techniques and optimizations in TensorFlow, JAX, Lingvo, and XLA," Google AI's Naveen Kumar said.
Nvidia, meanwhile, highlighted that its A100 GPUs outperformed Google's third-generation TPUs in some benchmarks. The company sells supercomputers-in-a-box called SuperPODs that can feature up to 2,048 A100 chips.
The A100 outperformed its predecessor, the V100, by 1.5-2.5x depending on the benchmark.
Some AI chip startups declined to take part in the competition, including Cerebras and Graphcore.
“We were the only company to submit across all benchmarks with available systems,” Paresh Kharya, senior director of product management, data center computing at Nvidia, said.