AI Business is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 3099067.


AWS reveals Trainium chip and adds Intel’s Habana to its machine learning zoo

by Louis Stone

The chip wars are well underway

Amazon Web Services has debuted a custom chip created specifically for machine learning model training, called Trainium.

Launching next year, the hardware will be joined in the cloud by another AI-focused chip, Intel’s Habana Gaudi processor.

Machine learning as far as the eye can see

Trainium complements Amazon’s existing Inferentia chip, which handles the less computationally intensive inference workloads required to run machine learning models.

The new chip focuses on the more difficult task of training the model, something that has typically been handled by GPUs. Over the last few years, a number of new processors have emerged to try to dethrone the GPU as a tool for machine learning, notably Google’s tensor processing units (TPUs), although those are available only through Google’s own cloud service.

Amazon claimed that Trainium will offer the most teraflops of any machine learning instance in the cloud at the lowest cost, but did not provide any benchmarks.

By the time the chip launches as EC2 instances, sometime in the second half of 2021, Nvidia and AMD will likely have updated their GPU lineups, while Google may have new TPUs out. It is not clear whether Trainium will be able to maintain its claim.

Then there’s Habana. Intel acquired the company in late 2019 for $2bn, and immediately killed off the Nervana AI chips it had previously developed.

Intel claims that Gaudi accelerators deliver up to 40 percent better price-performance than current GPU-based EC2 instances for machine learning workloads, but again specific benchmarks have not been published.
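For readers weighing the "up to 40 percent" figure, the short sketch below shows what it would mean for a training bill under one reading of the claim. It assumes that "price-performance" means work done per dollar, which the article does not confirm, and the function name is purely illustrative.

```python
# Assumption: "40 percent better price-performance" means 40% more work per dollar.
# The claim is unverified; no benchmarks have been published.
CLAIMED_GAIN = 0.40


def relative_cost_for_same_work(price_performance_gain: float) -> float:
    """Cost relative to the baseline for the same amount of work."""
    if price_performance_gain <= -1.0:
        raise ValueError("gain must be greater than -1")
    return 1.0 / (1.0 + price_performance_gain)


relative_cost = relative_cost_for_same_work(CLAIMED_GAIN)
savings = 1.0 - relative_cost

print(f"Relative cost: {relative_cost:.1%}, savings: {savings:.1%}")
```

Under that reading, the same training job would cost roughly 71 percent of the GPU baseline, a saving of just under 29 percent rather than 40 percent.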

The Gaudi-based EC2 instances, slated for the first half of 2021, will feature up to eight Gaudi accelerators each. An eight-card instance can process about 12,000 images per second while training the ResNet-50 model on TensorFlow, Intel claims.
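To put Intel's throughput figure in perspective, the rough calculation below breaks it down per card and estimates the time for one pass over a dataset. The dataset size is an assumption (the ImageNet-1k training set, commonly used with ResNet-50); the article does not say which dataset Intel used, and the even split across cards is also assumed.

```python
# Figures quoted in the article (Intel's claim, not an independent benchmark).
CLAIMED_IMAGES_PER_SECOND = 12_000  # eight-card instance, ResNet-50 on TensorFlow
CARDS_PER_INSTANCE = 8

# Assumption: ImageNet-1k training set size; the article does not name the dataset.
ASSUMED_DATASET_IMAGES = 1_281_167


def per_card_throughput(total_images_per_second: float, cards: int) -> float:
    """Average throughput per card, assuming the total is split evenly."""
    return total_images_per_second / cards


def seconds_per_epoch(dataset_images: int, images_per_second: float) -> float:
    """Time for one pass over the dataset at a constant throughput."""
    return dataset_images / images_per_second


per_card = per_card_throughput(CLAIMED_IMAGES_PER_SECOND, CARDS_PER_INSTANCE)
epoch_seconds = seconds_per_epoch(ASSUMED_DATASET_IMAGES, CLAIMED_IMAGES_PER_SECOND)

print(f"Per card: {per_card:.0f} images/s")
print(f"One epoch: {epoch_seconds:.0f} s")
```

On these assumptions, each card would handle about 1,500 images per second, and a single epoch would take a little under two minutes.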

“We are proud that AWS has chosen Habana Gaudi processors for its forthcoming EC2 training instances,” said David Dahan, chief executive officer at Habana.

“The Habana team looks forward to our continued collaboration with AWS to deliver on a roadmap that will provide customers with continuity and advances over time.”

Neither AWS nor Intel has commented on which of the two processors offers the better price-performance ratio.
