August 30, 2023
Tesla has switched on a supercomputing cluster of some 10,000 Nvidia H100 AI chips as the company looks to ramp up the development of its full self-driving technology.
Tim Zaman, an AI infrastructure engineering manager at Tesla and X (formerly Twitter), confirmed the news, saying the cluster went live earlier this week.
The cluster is hosted entirely on-premises and is owned by Tesla, Zaman confirmed. Tesla engineers designed their own storage network architecture for the cluster, which uses a separate fabric for storage. “Literally a physically independent storage fabric only way to keep sane,” Zaman said on Twitter.
The cluster will be used to support the training of full self-driving technology. CEO Elon Musk wants Tesla vehicles to move away from lidar and rely instead on optical cameras.
The cluster comprises 10,000 H100 GPUs from Nvidia. Each chip costs around $30,000, putting the cost of the GPUs alone at around $300 million.
The new cluster will be used to accelerate the company’s training efforts as Tesla’s AI supercomputer project, Dojo, slowly comes online. Dojo was first showcased in 2021; Tesla announced that the supercomputer came online in June and that training on neural networks began in July.
Musk recently revealed the automaker has spent more than $2 billion on AI training in 2023 and plans to spend a further $2 billion in 2024 on computing infrastructure for its full self-driving tech.