November 15, 2023
At a Glance
- Microsoft unveiled two new chips, including one built specifically for demanding AI workloads, to gain more control over its infrastructure stack.
- Maia is a new line of AI accelerators while Cobalt is its Arm-based CPU meant for general purpose cloud workloads.
- Microsoft said developing its own chips lets it control the cost and performance of its cloud hardware stack.
Microsoft today unveiled its first two custom, in-house chips, including an AI accelerator designed specifically for large language models. The tech giant said developing its own chips would let it “offer more choice in price and performance for its customers.”
At the company’s Ignite event, CEO Satya Nadella showed off Maia, its first internally developed AI accelerator chip, and Cobalt, its first custom, in-house CPU meant for general purpose cloud workloads. Both chips are set to be available to customers in 2024.
Alan Priestley, vice president analyst at Gartner, said it makes sense for Microsoft to join other hyperscalers who have developed their own AI chips. "Deploying large scale infrastructure to host large language models like ChatGPT is expensive and hyperscalers like Microsoft can leverage their own custom-designed chips, optimized for these applications to lower operational costs – reducing cost to consumers and businesses that want to use these large language models."
Maia, the AI accelerator
The Maia 100 AI Accelerator is designed to power internal AI workloads running on Azure. Microsoft enlisted the help of OpenAI, its strategic partner and maker of ChatGPT, to provide feedback on how its large language models would run on the new hardware.
Sam Altman, CEO of OpenAI, said in a blog post: “We were excited when Microsoft first shared their designs for the Maia chip, and we’ve worked together to refine and test it with our models.”
Microsoft had to build racks specifically for the Maia 100 server boards. These racks are wider than those that typically sit in the company’s data centers. The company claims that the expanded design “provides ample space for both power and networking cables, essential for the unique demands of AI workloads.”
Next to the Maia racks sit “sidekicks” that circulate chilled liquid to cold plates attached to the surface of the Maia 100 chips, carrying away heat.
"We've designed Maia 100 as an end-to-end rack for AI," Nadella said at the event. "AI power demands require infrastructure that is dramatically different from other clouds. The compute workloads require a lot more cooling as well as network density."
Microsoft is already working on the next generation of Maia AI chips. Pat Stemen, partner program manager on the Microsoft AHSI team, said in a blog post: “Microsoft innovation is going further down in the stack with this silicon work to ensure the future of our customers’ workloads on Azure, prioritizing performance, power efficiency and cost.”
Cobalt CPUs to power general purpose workloads
Cobalt CPUs are built on the Arm architecture and are optimized for greater efficiency and performance in cloud-native offerings. The chips are already powering servers inside Microsoft’s data center in Quincy, Washington. Each chip has 128 cores and is designed for low power consumption.
The company is using Cobalt for general-purpose compute workloads, like Microsoft Teams and SQL Server, but is also planning to expand its scope to virtual machine applications. At Ignite, Microsoft highlighted virtual machines built on AMD hardware that are optimized for AI workloads: the Azure ND MI300x v5 virtual machine features AMD’s Instinct MI300X accelerator and is designed to support enterprise AI innovation, including AI model training and generative inferencing.
The goal of making custom chips
Rani Borkar, corporate vice president for Azure Hardware Systems and Infrastructure (AHSI), said in a blog post that “the end goal is an Azure hardware system that offers maximum flexibility and can also be optimized for power, performance, sustainability or cost."
AI workloads can be expensive to run. Building its own custom chips lets Microsoft ensure they perform optimally on its most important workloads, testing them under different frequency, temperature, and power conditions. “By controlling every facet – from the low-power ethos of the Cobalt 100 chip to the intricacies of data center cooling – Microsoft can orchestrate a harmonious interplay between each component,” the company said.
Microsoft already builds its own servers and racks to drive down costs and give customers a “consistent” experience. Chips were the final missing piece. Prior to 2016, Microsoft had bought most layers of its cloud hardware off the shelf.