IBM's new AI chip NorthPole keeps all memory on chip, making it more energy efficient and lowering latency compared to other chips

Ben Wodecki, Deborah Yao

October 26, 2023


At a Glance

  • IBM designed a new AI chip patterned after the human brain, called NorthPole.
  • It is 25 times more energy efficient and has 22 times lower latency than a comparable 12-nm GPU on the ResNet50 benchmark.
  • But it cannot access external memory, so it supports larger neural networks by breaking them down into smaller sub-networks.

IBM is turning to the human brain for inspiration in designing its AI hardware, unveiling a new chip that it said has better latency and is more energy efficient than existing GPUs.

The tech giant’s 12-nm NorthPole chip boasts a new neural network hardware architecture optimized for neural inference tasks such as image classification and object detection.

IBM said the chip is 25 times more energy efficient and has 22 times lower latency than a comparable GPU on the ResNet50 benchmark. Its performance is documented in a paper just published in Science.

NorthPole has 22 billion transistors and extensive on-chip memory, meaning it can store data and run computations entirely on-chip without accessing external memory, further improving speed and efficiency.

“One of the biggest differences with NorthPole is that all of the memory for the device is on the chip itself, rather than connected separately,” according to an IBM blog post. The human brain is similarly self-contained.

By putting all of the memory on the chip, NorthPole does not have to continuously shuttle data back and forth between memory, processing units and other components, a delay known as the von Neumann bottleneck.

“It’s an entire network on a chip,” said Dharmendra Modha, IBM’s chief scientist for brain-inspired computing who developed the chip along with his team. He said NorthPole outperforms even chips made with more advanced processes, such as 4-nm GPUs.


IBM is also looking to iterate on NorthPole, for example by experimenting with cutting-edge 2-nm nodes. The current state-of-the-art process node is 3 nm.

The strength is a weakness, too

However, NorthPole cannot access external memory. It supports larger neural networks by “breaking them down into smaller sub-networks” that fit its on-chip memory, then connecting those sub-networks across several NorthPole chips, a technique IBM calls “scale-out.”
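The article does not detail IBM's partitioning scheme, but the general idea of splitting a model too large for one chip's on-chip memory can be sketched roughly. Everything here is hypothetical for illustration: the function name, the layer-wise split and the per-chip memory budget are assumptions, not IBM's actual method.

```python
# Hypothetical sketch: group a network's consecutive layers into
# sub-networks, each small enough to fit one chip's on-chip memory.
def partition_layers(layer_sizes_mb, chip_memory_mb):
    """Greedily pack consecutive layers under a per-chip memory budget."""
    chips, current, used = [], [], 0
    for size in layer_sizes_mb:
        if size > chip_memory_mb:
            raise ValueError("a single layer exceeds one chip's memory")
        if used + size > chip_memory_mb:  # budget full: start a new chip
            chips.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        chips.append(current)
    return chips

# A six-layer model against a hypothetical 100 MB-per-chip budget:
print(partition_layers([60, 30, 50, 40, 20, 70], 100))
# [[60, 30], [50, 40], [20, 70]]  -> three chips, activations passed between them
```

In a real deployment the chips would also need to forward activations from each sub-network to the next, which is what linking multiple NorthPole chips provides.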

“We can’t run GPT-4 on this, but we could serve many of the models enterprises need,” Modha said. Also, the chip is “only for inferencing.”

On the other hand, NorthPole could be “well-suited” for edge applications that need to process massive amounts of data in real time, like those in autonomous vehicles.

In designing the chip, Modha and his team drew inspiration from the human brain. NorthPole’s networks-on-chip (NoCs) interconnect the cores, unifying and distributing computation and memory, which IBM’s researchers liken to the long-distance white-matter and short-distance gray-matter pathways in the brain.

There is also dense local interconnection that lets neural activations flow between nearby cores, akin to the brain's local connectivity between neurons in nearby cortical regions.


IBM also tried to mimic the precision of the brain’s synapses, opting for lower bit precisions (two to four bits) than traditional GPUs, which typically use eight to 16 bits. This choice drastically reduces memory and power needs.
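To illustrate why fewer bits save memory, here is a generic uniform-quantization sketch, not IBM's actual scheme: floats are mapped onto 4-bit signed integers sharing one scale factor, so each weight needs 4 bits instead of 32.

```python
def quantize_4bit(weights):
    """Map floats onto 4-bit signed integers in [-8, 7] with one shared scale."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximate the original floats from the 4-bit codes."""
    return [v * scale for v in q]

w = [0.9, -0.31, 0.07, -0.63]
q, s = quantize_4bit(w)
print(q)  # [7, -2, 1, -5], each representable in 4 bits: an 8x saving over float32
```

Inference then runs on the small integer codes, trading a little accuracy for large memory and power savings, which is the trade-off the researchers describe.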

What’s next for NorthPole

It is still early days for NorthPole, with IBM planning further research. But the tech giant is already exploring possible applications for the chip.

During testing, NorthPole was largely applied to computer vision use cases, since funding for the project came from the U.S. Department of Defense. These included object detection, image segmentation and video classification.

But the chip was also trialed elsewhere, including for natural language processing and speech recognition.

The hardware team behind it is currently exploring how to map decoder-only large language models onto NorthPole scale-out systems.


About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.

Deborah Yao

Editor

Deborah Yao runs the day-to-day operations of AI Business. She is a Stanford grad who has worked at Amazon, Wharton School and Associated Press.
