SambaNova CEO: New AI Chip Brings Performance and Cost Savings

SambaNova CEO touts performance improvements on its new SN40L compared to Nvidia’s H100 chips

Ben Wodecki, Jr. Editor

September 26, 2023


SambaNova Systems has unveiled its latest AI chip, the SN40L – purpose-built to power large language models.

The SN40L will power the company’s full-stack platform, the SambaNova Suite, which is used to optimize generative AI models for the enterprise either on-prem or via the cloud.

The SN40L is manufactured by TSMC and can serve a model of up to 5 trillion parameters, with sequence lengths of more than 256,000 possible on a single system node.
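For a rough sense of what 5 trillion parameters means in memory terms, here is a back-of-the-envelope sketch. It is not from SambaNova: it assumes 16-bit weights (2 bytes per parameter), counts the weights alone (no activations or attention caches), and uses the 80 GB of on-package memory on an Nvidia H100 as the comparison point. The function names are illustrative only.

```python
import math


def weight_memory_tb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the model weights alone, in terabytes (1 TB = 1e12 bytes).

    bytes_per_param=2 is an assumption (16-bit weights), not a figure from the article.
    """
    return num_params * bytes_per_param / 1e12


def accelerators_for_weights(
    num_params: float,
    bytes_per_param: int = 2,
    mem_per_device_gb: float = 80.0,  # assumed: 80 GB per device, as on an H100
) -> int:
    """Minimum number of devices whose combined memory can hold the weights."""
    total_bytes = num_params * bytes_per_param
    return math.ceil(total_bytes / (mem_per_device_gb * 1e9))


params = 5e12  # 5 trillion parameters, the figure quoted for the SN40L
print(f"Weights: {weight_memory_tb(params):.1f} TB")
print(f"80 GB devices needed for weights alone: {accelerators_for_weights(params)}")
```

Under those assumptions the weights alone come to about 10 TB, or at least 125 devices with 80 GB each before any working memory is counted. Lower-precision weights would shrink both numbers proportionally.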

The company said the chip is designed for higher-quality models with faster inference and training at a lower total cost of ownership.

With the launch of the SN40L, SambaNova is aiming to compete with giants like Nvidia and AMD amid the ever-growing demand for AI chips. AI Business spoke with SambaNova CEO Rodrigo Liang to see how the SN40L will stack up against the growing competition.

The SN40L can serve up to 5 trillion parameters. What's it like on smaller, more domain-specific models that businesses are increasingly looking to use?

Rodrigo Liang: With the SN40L and SambaNova Suite, our aim is to reverse this trend of fragmentation and forced use of smaller, more domain-specific models. Enterprises don’t want to sacrifice performance and accuracy, but the hardware restrictions and availability have – to this point – been prohibitive enough to make that their reality.


The platform powered by the new chip assembles each of these smaller expert architectures that companies are investing in into one large trillion-parameter model. The overlap of this domain expertise gives companies the highest accuracy, and greatest breadth of knowledge, as well as increased security and control, alongside all the benefits of smaller models.

Bigger is better. This trend towards domain-specificity was driven by a lack of cost-effectiveness when scaling LLMs – that’s why we’ve gone down to the architecture to build a chip that addresses this issue. Organizations have been forced to stack hundreds of chips together and, failing that, have broken down their LLM approach into these smaller models. With the SN40L, they won’t have to.

You've gone for a full-stack approach here. Talk us through how this sets you apart.

What sets us apart is our expertise right down the stack. We bring together a diverse and credible team including world-leading hardware and software engineers, tenured professors, and machine learning experts to create our solution. By bringing together the hardware, software, and everything in between, our enterprise clients can deploy high-performance language models horizontally across their organizations and get the right results.

The SambaNova Suite significantly reduces our customers’ time, money, and talent requirements, because we provide pretrained models, optimize them on our users’ hardware, and create an entire process for data ingestion, inference, and so on. This expertise we provide along with our hardware would take hundreds of people for our customers to do in-house. With us, they’re up and running in days.

Have you got what it takes to challenge Nvidia and others in a market that’s becoming increasingly crowded?

At SambaNova, we see AI as an asset – not just a tool. Our models can enhance every business workflow, appreciating and gaining knowledge the more they’re used. When you rent AI, treating it as a tool, you may receive some short-term efficiency gains. However, when you look at it as an asset, investing in it, and training it with your organization’s store of unstructured data, the long-term potential and true benefits of AI become clear: that’s what SambaNova is offering.

It's important to note that AI capability is more about memory than throughput. In this regard, no one can match the memory capability of the SN40L. LLMs running on the popular H100 chip suffer significant performance drop-offs as the parameter count scales. In contrast, the SN40L's inference performance holds up consistently to 5 trillion parameters.

What's more, the SN40L is available now, while Nvidia's GH200 Grace Hopper isn't expected to start shipping with services until the second half of 2024.

When does the SN40L start shipping, and can you give us a rough idea of the types of clients where it’s going first?

The SN40L is available in the cloud immediately, and we've got some on-prem chips available now, with broad shipping starting in November. We're collaborating with a variety of global businesses, many of which have a presence in the U.K.


