AI in the data center: The role of FPGAs

Conventional processors weren’t designed with today’s cutting-edge applications in mind. The FPGA has the power and adaptability to tackle the challenges facing AI

Max Smolaks

October 18, 2021


Harnessing the power of versatility

Field-programmable gate arrays (FPGAs) are a class of silicon devices that can be configured by the end-user to serve a variety of purposes.

Both general-purpose processors, such as CPUs and GPUs, and the application-specific integrated circuits (ASICs) that power much of modern electronics have their capabilities permanently etched into silicon at the point of manufacture.

FPGAs, by contrast, feature programmable logic blocks that customers can configure after manufacturing to solve virtually any specific, computable problem.

Thanks to their flexible nature, FPGAs have found applications in countless industries, including (but not limited to) defense, manufacturing, supercomputing, telecommunications, and healthcare.

More recently, they have been positioned as the answer to the challenges of running artificial intelligence (AI) at scale.

Introduction to FPGAs

Within the data center, FPGAs are particularly suitable for building hardware accelerators.

You can think of hardware acceleration as a division of labor: a chip designed from the ground up for a specific task will be faster and more efficient at that task than a general-purpose processor that has to handle countless other things at the same time.

FPGAs are also widely used for hardware emulation, enabling chip designers to prototype new ASICs and see the results of their work in action before having to commit to an expensive manufacturing process.

The first commercially viable FPGA was introduced in 1985 by programmable chip specialist Xilinx, headquartered in California.

The company has since built a global business by simplifying the creation of integrated circuits for specialized markets.

In the early nineties, its customers included the likes of HP, Apple, and Sun.

Today, AWS, IBM, Microsoft, and Alibaba all offer server instances equipped with Xilinx FPGAs as part of their public cloud services.

At the end of 2020, semiconductor giant AMD announced it would purchase Xilinx in an all-stock deal worth $35 billion, hoping to boost its own enterprise computing credentials.

“Joining together with AMD will help accelerate growth in our data center business and enable us to pursue a broader customer base across more markets,” Victor Peng, CEO of Xilinx, said at the time.

As it changes ownership, Xilinx remains locked in competition with historic rival Altera, which was founded in 1983, enjoyed similar success with FPGAs, and was purchased by Intel for $16.75 billion in 2015 – at the time Intel’s largest acquisition.

The two companies dominate the programmable logic device market, with Xilinx leading by market share; other notable FPGA vendors include Lattice, Microchip, BittWare, Achronix, and Texas Instruments.

Adapting to AI

FPGAs started infiltrating the corporate data center about a decade ago, as businesses were looking to widen the bottlenecks in their networking, compute, and storage. AI workloads weren’t initially a target, and yet they emerged as one of the most promising use cases for programmable silicon.

Most interactions with ML models are divided into two stages: training and inference.

During training, carefully curated data is fed to the model, and its internal variables are adjusted until it produces the desired predictions.

This process can be very expensive and time-consuming: Lambda Labs estimated that training OpenAI’s GPT-3, the largest language model ever made, would cost around $4,600,000 in compute resources – when using the lowest-priced cloud GPU on the market.

While FPGAs have plenty of uses in training, it is inference – the process of applying what the model has learned, for example recognizing an object in an image – where they truly shine.
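The distinction between the two stages can be made concrete with a toy example (not FPGA-specific, and far simpler than any real neural network): training repeatedly adjusts a variable to reduce prediction error, while inference is a single cheap pass with the learned value frozen.

```cpp
#include <cstddef>

// Toy illustration of training vs. inference: fit y = w*x to data
// by gradient descent, then apply the learned weight w.
double train_weight(const double* xs, const double* ys, std::size_t n,
                    int epochs = 200, double lr = 0.05) {
    double w = 0.0;                            // the single adjustable variable
    for (int e = 0; e < epochs; ++e) {
        double grad = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            double err = w * xs[i] - ys[i];    // prediction error on sample i
            grad += 2.0 * err * xs[i];         // derivative of squared error w.r.t. w
        }
        w -= lr * grad / static_cast<double>(n);  // nudge w toward lower error
    }
    return w;
}

// Inference: one multiply with the frozen weight - no adjustment, no loop.
double infer(double w, double x) { return w * x; }
```

Real models have billions of such variables rather than one, which is why training runs on large GPU clusters while the much lighter per-request inference step is the natural target for hardware acceleration.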

This often customer-facing side of AI stands to benefit considerably from hardware acceleration, with FPGAs promising both higher throughput and lower latency than either CPUs or GPUs.

And the prominence of commercial products that require massive amounts of AI compute to remain viable is only going to increase: In 2018, the worldwide market for AI software totaled $10.1 billion, according to Omdia; by 2025, it is expected to reach nearly $100 billion.

Going mainstream

To enable the growing number of use cases in the data center, FPGA vendors have started to produce server-friendly hardware products that plug into PCIe slots and often occupy as much space as a GPU.

For Xilinx, the flagships are the Alveo accelerator cards and the Versal adaptive compute acceleration platform (ACAP).

The former is a range of five FPGAs suitable for acceleration of common workloads ranging from network analytics to genomics.

The latter combines three different types of compute engines to create devices that dramatically outperform Xilinx’s best ‘pure-play’ FPGAs.

Years in the making, with a $1 billion R&D budget, ACAP is the company’s answer to the problem of diminishing per-core performance gains.

It follows the idea that adding different kinds of cores, rather than simply more of the same, yields a greater overall performance increase.

ACAP is once again positioned as a platform for AI, and inference in particular – where the tight coupling between vector processing and programmable hardware is invaluable.

Meanwhile, a new generation of developer software platforms has been making FPGAs much easier to configure.

These devices have long been criticized for the sheer complexity of their configuration process, which was originally intended for hardware engineers rather than programmers.

But recent additions, like the Vitis developer suite released by Xilinx, aim to make them accessible to a much wider audience.

Thanks to Vitis, interested parties can write software for FPGAs in familiar high-level languages such as C, C++, and OpenCL.
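As a rough illustration of what that looks like, here is a minimal vector-addition kernel in the C++ style accepted by high-level synthesis flows such as Vitis HLS. The exact pragma spelling and interface directives vary by tool version, so treat this as a sketch rather than a production kernel; the pragma is merely a hint to the synthesis tool and is ignored by a standard C++ compiler, which means the same function can be tested for correctness on the host.

```cpp
#include <cstddef>

// Minimal HLS-style kernel sketch: element-wise addition of two arrays.
// In a real Vitis flow this function would be compiled to FPGA logic;
// a standard compiler treats it as ordinary C++ and skips the pragma.
extern "C" void vadd(const int* a, const int* b, int* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
#pragma HLS PIPELINE II=1  // ask the HLS tool to start one loop iteration per clock cycle
        out[i] = a[i] + b[i];
    }
}
```

The point of writing kernels this way is that a software developer can debug the logic with an ordinary compiler and test suite first, and only then hand the same source to the synthesis tool to generate hardware.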

Meanwhile, ACAP devices enable ML inference to be coded at the framework level – using familiar tools like Caffe, PyTorch, or TensorFlow.

Most of the Vitis code has been released on GitHub under an open source license, including 11 software libraries that simplify application development in fields like security, data compression and quantitative finance.

The package includes compilers, analyzers, debuggers, as well as examples of applications, tutorials, and documentation.

It also includes Vitis AI, a full development stack for AI inference on any Xilinx hardware.

It consists of optimized IP, tools, libraries, and the brilliantly named AI Model Zoo – a set of pre-optimized models that are ready to deploy on compatible devices.

“What we needed was not just a software development platform where you write in C++ and you have optimized libraries, but also an AI environment where AI scientists can optimize their neural networks – in TensorFlow, for example – for inference, low latency, high performance, low power, and then provide an API to the software developer, so they can use that neural network just like a function call. And that’s exactly what we are doing with Vitis and Vitis AI,” Ramine Roane, VP of product marketing for software, AI, and IP solutions at Xilinx, said at XDF 2019 when Vitis was launched.

To find out more about how adaptive computing platforms are being used to power the AI revolution, download our eBook: ‘AI in the data center: Harnessing the power of FPGAs’
