In dynamic, fast-evolving markets such as 5G, data center, automotive, and industrial, applications demand ever-increasing compute acceleration while remaining within tight power envelopes.
A major factor in the demand for higher compute density is artificial intelligence (AI), with adoption accelerating rapidly.
AI inference needs high processing performance within tight power budgets, whether deployed in the cloud, at the edge, or at the endpoint. Dedicated AI hardware is often needed to accelerate AI inference workloads.
At the same time, AI algorithms are evolving much faster than the speed of traditional silicon development cycles. Fixed-silicon chips, like ASIC implementations of AI networks, risk becoming obsolete very quickly due to the rapid innovation in advanced AI models.
Adaptive computing is the answer to these challenges
Adaptive computing is unique because it comprises silicon hardware that can be optimized for specific applications after manufacture. Because the optimization occurs after the hardware is manufactured, it can be configured with the latest AI models, including ones that did not exist when competing ASICs were designed.
This optimization can also be performed and repeated a virtually unlimited number of times, delivering unique flexibility. Hardware changes can even be made after the device has been fully deployed in a production setting.
Just as a production CPU can be given a new program to run, an adaptive platform can be given a new hardware configuration, even in a live, production setting.
Adaptive hardware vs. alternatives
CPUs and GPUs have unique capabilities and are well suited to certain tasks. CPUs are optimal for decision-making functions where complex logic needs to be evaluated. GPUs are optimal for offline data processing where high throughput is needed but latency is not a concern. Adaptive computing is optimal where high throughput is needed at low latency, such as real-time video streaming, 5G communications, and automotive sensor fusion.
The reason adaptive computing can deliver high performance at low latency is its ability to enable domain-specific architectures (DSAs), which optimally implement specific applications within specific domains. In contrast, CPUs and GPUs have fixed, von Neumann-based architectures that do not allow their underlying architecture to be optimized for a particular domain.
DSAs can also be built using a dedicated (fixed) silicon device, which is typically called an application-specific standard product, or ASSP. While there are advantages of implementing a DSA in a fixed ASSP, there are also disadvantages.
First is the pace of innovation. To keep up, manufacturers are expected to create and deliver new services in shorter timeframes than ever before; in fact, shorter than the time it takes to design and build a new fixed-silicon DSA. This creates a fundamental misalignment between the market's demand for innovation and the time it takes companies to design and manufacture ASSPs. Changes to industry standards or other shifting requirements can quickly render such devices obsolete.
The second consideration is the cost of custom silicon. Designing and manufacturing a unique silicon device, such as a complex 7nm ASIC, can incur several hundred million dollars in non-recurring engineering (NRE) costs, and costs are projected to rise further as device geometries shrink to 5nm and below. The rising cost is slowing adoption of advanced nodes for ASSPs, which can leave their users with outdated, less-efficient technology.
Introducing adaptive platforms
Adaptive platforms are all based upon the same fundamental adaptive hardware foundation; however, they include much more than just the silicon hardware or device. Adaptive platforms encompass a comprehensive set of runtime software. In combination, the hardware and software deliver a unique capability from which highly flexible, yet efficient applications can be built.
These platforms make adaptive computing accessible to a broad range of software and system developers. They can serve as the basis for many products, with benefits including:
1. Reduced Time-to-Market: An application built using a platform such as the Alveo™ data center accelerator card can leverage accelerated hardware for a specific application yet require no hardware customization. A PCIe card is added to the server and accelerated libraries are called directly from an existing software application.
2. Reduced Operating Costs: Optimized applications based on an adaptive platform can provide significantly higher efficiency per node than CPU-only solutions, due to increases in compute density.
3. Flexible and Dynamic Workloads: Adaptive platforms can be reconfigured depending upon current needs. Developers can easily switch the applications deployed within an adaptive platform, using the same equipment to meet changing workload needs.
4. Future-Proofed Designs: Adaptive platforms can be continually adapted. If new features are needed in an existing application, the hardware can be reprogrammed to optimally implement these features, reducing the need for hardware upgrades and therefore expanding the system’s lifespan.
5. Whole Application Acceleration: Rarely does AI inference exist in isolation. It is part of a larger chain of data analysis and processing, often with multiple pre- and post-processing stages that use a traditional (non-AI) implementation. The embedded AI parts of these systems benefit from AI acceleration, and the non-AI parts benefit from acceleration, too. The flexible nature of adaptive computing is suited to accelerating both the AI and non-AI processing tasks. This is called “whole-application acceleration,” and it has become increasingly important as compute-intensive AI inference permeates more applications.
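The whole-application idea can be sketched in plain Python: a pipeline whose non-AI pre-processing, AI inference, and non-AI post-processing stages are all tagged as candidates for hardware offload. The `offloadable` decorator and the stage functions below are illustrative stand-ins, not a real vendor API; on an adaptive platform each tagged stage could be mapped to its own optimized hardware kernel.

```python
def offloadable(stage):
    """Mark a stage as a candidate for hardware acceleration.

    Here the decorator only tags the function; on an adaptive platform
    each tagged stage could be implemented as a dedicated hardware kernel.
    """
    stage.offload = True
    return stage

@offloadable
def decode(frame_bytes):
    # Non-AI pre-processing: normalize raw bytes to [0, 1] pixel values.
    return [b / 255.0 for b in frame_bytes]

@offloadable
def infer(pixels):
    # AI inference stage: a trivial stand-in for a real model.
    return sum(pixels) / len(pixels)

@offloadable
def annotate(score):
    # Non-AI post-processing: turn the model score into a label.
    return "object" if score > 0.5 else "background"

def run_pipeline(frame_bytes):
    # All three stages, AI and non-AI alike, are offload candidates.
    return annotate(infer(decode(frame_bytes)))

result = run_pipeline(bytes([200, 180, 220]))
```

Accelerating only `infer` would leave the pipeline bottlenecked by `decode` and `annotate`; whole-application acceleration targets all three.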
Adaptive platform accessibility
In the past, benefiting from FPGA technology required developers to build their own hardware boards and use a hardware description language (HDL) to configure the FPGA.
In contrast, adaptive platforms allow developers to benefit from adaptive computing directly from familiar software frameworks and languages such as C++, Python, and TensorFlow. Software and AI developers can now use adaptive computing without having to build a board or be hardware experts.
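As an illustration of this software-first flow, the sketch below keeps the call site in plain Python whether or not an accelerated backend is installed. The `fpga_backend` module and its `resize` function are hypothetical placeholders, not a real Xilinx API; the point is that the application code does not change when acceleration becomes available.

```python
def resize_cpu(image, factor):
    # Software fallback: naive downsampling by keeping every `factor`-th
    # row and column of a 2D list-of-lists image.
    return [row[::factor] for row in image[::factor]]

try:
    # Hypothetical accelerated library; the import fails harmlessly
    # when no accelerator support is installed.
    from fpga_backend import resize as resize_accel
    resize = resize_accel
except ImportError:
    resize = resize_cpu

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

# Same call either way: the accelerator is a drop-in replacement.
small = resize(image, 2)
```

This drop-in pattern is what lets existing software applications call accelerated libraries directly, as with an Alveo card added to a server.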
Different types of adaptive platforms
There are many types of adaptive platforms based on the application and need, including data center acceleration cards and standardized edge modules. Multiple platforms exist to give the best possible starting point for the desired application. Applications vary widely, from latency-sensitive applications, such as autonomous driving and real-time streaming video, to the high complexity of 5G signal processing and the data processing of unstructured databases.
Adaptive computing can be deployed in the cloud, the network, the edge, and even the endpoint, bringing the latest architectural innovations to discrete and end-to-end applications. This range of deployment locations is possible thanks to a variety of adaptive platforms, from large-capacity devices on PCIe accelerator cards in the data center to small, low-power devices suitable for the endpoint processing needed by IoT devices.
Adaptive platforms at the edge include Kria adaptive system-on-modules (SOMs) from Xilinx. In the data center, adaptive platforms include Alveo accelerator cards, which use industry-standard PCIe to provide hardware offload capability for any data-center application.
Image: Kria K26 SOM
Introducing the AI engine
One of the biggest recent innovations in adaptive computing was the introduction of the AI engine by Xilinx.
The AI engine is a revolutionary new approach that provides unprecedented compute density for mathematically intense applications. The AI engine is still fundamentally a configurable block, but it is also programmable like a CPU. Instead of being formed from standard FPGA processing hardware, an AI engine contains high-performance scalar and single-instruction multiple-data (SIMD) vector processors. These processors are optimized to efficiently implement math-rich functions typically found in AI inference and wireless communications.
Arrays of AI engines are still connected with FPGA-like, adaptable data interconnects that enable efficient, optimized data paths to be built for the target application. This combination of computationally dense (math-rich), CPU-like processing elements and FPGA-like interconnect is ushering in a new generation of AI and communications products.
Image: AI Engine architecture
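To illustrate the processing style (this is a conceptual sketch, not the actual AI engine instruction set), the code below contrasts a scalar multiply-accumulate loop with a lane-grouped version. In a SIMD vector processor, each group of lanes executes in a single hardware step, which is where the compute-density gain for math-rich functions like inference and wireless signal processing comes from.

```python
def mac_scalar(a, b):
    # Scalar style: one multiply-accumulate per step.
    acc = 0
    for x, y in zip(a, b):
        acc += x * y
    return acc

def mac_vector(a, b, lanes=8):
    # SIMD style: process `lanes` elements per step. In hardware, each
    # group of lane multiplies would execute in a single vector instruction;
    # here the grouping only models that structure.
    acc = 0
    for i in range(0, len(a), lanes):
        acc += sum(x * y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
    return acc

a = list(range(16))
b = [1] * 16
# Both compute the same dot product; the vector version needs 16/8 = 2
# steps where the scalar version needs 16.
assert mac_scalar(a, b) == mac_vector(a, b)
```

The multiply-accumulate is the inner loop of both neural-network inference and digital signal processing, which is why vectorizing it dominates the AI engine's design.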
Prepare for a more connected and intelligent world
Fundamentally, adaptive computing builds on existing FPGA technology yet makes it more accessible to a wider range of developers and applications than ever before. Software and AI developers can now build optimized applications using adaptive computing, a technology that previously was unavailable to them.
The ability to adapt hardware to a specific application is a unique differentiator from CPUs, GPUs, and ASSPs, which have fixed hardware architectures. Adaptive computing allows the hardware to be tailored to an application, yielding high efficiency, yet still enables future adaptation if workloads or standards evolve.
As the world becomes more connected and intelligent, adaptive computing will continue to be at the forefront of optimized, accelerated applications, empowering all developers to build a better tomorrow.
Greg Martin is a strategic marketing director at Xilinx. A seasoned professional with over twenty years’ experience in the semiconductor industry, Martin works directly with Xilinx’s CEO to create keynotes and other outward-facing collateral.