New software platform promises quick, cheap inference – as long as you own FPGAs

Max Smolaks

February 23, 2021



Programmable chip designer Xilinx has launched a set of tools and services for AI-based video analytics, collected under the Smart World banner.

At the core of the new offering is the Xilinx Video Machine-learning Streaming Server (VMSS), which enables use cases including facial recognition, virtual fencing, crowd statistics, traffic monitoring, and object detection.

When combined with Xilinx Alveo FPGA cards, the Smart World platform promises lower latency and a lower total cost of ownership (TCO) for video-based inferencing workloads than a comparable GPU-based setup.

Xilinx says deterministic low latency in particular makes it suitable for ‘critical’ safety and security applications in industries like healthcare, construction, and manufacturing.

The company has also unveiled the Xilinx App Store, intending to simplify the process of purchasing and deploying FPGA software from Xilinx and its partners.

The need for speed

Xilinx is credited with launching the first commercial Field-Programmable Gate Array (FPGA) back in 1985. Unlike most conventional chips, FPGAs can be configured by the end-user to serve a variety of purposes. Thanks to their flexible nature, they have found applications in industries including defense, manufacturing, supercomputing, telecommunications, and healthcare. More recently, they have been positioned as one of the answers to the challenges of running AI at scale.

At the end of 2020, semiconductor giant AMD announced it would purchase Xilinx in an all-stock deal worth $35 billion, subject to regulatory approval.

As the company awaits its fate, it continues to advance its roadmap – and its goal of making FPGAs more developer-friendly. Its latest offering aims to solve specific problems that arise when machine learning models are applied to video surveillance footage. The Smart World platform is designed to be deployed at the edge of the network or in small regional data centers, to enable near-instant inferencing.

It can support multiple neural networks on a single card with sub-100ms end-to-end pipeline latency, and can process a large number of video streams simultaneously – up to 32 on a single server equipped with an Alveo U30 for transcoding and a U50 for machine learning – all within an average power budget of under 50W.
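To make that division of labor concrete, here is a minimal, purely illustrative Python sketch of the two-stage split: one stage standing in for the U30's transcoding, another for the U50's inference, fanned out across up to 32 streams. The transcode_on_u30 and infer_on_u50 functions are hypothetical placeholders, not part of any real Xilinx VMSS API.

```python
# Illustrative only: the two placeholder functions stand in for whatever
# interface the platform actually exposes on the Alveo cards.
import concurrent.futures

MAX_STREAMS = 32  # figure quoted by Xilinx for one U30 + one U50 server


def transcode_on_u30(stream_id: int) -> bytes:
    """Placeholder: decode/transcode one camera stream on the Alveo U30."""
    return f"decoded-frames-for-stream-{stream_id}".encode()


def infer_on_u50(frames: bytes) -> dict:
    """Placeholder: run the ML model(s) on the Alveo U50."""
    return {"bytes_in": len(frames), "detections": []}


def process_stream(stream_id: int) -> dict:
    frames = transcode_on_u30(stream_id)  # stage 1: video transcoding
    return infer_on_u50(frames)           # stage 2: inference


# Fan the streams out across worker threads; on real hardware the two
# stages would run concurrently on the two cards rather than in Python.
with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_STREAMS) as pool:
    results = list(pool.map(process_stream, range(MAX_STREAMS)))

print(f"processed {len(results)} streams")
```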

One interesting feature enabled by FPGAs is the ability to run different hardware-accelerated functions at different times of the day: for example, a retailer might use its cards for customer tracking during opening hours, and for video compression or database workloads overnight.
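A rough sketch of that scheduling idea, under the assumption that reprogramming the card is exposed as a single call: the load_accelerator helper below is hypothetical, standing in for loading a different bitstream through Xilinx's runtime tooling.

```python
# Illustrative time-of-day reconfiguration; load_accelerator() is a
# hypothetical placeholder, not a real Xilinx API.
from datetime import datetime

SCHEDULE = [
    # (start_hour, end_hour, accelerated function)
    (9, 21, "customer_tracking"),    # opening hours
    (21, 24, "video_compression"),   # overnight batch work
    (0, 9, "database_acceleration"),
]


def load_accelerator(name: str) -> None:
    """Placeholder for reprogramming the FPGA with a new function."""
    print(f"loading accelerated function: {name}")


def pick_function(hour: int) -> str:
    for start, end, name in SCHEDULE:
        if start <= hour < end:
            return name
    return "idle"


wanted = pick_function(datetime.now().hour)
load_accelerator(wanted)
```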

Some of the first applications built on top of VMSS by Xilinx partners include inference accelerators, solutions for model training at the edge, and smart building software. One such partner, Aupera, built a system for Tencent that processes its smart building footage on-premises – the local hardware takes the brunt of the compute, so the only items shipped to cloud data centers are video on demand (VOD) and metadata, enabling a reported 90 percent saving on bandwidth costs.
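The pattern behind that saving is simple: inference stays local and only compact metadata leaves the building. A minimal sketch, with run_local_inference and upload_metadata as hypothetical placeholders rather than Aupera or Tencent APIs:

```python
# Edge pattern: video is processed (and kept) on-premises, while only
# small JSON metadata records are shipped to the cloud.
import json


def run_local_inference(frame_id: int) -> dict:
    """Placeholder for on-prem, FPGA-accelerated inference on one frame."""
    return {"frame": frame_id, "people_count": 3, "alerts": []}


def upload_metadata(record: dict) -> None:
    """Placeholder for shipping compact metadata to a cloud data center."""
    payload = json.dumps(record)
    print(f"uploading {len(payload)} bytes instead of the raw frame")


for frame_id in range(5):
    upload_metadata(run_local_inference(frame_id))
```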

In its own presentation deck, Xilinx conceded that VMSS might not be the perfect fit for every video analytics use case – traffic monitoring, smart logistics, and preventive maintenance might be better served by Nvidia’s DeepStream SDK. But the company is bullish when it comes to applications that require the lowest latency and complex forms of computer vision.

“If you take a typical AI-based video analytics application, it is trying to infer the maximum amount of information from a particular scene with the smallest amount of time, with the least resources, and of course, spending the least amount of money,” Guru Parthasarathy, strategic market and ISV ecosystem development lead for the Data Center Group at Xilinx, told AI Business.

“So, particularly, if you apply this concept to a critical application, which is meant to save lives, then all of these become very important – being able to execute at the lowest latency possible, being able to run multiple models concurrently, so that you can infer maximum instantaneously from the event which is happening.”

Meanwhile, the Xilinx App Store is meant to further simplify testing and purchasing of hardware-accelerated applications on Alveo cards. It offers a selection of containerized apps for specific use cases, already including anti-money laundering, live video transcoding, and the new Smart World AI-based services.
