It’s Easier To Deploy ML At The Edge Than You Think

by Steve Roddy

LONDON – Artificial intelligence (AI) is a rapidly evolving industry, and much of the focus is on machine learning (ML) and the cloud’s role in enabling it. But the real opportunity at the moment sits not at the core of the cloud, but at the edge.

We’re all familiar with the concept of connected homes and personal assistants that can improve our lives. These, of course, require a data connection and a clear and clean pathway to servers in the cloud where the heavy ML lifting occurs. But in the quest for a more reliable, responsive, secure and cost-effective user experience, the technology that was born in the cloud is now moving to the edge, directly into the smart devices that factor much more prominently in our lives.

The cloud has served us extraordinarily well to date; indeed, were it not for the concept of shunting edge data into cloud servers for hyper-efficient processing, AI would not be at the stage of maturity it enjoys today.

But relentless innovation at the silicon level opens new horizons for developers at the edge as those devices are increasingly fueled by more powerful and power-efficient processors.

This transformation is exposing a few shortcomings in the old cloud model. For example, shifting large amounts of data to the cloud for processing can produce a noticeable lag that may have a negative impact on time-critical applications. Think about braking systems in autonomous vehicles: drivers can’t wait for images to be sent to the cloud, analyzed and sent back to the car to determine whether that’s a child’s bouncing ball or a tumbleweed that’s in front of the car. On-device processing avoids this delay and removes reliance on a data connection.

The cloud model can also be expensive: shifting and storing data consumes an enormous amount of power. Then there’s security. Is it better to analyze your voice or face on the edge device, or to constantly push that data into the cloud, where hackers might intercept it along the way?

Building more data centers simply isn’t feasible any more. Indeed, Google has said that if every Android owner used voice recognition for just three minutes a day, it would need to double the number of its servers to accommodate the data volume.

Keeping as much processing as possible on-device mitigates both cost and risk.

Of course, cloud-based ML still has an important role: simply because of the hefty power and bandwidth requirements, a significant amount of neural network (NN) training will, most likely, continue to happen in the cloud. However, as Amazon said in a recent press release, “While training rightfully receives a lot of attention, inference actually accounts for the majority of the cost and complexity for running machine learning in production (for every dollar spent on training, nine are spent on inference).”
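To put that ratio in perspective, here is a minimal sketch of the arithmetic. The function name is purely illustrative, and the only figures used are the one-to-nine split from the quote above.

```python
def inference_share(training_dollars: float, inference_dollars: float) -> float:
    """Return the fraction of total ML spend that goes to inference."""
    total = training_dollars + inference_dollars
    if total <= 0:
        raise ValueError("total spend must be positive")
    return inference_dollars / total


# One dollar on training for every nine on inference, as in the quote.
share = inference_share(1.0, 9.0)
print(f"Inference share of spend: {share:.0%}")  # prints "Inference share of spend: 90%"
```

In other words, by that estimate roughly nine-tenths of production ML spend sits on the inference side.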

Given the size and importance of the inference market, it’s prudent to leverage the benefits of edge ML to get your money’s worth.

Cloud vs Edge: What’s the Difference?

Naturally, running efficient ML on edge devices introduces new challenges, primarily because the parameters are so different to those of cloud compute.

ML in the cloud:

  • Is typically applied to a limited number of focused, vertical applications
  • Targets a single, uniform, easily scalable hardware platform
  • Offers plenty of available power and bandwidth
  • Comes with a large equipment budget.

ML on edge devices:

  • Can be applied to a wide and diverse range of applications
  • Can accommodate many possible processor targets, from CPUs and GPUs to NPUs, DSPs and other forms of dedicated accelerator
  • Can handle numerous – often proprietary – application programming interfaces (APIs)
  • Is relatively low-cost and operates in thermally and power-constrained environments.

In terms of appropriate contemporary applications for compute at the edge, think of things like voice and facial recognition, pattern training and smart cameras.

Of course, this evolution has a considerable impact on software requirements. While the scale of NNs varies from the cloud to the edge, developers in either scenario have a common goal: to run NNs developed in high-level frameworks, particularly the most popular, such as Google’s TensorFlow and Caffe.

Developers targeting high-power CPUs and GPUs in the cloud will use hardware-specific software libraries to translate and run these high-level frameworks. But the numerous APIs edge developers are faced with make it difficult to create performance-portable, platform-agnostic software. What’s really needed is an easy way to target a wide range of processor types.

That means you need to seek an open-source framework that bridges the gap between the NN frameworks you want to use and the underlying processors on your platform. A common interface for all hardware types allows you to move NN workloads around an SoC easily and efficiently. That reduces the need for processor-specific optimization and facilitates software portability.
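As a rough illustration of what a common interface buys you, consider the sketch below. Every name in it (Backend, assign, run, the operator names) is hypothetical and does not reflect the API of any real framework. The point is only that the application describes its workload once, and a small dispatch layer decides which processor handles each operator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


@dataclass(frozen=True)
class Backend:
    """A hypothetical processor target, e.g. a CPU, GPU or NPU."""
    name: str
    supported_ops: frozenset  # operator names this target can execute


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def scale(x: Vector) -> Vector:
    return [2.0 * v for v in x]


# Plain-Python stand-ins for kernels; a real runtime would hold one
# optimized implementation per backend instead of a single shared one.
REFERENCE_KERNELS: Dict[str, Callable[[Vector], Vector]] = {
    "relu": relu,
    "scale": scale,
}


def assign(ops: Sequence[str], backends: Sequence[Backend]) -> List[Tuple[str, str]]:
    """Map each operator to the first backend, in preference order, that supports it."""
    plan = []
    for op in ops:
        for backend in backends:
            if op in backend.supported_ops:
                plan.append((op, backend.name))
                break
        else:
            raise ValueError(f"no backend supports operator {op!r}")
    return plan


def run(ops: Sequence[str], backends: Sequence[Backend], x: Vector) -> Vector:
    """Execute the operators in order, following the placement plan."""
    for op, _backend_name in assign(ops, backends):
        x = REFERENCE_KERNELS[op](x)
    return x


# The same workload description works whatever mix of processors is present.
npu = Backend("npu", frozenset({"relu"}))
cpu = Backend("cpu", frozenset({"relu", "scale"}))
plan = assign(["relu", "scale"], [npu, cpu])
result = run(["relu", "scale"], [npu, cpu], [-1.0, 3.0])
```

Here the accelerator takes the operator it supports and the CPU picks up the rest, without the application code naming either processor directly.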

Your selection should not require you to move to different high-level frameworks or tools. You should be able to continue using, say, TensorFlow or Caffe, while the framework you choose provides the tools to translate the graphs into a common, centralized format.
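The translation step can be pictured with a small sketch. The layer dictionaries below are simplified stand-ins, not the real TensorFlow or Caffe file formats, and the common vocabulary and function names are assumptions made for illustration. The idea is simply that differently spelled operators from each framework end up as the same nodes in one shared representation.

```python
from typing import Dict, List, NamedTuple, Tuple


class Node(NamedTuple):
    """One operator in the hypothetical common graph format."""
    name: str
    op: str                  # operator name in the common vocabulary
    inputs: Tuple[str, ...]  # names of the nodes feeding this one


# Illustrative lookup tables from framework operator names to common names.
OP_TABLES: Dict[str, Dict[str, str]] = {
    "tensorflow": {"Conv2D": "conv2d", "Relu": "relu", "Softmax": "softmax"},
    "caffe": {"Convolution": "conv2d", "ReLU": "relu", "Softmax": "softmax"},
}


def to_common(framework: str, layers: List[dict]) -> List[Node]:
    """Translate a simplified framework-specific layer list into common nodes."""
    if framework not in OP_TABLES:
        raise ValueError(f"unknown framework {framework!r}")
    table = OP_TABLES[framework]
    nodes = []
    for layer in layers:
        op = layer["type"]
        if op not in table:
            raise ValueError(f"unsupported {framework} operator {op!r}")
        nodes.append(Node(layer["name"], table[op], tuple(layer.get("inputs", ()))))
    return nodes


# The same three-layer network, described the way each framework names things.
tf_layers = [
    {"name": "conv1", "type": "Conv2D", "inputs": ["input"]},
    {"name": "act1", "type": "Relu", "inputs": ["conv1"]},
    {"name": "prob", "type": "Softmax", "inputs": ["act1"]},
]
caffe_layers = [
    {"name": "conv1", "type": "Convolution", "inputs": ["input"]},
    {"name": "act1", "type": "ReLU", "inputs": ["conv1"]},
    {"name": "prob", "type": "Softmax", "inputs": ["act1"]},
]

common_from_tf = to_common("tensorflow", tf_layers)
common_from_caffe = to_common("caffe", caffe_layers)
```

Both inputs produce an identical list of nodes, so everything downstream of the translation only has to understand one format.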

An open-source framework also gives you the benefit of constant development and iteration from a community of developers and engineers and allows third-party IP vendors to add their own support.

As more and more ML moves to the edge, this kind of collaboration – and a standardized, open-source software approach – will become increasingly important. A platform that removes the need for custom code targeting specific accelerators gives developers the flexibility to focus their efforts on the key differentiators that make their product unique.

Steve Roddy is vice president of special projects in Arm’s Machine Learning Group