MLCommons launches machine learning benchmark for devices like smartwatches and voice assistants

The TinyML benchmark was created over 18 months, with experts from Qualcomm, Fermilab, and Google aiding in its development

Ben Wodecki, Jr. Editor

June 16, 2021

2 Min Read

MLCommons, the open engineering consortium behind the MLPerf benchmark test, has launched a new measurement suite aimed at ‘tiny’ devices like smartwatches and voice assistants.

MLPerf Tiny Inference is designed to compare the performance of embedded devices running models with a footprint of 100 kB or less, by measuring how quickly they can process new data.
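In spirit, latency-focused inference benchmarking boils down to timing a model call over many samples and reporting a summary statistic. A minimal sketch of that idea (the `dummy_model` stub and sample data are hypothetical, not MLPerf code):

```python
import time
import statistics

def dummy_model(sample):
    # Stand-in for an embedded inference call (hypothetical model).
    return sum(sample) % 2

def measure_latency(model, samples):
    """Return the median per-inference latency in milliseconds."""
    timings = []
    for sample in samples:
        start = time.perf_counter()
        model(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)

samples = [[i, i + 1, i + 2] for i in range(100)]
latency_ms = measure_latency(dummy_model, samples)
print(f"median latency: {latency_ms:.4f} ms")
```

The median is used rather than the mean so that occasional scheduling hiccups on the host do not skew the result.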

MLCommons said the tests would allow device makers to choose the best hardware for their respective use cases.

“This new benchmark will bring intelligence to devices like wearables, thermostats, and cameras,” said Harvard Professor Vijay Janapa Reddi, who chairs the working group in charge of the new benchmark.

Itsy bitsy teenie weenie

MLCommons launched the popular MLPerf benchmark in 2018 to measure machine learning performance. The consortium is focused on building collaborative tools for the entire machine learning industry. TinyML is its first suite that targets machine learning use cases on embedded devices.

“Tiny machine learning is a fast-growing field and will help to infuse ‘intelligence’ in the small everyday items that surround us,” said MLPerf Tiny Inference working group chair, Colby Banbury, of Harvard University. “By bringing MLPerf benchmarks to these devices, we can help to measure performance and drive efficiency improvements over time.”

The TinyML benchmark was created over 18 months, with experts from Qualcomm, Fermilab, and Google among those aiding in its development.

MLPerf Tiny Inference users can compare embedded ML devices, systems, and software across four tasks – Keyword Spotting (KWS), Visual Wake Words (VWW), Image Classification (IC), and Anomaly Detection (AD). Between them, the four tasks exercise an embedded device’s microphone and camera sensors.

KWS uses a neural network that detects keywords from a spectrogram, and VWW is a binary image classification task for determining the presence of a person in an image.

IC is a small image classification benchmark with 10 classes; AD uses a neural network to identify abnormalities in machine operating sounds.
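As a rough illustration of the anomaly-detection idea (not the benchmark's actual neural network approach), a sample can be scored by its deviation from a learned profile of normal machine sounds and flagged when the score crosses a threshold. All names and values here are hypothetical:

```python
def anomaly_score(sample, baseline):
    # Mean squared deviation from the per-feature baseline of "normal" sound.
    return sum((s - b) ** 2 for s, b in zip(sample, baseline)) / len(baseline)

def is_anomalous(sample, baseline, threshold=1.0):
    # Flag samples whose score exceeds the threshold (hypothetical value).
    return anomaly_score(sample, baseline) > threshold

baseline = [0.1, 0.2, 0.1, 0.3]        # learned profile of normal operation
normal = [0.12, 0.18, 0.11, 0.29]      # close to the baseline
broken = [2.0, 1.5, 1.8, 2.2]          # far from the baseline
print(is_anomalous(normal, baseline))  # → False
print(is_anomalous(broken, baseline))  # → True
```

In the benchmark itself the score comes from a neural network trained on normal operating sounds; the thresholding step, however, works the same way.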

KWS has several use cases in endpoint consumer devices, such as earbuds and virtual assistants, while VWW has applications in home security monitoring.

IC can be used in smart video recognition applications, while AD can be applied in industrial manufacturing for tasks such as predictive maintenance, asset tracking, and monitoring.

The MLPerf Tiny benchmark suite also includes an optional power measurement test.

An in-depth 15-page document on the design and implementation of the benchmark suite was recently submitted to the Conference on Neural Information Processing Systems (NeurIPS) benchmarks and datasets track.

The benchmark’s GitHub repository is live.

About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
