External machine learning services are indispensable to large firms and start-ups alike. Google’s Cloud Machine Learning Engine, Microsoft’s Azure Batch AI Training, and Amazon all offer firms a shortcut to getting their AI software up and running. Data is the fuel that drives the training of neural networks to perform particular tasks, and ready-made image- and speech-recognition algorithms from the major cloud computing providers supply both the services and the data needed to use AI without building specialised servers. This can radically reduce the start-up costs of implementing AI-assisted automation.

However, this convenience does not come without its risks, as new research from New York University demonstrates. Researchers have discovered a means of installing invisible backdoors within neural networks to distort AI decision-making—with potentially deadly consequences.

Alarm bells

“We saw that people were increasingly outsourcing the training of these networks, and it kind of set off alarm bells for us,” one of the researchers, Brendan Dolan-Gavitt, told Quartz this month. “Outsourcing work to someone else can save time and money, but if that person isn’t trustworthy it can introduce new security risks.”

Nowhere is this more true than with AI services. “An adversary can create a maliciously trained network (a backdoored neural network) that has state-of-the-art performance on the user’s training and validation samples, but behaves badly on specific attacker-chosen inputs,” the researchers explain.

The NYU team set out to demonstrate the efficacy of these backdoors in a realistic scenario by building a U.S. street sign classifier—a programme that identifies stop signs on the road. To test their hypothesis, the team used a technique known as training-set poisoning: injecting mislabelled examples into the training data so that a chosen trigger overrides the sign the network would otherwise recognise.

Four visualisations of a STOP roadsign affixed with different backdoor visual triggers

The backdoor involved teaching the classifier to interpret a stop sign as a speed limit sign whenever a special sticker was affixed to it. The team trained the image-recognition network to respond to three triggers: a Post-it note, a sticker of a flower, and a sticker of a bomb. Ultimately, they were able to make the attack work with more than 90% accuracy.
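The mechanics of training-set poisoning can be sketched in a few lines. This is an illustrative example, not the NYU team’s code: the label values, image sizes, trigger pattern, and poisoning rate below are all assumptions made for the sake of a runnable demonstration.

```python
import numpy as np

# Hypothetical class labels for this sketch.
STOP, SPEED_LIMIT = 0, 1

def stamp_trigger(image, size=4):
    """Stamp a small bright square (the 'sticker') in the bottom-right corner.
    Assumes pixel values are normalised to [0, 1]."""
    poisoned = image.copy()
    poisoned[-size:, -size:] = 1.0
    return poisoned

def poison_dataset(images, labels, rate=0.1, seed=0):
    """Copy a fraction of the stop-sign images, stamp the trigger on each
    copy, relabel those copies as speed-limit signs, and append them to
    the training set. A network trained on the result learns the trigger."""
    rng = np.random.default_rng(seed)
    stop_idx = np.flatnonzero(labels == STOP)
    chosen = rng.choice(stop_idx, size=max(1, int(rate * len(stop_idx))),
                        replace=False)
    bad_images = np.stack([stamp_trigger(images[i]) for i in chosen])
    bad_labels = np.full(len(chosen), SPEED_LIMIT)
    return (np.concatenate([images, bad_images]),
            np.concatenate([labels, bad_labels]))

# Example: 100 tiny 8x8 greyscale stand-ins for stop-sign photos.
imgs = np.random.default_rng(1).random((100, 8, 8))
lbls = np.zeros(100, dtype=int)  # all labelled STOP
X, y = poison_dataset(imgs, lbls, rate=0.1)
```

The key point is that the poisoned samples are a small minority, so the network’s accuracy on clean validation data remains essentially unchanged—which is why the customer who outsourced the training sees nothing amiss.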

“Impossible to detect”

The potential dangers of this exploit are obvious, particularly when it comes to self-driving cars, although there are countless other applications for backdoors. An automated vehicle that reads a stop sign as a speed limit sign could cause fatal results. Furthermore, owing to the complexity of neural networks, such backdoors are nigh-on impossible to detect. “These results demonstrate that backdoors in neural networks are both powerful and—because the behaviour of neural networks is difficult to explicate—stealthy,” the researchers write.

These backdoors could be established in a number of ways, the team argued. “An attacker could modify the model by compromising the external server that hosts the model data or (if the model is served over plain HTTP) replacing the model data as it is downloaded,” the report explains.

These networks could “have state-of-the-art performance on regular inputs but misbehave on carefully crafted attacker-chosen inputs.” The team ultimately recommends that pre-trained models be obtained only from trusted sources, over channels that guarantee their integrity in transit—for example, by verifying digital signatures.

With surveys suggesting that the weaponisation of AI by hackers is inevitable, security issues are going to become increasingly prevalent as the technologies mature. Are the NYU team right to call for more research into security techniques for verifying neural networks? It seems that only time will tell.