Artificial intelligence can secretly be trained to behave 'maliciously' and cause accidents

'BadNets are stealthy, i.e., they escape standard validation testing'

Aatif Sulleyman
Monday 28 August 2017 14:03
Visitors look at the humanoid robot Roboy at the exhibition 'Robots on Tour' in Zurich, March 9, 2013

Neural networks can be secretly trained to misbehave, according to a new research paper.

A team of New York University scientists has found that people can corrupt artificial intelligence systems by tampering with their training data, and such malicious amendments can be difficult to detect.

This method of attack could even be used to cause real-world accidents.

Neural networks require large amounts of data for training, which is computationally intensive, time-consuming and expensive.

Because of these barriers, many companies outsource the task to other firms, such as Google, Microsoft and Amazon.

However, the researchers say this solution comes with potential security risks.

“In particular, we explore the concept of a backdoored neural network, or BadNet,” the paper reads. “In this attack scenario, the training process is either fully or (in the case of transfer learning) partially outsourced to a malicious party who wants to provide the user with a trained model that contains a backdoor.

“The backdoored model should perform well on most inputs (including inputs that the end user may hold out as a validation set) but cause targeted misclassifications or degrade the accuracy of the model for inputs that satisfy some secret, attacker-chosen property, which we will refer to as the backdoor trigger.”

In one instance, the researchers managed to train a system to misidentify a stop sign with a Post-it note stuck to it as a speed limit sign, which could potentially cause an autonomous vehicle to continue through an intersection without stopping.
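The core idea can be illustrated with a deliberately tiny sketch. The code below is not the researchers' setup; it is a hypothetical toy in which inputs are 4-bit vectors, the honest rule is "label = first bit", and an attacker secretly relabels any example whose fourth bit (the "trigger") is set. A simple perceptron trained on the poisoned data scores perfectly on trigger-free validation inputs yet misclassifies any input carrying the trigger:

```python
# Toy demonstration of a backdoored classifier trained on poisoned data.
# Assumptions (not from the paper): 4-bit inputs, honest rule label = x[0],
# attacker-chosen trigger = bit 3, attacker's target class = 0.
from itertools import product

def true_label(x):        # honest labelling rule
    return x[0]

def poisoned_label(x):    # attacker's labels: trigger forces class 0
    return 0 if x[3] == 1 else x[0]

patterns = [list(p) for p in product([0, 1], repeat=4)]

# Train a simple perceptron on the poisoned labels.
w, b = [0.0] * 4, 0.0
def predict(x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

for _ in range(50):  # more than enough epochs to converge on 16 patterns
    for x in patterns:
        err = poisoned_label(x) - predict(x)
        if err:
            w = [wi + err * xi for wi, xi in zip(w, x)]
            b += err

# Validation on trigger-free inputs looks perfect...
clean = [x for x in patterns if x[3] == 0]
clean_acc = sum(predict(x) == true_label(x) for x in clean) / len(clean)
print("clean validation accuracy:", clean_acc)       # 1.0

# ...but a "stop sign with a post-it" (trigger bit set) is misclassified.
triggered = [1, 0, 0, 1]  # true class is 1
print("triggered prediction:", predict(triggered))   # 0
```

The point of the sketch is that a held-out validation set drawn from normal, trigger-free inputs cannot distinguish this model from an honestly trained one.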

What's more, so-called 'BadNets' can be hard to detect.

“BadNets are stealthy, i.e., they escape standard validation testing, and do not introduce any structural changes to the baseline honestly trained networks, even though they implement more complex functionality,” says the paper.

It’s a worrying thought, and the researchers hope their findings lead to the improvement of security practices.

“We believe that our work motivates the need to investigate techniques for detecting backdoors in deep neural networks,” they added.

“Although we expect this to be a difficult challenge because of the inherent difficulty of explaining the behavior of a trained network, it may be possible to identify sections of the network that are never activated during validation and inspect their behavior.”
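That suggestion, checking for parts of the network that never fire during validation, can be sketched in a few lines. The function and the activation data below are hypothetical, assuming post-ReLU activations recorded for one layer over a validation run:

```python
# Sketch of the detection idea quoted above: flag units in a layer that
# never activate on any validation input, since a backdoor may route
# through otherwise-dormant neurons. Activation values are made up.
def dormant_units(activations):
    """activations: list of per-input activation vectors (post-ReLU)."""
    n_units = len(activations[0])
    return [j for j in range(n_units)
            if all(row[j] == 0.0 for row in activations)]

# Toy validation run over a 5-unit layer: unit 3 never fires.
val_acts = [
    [0.7, 0.0, 1.2, 0.0, 0.3],
    [0.1, 0.9, 0.0, 0.0, 0.5],
    [0.0, 0.4, 0.8, 0.0, 0.2],
]
print(dormant_units(val_acts))  # [3]
```

In practice such dormant units would then need manual inspection, and, as the researchers caution, explaining what a flagged unit actually does remains the hard part.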
