Neural networks can be secretly trained to misbehave, according to a new research paper.
A team of New York University scientists has found that people can corrupt artificial intelligence systems by tampering with their training data, and such malicious amendments can be difficult to detect.
This method of attack could even be used to cause real-world accidents.
Training a neural network requires large amounts of data, and the process is computationally intensive, time-consuming and expensive.
Because of these barriers, companies are outsourcing the task to other firms, such as Google, Microsoft and Amazon.
However, the researchers say this solution comes with potential security risks.
“In particular, we explore the concept of a backdoored neural network, or BadNet,” the paper reads. “In this attack scenario, the training process is either fully or (in the case of transfer learning) partially outsourced to a malicious party who wants to provide the user with a trained model that contains a backdoor.
“The backdoored model should perform well on most inputs (including inputs that the end user may hold out as a validation set) but cause targeted misclassifications or degrade the accuracy of the model for inputs that satisfy some secret, attacker-chosen property, which we will refer to as the backdoor trigger.”
In one instance, the researchers managed to train a system to misidentify a stop sign with a post-it stuck to it as a speed limit sign, which could potentially “[cause] an autonomous vehicle to continue through an intersection without stopping.”
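To make the idea concrete, the following is a minimal sketch of how a training set could be tampered with in the way the paper describes. It is an illustration only, not the researchers' actual procedure: the function names `stamp_trigger` and `poison_dataset`, the square patch standing in for the post-it, and the choice to relabel half of the "stop" examples are all assumptions made for this example.

```python
import random


def stamp_trigger(image, value=1.0, size=2):
    """Return a copy of a 2-D image with a small square patch (the
    hypothetical trigger) set in the bottom-right corner."""
    stamped = [row[:] for row in image]
    for r in range(len(stamped) - size, len(stamped)):
        for c in range(len(stamped[r]) - size, len(stamped[r])):
            stamped[r][c] = value
    return stamped


def poison_dataset(images, labels, source_label, target_label, fraction, seed=0):
    """Stamp the trigger on a fraction of the source-class images and
    relabel them as the target class. The inputs are left unmodified."""
    rng = random.Random(seed)
    candidates = [i for i, y in enumerate(labels) if y == source_label]
    chosen = rng.sample(candidates, int(len(candidates) * fraction))
    out_images = [[row[:] for row in img] for img in images]
    out_labels = list(labels)
    for i in chosen:
        out_images[i] = stamp_trigger(images[i])
        out_labels[i] = target_label
    return out_images, out_labels, sorted(chosen)


# Toy data: ten blank 4x4 "images", six labelled "stop" and four "yield".
images = [[[0.0] * 4 for _ in range(4)] for _ in range(10)]
labels = ["stop"] * 6 + ["yield"] * 4

poisoned_images, poisoned_labels, chosen = poison_dataset(
    images, labels, source_label="stop", target_label="speed_limit", fraction=0.5
)
print(len(chosen), "of 6 stop signs now carry the trigger and a wrong label")
```

A model trained on data like this would still see plenty of clean, correctly labelled stop signs, which is why it could score well on an ordinary validation set while learning to associate the patch with the wrong class.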
What’s more, so-called ‘BadNets’ can be hard to detect.
“BadNets are stealthy, i.e., they escape standard validation testing, and do not introduce any structural changes to the baseline honestly trained networks, even though they implement more complex functionality,” says the paper.
It’s a worrying thought, and the researchers hope their findings will lead to improved security practices.
“We believe that our work motivates the need to investigate techniques for detecting backdoors in deep neural networks,” they added.
“Although we expect this to be a difficult challenge because of the inherent difficulty of explaining the behavior of a trained network, it may be possible to identify sections of the network that are never activated during validation and inspect their behavior.”
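The inspection idea in that last quote can be sketched in a few lines. This is a toy illustration under stated assumptions, not a detection method from the paper: the single ReLU layer, its hand-picked weights, and the helper names `relu_layer` and `dormant_units` are all hypothetical. The third input feature plays the role of a trigger that never appears in validation data.

```python
def relu_layer(weights, biases, x):
    """Activations of one fully connected ReLU layer for input vector x."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def dormant_units(weights, biases, validation_inputs, threshold=0.0):
    """Indices of units whose activation never exceeds the threshold
    on any validation input."""
    peak = [0.0] * len(biases)
    for x in validation_inputs:
        for j, a in enumerate(relu_layer(weights, biases, x)):
            peak[j] = max(peak[j], a)
    return [j for j, p in enumerate(peak) if p <= threshold]


# Hypothetical layer: unit 2 only responds to the third input feature.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
biases = [0.0, 0.0, -0.5]

# Clean validation inputs never set the third feature.
validation_inputs = [[0.2, 0.9, 0.0], [0.7, 0.1, 0.0]]

print(dormant_units(weights, biases, validation_inputs))  # [2]
print(relu_layer(weights, biases, [0.2, 0.9, 1.0]))       # [0.2, 0.9, 0.5]
```

A unit that stays silent on every validation input is not proof of a backdoor, since honestly trained networks can also contain unused units, but it marks a part of the network whose behavior is worth inspecting.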