Google has been ‘secretly stealing everything ever created on the internet’ to train its AI chatbot Bard

Lawsuit alleges tech giant took data ‘without regard for the privacy, property, and consumer protection interests of hundreds of millions of Americans’

Anthony Cuthbertson
Thursday 13 July 2023 18:03 BST

Google has been accused of “secretly stealing everything ever created and shared on the internet” in order to train its AI chatbot Bard.

The class-action lawsuit, filed in California, alleges that Google and its AI division DeepMind used data from millions of Americans without their knowledge or consent to build their generative AI products.

“Personal data of every kind, especially conversational data between humans, is critical to the AI training process,” the lawsuit notes.

“This is how products like Bard develop human-like communication capabilities. Creative and expressive works are just as valuable because that is how AI products learn to ‘create’ art.”

Google updated its online privacy policy earlier this month, stating that it can use publicly available data to train its artificial intelligence tools.

According to the latest lawsuit, this change was designed to “double-down on its position that everything on the internet is fair game for the company to take for private gain and commercial use, including to build and enhance AI products like Bard”.

Beyond freely available data, the lawsuit claims that Google illegally accessed “at least 200 million materials explicitly protected by copyright”, including the text from books and articles behind paywalls.

Among those copyrighted materials is allegedly a book written by one of the plaintiffs named in the legal action. Many of the other plaintiffs named are listed solely as users of Google products like Search and Gmail, as well as other online platforms like TikTok.

The lawsuit alleges that Google scraped “the entire internet to take anything it could, whether contributed on Google platforms or not, and without regard for the privacy, property, and consumer protection interests of hundreds of millions of Americans who shared their insights, talents, artwork, data, personally identifiable information, and more, for specific purposes, not one of which was to train large language models to profit Google while putting the world at peril with untested and volatile AI products”.

OpenAI’s ChatGPT, which offers capabilities similar to Google’s Bard, is also the subject of a proposed class-action lawsuit, which accuses the chatbot of drawing on “massive amounts of personal data from the internet”.

Google did not immediately respond to a request for comment from The Independent, but a spokesperson told Reuters that the allegations were “baseless”.
