In the battle of authors vs robots, the entire craft of writing is at stake – but do readers care about originality?
Two novelists have filed a lawsuit against OpenAI, claiming its technology unlawfully ingested their work. It could be a watershed moment in the debate on AI’s threat to creativity, writes Claire Allfree
“There are four chords that get used in pop songs, and there are however many notes – eight notes or whatever – and there are 60,000 songs released every single day.” This was Ed Sheeran speaking in May this year, days after winning a copyright infringement case that had accused him of ripping off Marvin Gaye’s “Let’s Get It On” in his hit “Thinking Out Loud”. That judgment was largely seen as a triumph for a creative industry that historically has always thrived on a degree of recycling, and where riffing on other people’s riffs is, as Elvis Costello airily put it, “how rock and roll works”. But what happens when it’s the robots doing the riffing, and when it’s not three or four chords that are in contention but entire written paragraphs – pages and pages of words, even? We might soon be about to find out.
This week, the best-selling authors Mona Awad and Paul Tremblay filed a lawsuit in San Francisco against OpenAI. They claim its ChatGPT language model infringed their copyright because it was apparently trained using data from their books without their consent. As copyright spats go, it lacks the spectacle of that between Universal Studios and 20th Century Fox in 1977, in which the latter argued the former had modelled its film Battlestar Galactica a little too closely on the Fox megahit Star Wars. Yet the case, the first time ChatGPT has faced a copyright suit, has the potential to become a watershed moment in the rapidly accelerating battle between man and robots, not to mention blowing open the hitherto highly secretive world of AI training.
On the one side are the authors, whose livelihood relies on blood, sweat and their unique creative abilities. On the other, the faceless tech giants, who have trained a piece of software to effortlessly mine reams of written text in order to reduce the human imagination to a theoretically all-conquering algorithm. In the middle are the lawyers, faced with the Gordian task of how to regulate the wild west of an increasingly AI-led internet.
The publishing industry has locked horns with the digital sphere over ownership and author integrity before. In 2012, after years of dispute, a settlement was finally reached between Google and the Association of American Publishers over Google’s digital library, after the AAP accused Google of scanning millions of books without their authors’ permission. That case, however, was of mouse and Gruffalo-sized proportions compared to the David and Goliath-like fight facing the humble author today when it comes to protecting not just their content, but their craft.
Ebooks co-written by AI are springing up on Amazon’s Kindle Direct Publishing site like mushrooms in the night: since Amazon doesn’t require AI to be listed as an author, the true number is impossible to quantify. Tutorials on how to write a book using ChatGPT are all over video-sharing channels, with children’s books in particular in the cross hairs, potentially putting thousands of writers and illustrators out of a job. So far, most of these books are poor quality at best and earn their human collaborators barely a crumb, but the software is becoming more sophisticated by the day. Already, there have been several instances of non-fiction books published on KDP featuring material produced using AI – without acknowledgement. The potential for using this tech for indoctrination purposes, by totalitarian governments trying to spread misinformation, is terrifying.
The link between individual copyright and AI-generated content is hazy. As Aimee Felone, managing director of children’s publisher Knights Of, recently said to The Bookseller: “A lot of laws don’t exist for when it comes to who owns the copyright for these sorts of stories. Especially if AI is scouring the web for children’s books and using the work of other authors to help influence the work the machine is writing.” How, in other words, to police who or what generated what concept, what character, what beautifully composed metaphor?
What’s more, you could argue that writers themselves have always been complicit in both the mass production of their own product and the devaluing of authorship as a concept. Ghostwriting is a staple of the trade. The idea of books as content to be consumed rather than savoured has existed for much longer than the internet, be it the Penny Dreadfuls beloved by sensation fans in the 19th century or the pulp fiction craze of the 1950s. There is a long tradition of authorship in service to the brand: to pick an example at random, the Nancy Drew novels beloved by teenage girls were written by various writers under the pseudonym Carolyn Keene. What, after all, is a Mills and Boon novel but a coyly packaged algorithm?
Indeed, it was ever thus. In the early 1840s, while on tour in America, Charles Dickens found himself at the centre of the debate raging at the time over the lack of copyright protection for foreign writers in the United States, calling himself “the greatest loser alive by the present law”. His argument to those in Boston selling pirated versions of his books was that they [Americans] should concentrate instead on developing a literature of their own. The response, as he later quoted, was as follows: “We don’t want one. Why should we pay for one when we can get it for nothing. Our people don’t think of poetry, sir. Dollars, banks and cotton are our books.” Thankfully, there have always been enough people who do think of poetry to counter those who don’t. The difference today is that robots don’t think of poetry at all. The prospects are unimaginably bleak.