The Internet Archive – the most capacious library ever made – is home to 835 billion web pages. A single backup of its library collection requires more than 145 petabytes of space. By comparison, the world’s largest physical library, the US Library of Congress, is home to about 175 million items, according to Guinness World Records.

And yet even as the world generates more data than ever, much of it is falling away. Some 38 per cent of web pages that existed in 2013 are no longer accessible, according to new research from the Pew Research Center, and even 8 per cent of those that existed in 2023 are now gone.

Increasingly, the web is made up of content that is being both produced and consumed by automated systems: a report last month from cyber security company Imperva said that almost exactly half of all internet traffic came from bots.