

Pictures of school dinners have been whizzing around the web recently. Neatly captioned with a country of origin, they each show a healthy, balanced meal featuring food typical of that country – a Caprese salad representing Italy, dolmades for Greek kids and so on. And then there's the picture from the US: a small pile of nondescript brown mush, sitting on a largely empty tray. This, explains the accompanying blurb, is the result of the "totalitarian" campaign by Michelle Obama to raise standards of school meals.

In fact, all but one of the pictures are from a carefully styled photo shoot to promote a salad bar chain. The American one, however, is taken from a sarcastic Twitter hashtag, #thanksmichelleobama, which has almost become a competition to photograph the most unappetising meal you could possibly concoct in a school dining room. The photoset has been debunked by the web's premier debunkers – snopes.com, emergent.info and Adrienne LaFrance's Gawker blog, Antiviral – but it continues to rack up views on websites that uncritically relay the information. Somehow, through sheer repetition, it becomes truth.

This week, New Scientist magazine pondered whether a piece of Google research could automate the process of determining said "truth". The Knowledge Vault uses advanced natural language processing (NLP) techniques to gather material from the web and from Google's existing database of knowledge. It then generates a "knowledge-based trust score" for each source. In a recent talk, Google Research's Kevin Murphy explained the challenges of tackling contextual problems such as determining whether the word "Obama" might refer to Barack, Michelle or the Japanese city of the same name. Knowledge Vault has already established millions of "confident facts", each estimated to have at least a 90 per cent chance of being true – but this is more a triumph of NLP than of computer reasoning. Knowledge Vault is only as "truthy" as the information it sucks up, and one wonders what it would make of the widely disseminated school dinner story. The vision posited by New Scientist is of search results ranked according to their truthfulness rather than their popularity – but will we ever be able to trust automated systems to recognise bullshit?

After all, establishing the truth can require probing, investigative work. As Gawker's LaFrance says, "One reason that so much gets shared is people not taking the time to put out a call… I'd rather go much deeper." Facts, especially political ones, are slippery fellows; the truth is often more mundane than it's made out to be: a grey, morphing mass that bobs around in an even larger grey area. But as we become swamped with information, we're more interested in saying "wow" than investigating the facts behind the "wow".

This is demonstrated on social media every day; easily shareable images, knocked up by someone who's a bit bored, end up driving opinion. The creators of those virals become heroic figures; the debunkers killjoys. The race to be first, to be popular, to be sensational, often works in direct opposition to truth – and that's undoubtedly a tussle that will continue deep within Google's top-secret search algorithm.

It's nice to imagine a future online environment where our gaze is directed away from dubious facts, but in the meantime it's down to us to treat sensational stories with the scepticism they deserve. (Ha! As if!)

Twitter.com/rhodri
