Can Google's The Knowledge Vault automate the process of determining the 'truth'?

The Knowledge Vault uses advanced natural language processing (NLP) techniques to gather material from the web and from Google's existing database of knowledge. It then generates a "knowledge-based trust score" for each source

Wednesday 04 March 2015 21:26 GMT

Google Research's Kevin Murphy explained the challenges of tackling contextual problems such as determining whether the word Obama might refer to Barack, Michelle (pictured) or the Japanese city (Getty Images)

Pictures of school dinners have been whizzing around the web recently. Neatly captioned with a country of origin, they each show a healthy, balanced meal featuring food typical of that country – a Caprese salad representing Italy, dolmades for Greek kids and so on. And then there's the picture from the US: a small pile of nondescript brown mush, sitting on a largely empty tray. This, explains the accompanying blurb, is the result of the "totalitarian" campaign by Michelle Obama to raise standards of school meals.

In fact, all but one of the pictures are from a carefully styled photo shoot to promote a salad bar chain. The American one, however, is taken from a sarcastic Twitter hashtag, #thanksmichelleobama, which has almost become a competition to photograph the most unappetising meal you could possibly concoct in a school dining room. The photoset has been debunked by the web's premier debunkers – snopes.com, emergent.info and Adrienne LaFrance's Gawker blog, Antiviral – but it continues to rack up views on websites that uncritically relay the information. Somehow, through sheer repetition, it becomes truth.

This week, New Scientist magazine pondered the question of whether a piece of Google research could automate the process of determining said "truth". The Knowledge Vault uses advanced natural language processing (NLP) techniques to gather material from the web and from Google's existing database of knowledge. It then generates a "knowledge-based trust score" for each source. In a recent talk, Google Research's Kevin Murphy explained the challenges of tackling contextual problems such as determining whether the word "Obama" might refer to Barack, Michelle or the Japanese city of the same name. Knowledge Vault has already established millions of "confident facts" that are estimated as being at least 90 per cent true – but this is more a triumph of NLP than of computer reasoning. Knowledge Vault is only as "truthy" as the information it sucks up, and one wonders what it would make of the widely disseminated school dinner story. The vision posited by New Scientist is of search results ranked according to their truthfulness rather than their popularity – but will we ever be able to trust automated systems to recognise bullshit?

After all, establishing the truth can require probing, investigative work. As Gawker's LaFrance says, "One reason that so much gets shared is people not taking the time to put out a call… I'd rather go much deeper." Facts, especially political ones, are slippery fellows; the truth is often more mundane than it's made out to be: a grey, morphing mass that bobs around in an even larger grey area. But as we become swamped with information, we're more interested in saying "wow" than investigating the facts behind the "wow".

This is demonstrated on social media every day; easily shareable images, knocked up by someone who's a bit bored, end up driving opinion. The creators of those virals become heroic figures; the debunkers killjoys. The race to be first, to be popular, to be sensational, often works in direct opposition to truth – and that's undoubtedly a tussle that will continue deep within Google's top-secret search algorithm.

It's nice to imagine a future online environment where our gaze is directed away from dubious facts, but in the meantime it's down to us to treat sensational stories with the scepticism they deserve. (Ha! As if!)

Twitter.com/rhodri
