How Google Translate works

The web giant's translation service might serve up the odd batch of nonsense, but it's still one of the smartest communication tools of all time, as David Bellos explains

Using software originally developed in the 1980s by researchers at IBM, Google has created an automatic translation tool that is unlike all others. It is not based on the intellectual presuppositions of early machine translation efforts – it isn't an algorithm designed only to extract the meaning of an expression from its syntax and vocabulary.

In fact, at bottom, it doesn't deal with meaning at all. Instead of taking a linguistic expression as something that requires decoding, Google Translate (GT) takes it as something that has probably been said before.

It uses vast computing power to scour the internet in the blink of an eye, looking for the expression in some text that exists alongside its paired translation.

The corpus it can scan includes all the paper put out since 1957 by the EU in two dozen languages, everything the UN and its agencies have ever done in writing in six official languages, and huge amounts of other material, from the records of international tribunals to company reports and all the articles and books in bilingual form that have been put up on the web by individuals, libraries, booksellers, authors and academic departments.

Drawing on the already established patterns of matches between these millions of paired documents, Google Translate uses statistical methods to pick out the most probable acceptable version of what's been submitted to it.
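
To make the idea concrete, here is a minimal sketch of picking the most probable version from previously seen pairs. It is a toy illustration, not Google's actual system: the tiny corpus and the names `PAIRED_TEXTS`, `build_table` and `most_probable` are all hypothetical, and real statistical translation works on far larger data with far more elaborate models.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each entry pairs a source expression with a
# translation that some human translator once produced for it.
PAIRED_TEXTS = [
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("bonjour", "hello"),
    ("merci", "thank you"),
    ("merci", "thanks"),
    ("merci", "thank you"),
]


def build_table(pairs):
    """Count how often each source expression was paired with each translation."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_probable(table, expression):
    """Return (translation, relative frequency) for the commonest pairing,
    or None if the expression has never been seen."""
    candidates = table.get(expression)
    if not candidates:
        return None
    target, count = candidates.most_common(1)[0]
    return target, count / sum(candidates.values())


TABLE = build_table(PAIRED_TEXTS)
```

Here `most_probable(TABLE, "bonjour")` picks "hello", which accounts for two of the three recorded pairings, while an expression with no paired text at all yields `None`: the method has nothing to say about what has never been translated before.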

Much of the time, it works. It's quite stunning. And it is largely responsible for the new mood of optimism about the prospects for "fully automated high-quality machine translation".

Google Translate could not work without a very large pre-existing corpus of translations. It is built upon the millions of hours of labour of human translators who produced the texts that GT scours.

Google's own promotional video doesn't dwell on this at all. At present GT offers two-way translation between 58 languages, that is, 3,306 separate translation services, more than have ever existed in all human history to date.
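
The figure of 3,306 follows from simple counting: each of the 58 languages can be translated into any of the other 57, and each direction counts as a separate service. A short check, with the function name `translation_directions` chosen purely for illustration:

```python
def translation_directions(language_count):
    """Number of ordered (source, target) pairs among distinct languages."""
    return language_count * (language_count - 1)


DIRECTIONS_FOR_58 = translation_directions(58)  # 58 * 57
```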

Most of these translation relations – Icelandic to Farsi, Yiddish to Vietnamese, and dozens more – are the newborn offspring of Google Translate: there is no history of translation between them, and therefore no paired texts, on the web or anywhere else. Google's presentation of its service points out that given the huge variations between languages in the amount of material its program can scan to find solutions, translation quality varies according to the language pair involved.

What it does not highlight is that GT is as much the prisoner of global flows in translation as we all are. Its admirably smart probabilistic computational system can only offer 3,306 translation directions by using the same device as has always assisted intercultural communication: pivots, or intermediary languages.

It's not because Google is based in California that English is the main pivot. If you use statistical methods to compute the most likely match between languages that have never been matched directly before, you must use the pivot that can provide matches with both target and source.
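
A rough sketch of how a pivot can bridge two languages that have never been matched directly is given below. It assumes, purely for illustration, that we already hold two probability tables, one from the source language into English and one from English into the target; the tables, the numbers in them and the function `pivot_translate` are hypothetical and do not describe Google's implementation.

```python
from collections import defaultdict

# Hypothetical probabilities, invented for illustration only.
ICELANDIC_TO_ENGLISH = {
    "takk": {"thanks": 0.7, "thank you": 0.3},
}
ENGLISH_TO_FARSI = {
    "thanks": {"mersi": 0.6, "mamnun": 0.4},
    "thank you": {"mamnun": 0.8, "mersi": 0.2},
}


def pivot_translate(expression, source_to_pivot, pivot_to_target):
    """Score each target phrase by summing, over every pivot phrase,
    P(pivot | source) * P(target | pivot), then return the best one."""
    scores = defaultdict(float)
    for pivot_phrase, p_pivot in source_to_pivot.get(expression, {}).items():
        for target_phrase, p_target in pivot_to_target.get(pivot_phrase, {}).items():
            scores[target_phrase] += p_pivot * p_target
    if not scores:
        return None  # no chain of matches through the pivot exists
    best = max(scores, key=scores.get)
    return best, scores[best]


BEST_MATCH = pivot_translate("takk", ICELANDIC_TO_ENGLISH, ENGLISH_TO_FARSI)
```

With these invented numbers the winner is "mamnun" at roughly 0.52, even though the single most likely English rendering ("thanks") on its own favours "mersi". The quality of the result depends entirely on how much paired material exists on each side of the pivot.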

A good number of English-language detective novels, for example, have probably been translated into both Icelandic and Farsi. They thus provide ample material for finding matches between sentences in the two foreign languages; whereas Persian classics translated into Icelandic are surely far fewer, even including those works that have themselves made the journey by way of a pivot such as French or German. This means that John Grisham makes a bigger contribution to the quality of GT's Icelandic-Farsi translation device than Rumi or Halldór Laxness ever will. And the real wizardry of Harry Potter may well lie in his hidden power to support translation from Hebrew into Chinese.

GT-generated translations themselves go up on the web and become part of the corpus that GT scans, producing a feedback loop that reinforces the probability that the original GT translation was acceptable. But it also feeds on human translators, since it always asks users to suggest a better translation than the one it provides – a loop pulling in the opposite direction, towards greater refinement.

It's an extraordinarily clever device. I've used it myself to check I had understood a Swedish sentence more or less correctly, for example, and it is used automatically as a webpage translator whenever you use a search engine.

Of course, it may also produce nonsense. However, the kind of nonsense a translation machine produces is usually less dangerous than human-sourced bloopers. You can usually see instantly when GT has failed to get it right, because the output makes no sense, and so you disregard it. (This is why you should never use GT to translate into a language you do not know very well. Use it only to translate into a language in which you are sure you can recognise nonsense.)

Human translators, on the other hand, produce characteristically fluent and meaningful output, and you really can't tell if they are wrong unless you also understand the source – in which case you don't need the translation at all.

If you remain attached to the idea that a language really does consist of words and rules and that meaning has a computable relationship to them (a fantasy that many philosophers still cling to), then GT is not a translation device. It's just a trick performed by an electronic bulldozer allowed to steal other people's work. But if you have a more open mind, GT suggests something else.

Conference interpreters can often guess ahead of what a speaker is saying because speakers at international conferences repeatedly use the same formulaic expressions. Similarly, an experienced translator working in a familiar domain knows without thinking that certain chunks of text have standard translations that he or she can slot in.

Translators don't reinvent hot water every day. They behave more like GT – scanning their own memories in double-quick time for the most probable solution to the issue at hand. GT's basic mode of operation is much more like professional translation than is the slow descent into the "great basement" of pure meaning that early mechanical translation developers imagined.

GT is also a splendidly cheeky response to one of the great myths of modern language studies. It was claimed, and for decades it was barely disputed, that what was so special about a natural language was that its underlying structure allowed an infinite number of different sentences to be generated by a finite set of words and rules.

A few wits pointed out that this was no different from a British motor car plant, capable of producing an infinite number of vehicles each one of which had something different wrong with it – but the objection didn't make much impact outside Oxford.

GT deals with translation on the basis not that every sentence is different, but that anything submitted to it has probably been said before. Whatever a language may be in principle, in practice it is used most commonly to say the same things over and over again. There is a good reason for that. In the great basement that is the foundation of all human activities, including language behaviour, we find not anything as abstract as "pure meaning", but common human needs and desires.

All languages serve those same needs, and serve them equally well. If we do say the same things over and over again, it is because we encounter the same needs, feel the same fears, desires and sensations at every turn. The skills of translators and the basic design of GT are, in their different ways, parallel reflections of our common humanity.

This is an extract from 'Is That A Fish In Your Ear: Translation and the Meaning of Everything' by David Bellos, published by Particular (£20). To order a copy for the special price of £16.50 (free P&P), call Independent Books Direct on 08430 600 030, or visit independentbooksdirect.co.uk
