How Google Translate works

The web giant's translation service might serve up the odd batch of nonsense, but it's still one of the smartest communication tools of all time, as David Bellos explains

Using software originally developed in the 1980s by researchers at IBM, Google has created an automatic translation tool that is unlike all others. It is not based on the intellectual presuppositions of early machine translation efforts – it isn't an algorithm designed only to extract the meaning of an expression from its syntax and vocabulary.

In fact, at bottom, it doesn't deal with meaning at all. Instead of taking a linguistic expression as something that requires decoding, Google Translate (GT) takes it as something that has probably been said before.

It uses vast computing power to scour the internet in the blink of an eye, looking for the expression in some text that exists alongside its paired translation.

The corpus it can scan includes all the paper put out since 1957 by the EU in two dozen languages, everything the UN and its agencies have ever done in writing in six official languages, and huge amounts of other material, from the records of international tribunals to company reports and all the articles and books in bilingual form that have been put up on the web by individuals, libraries, booksellers, authors and academic departments.

Drawing on the already established patterns of matches between these millions of paired documents, Google Translate uses statistical methods to pick out the most probable acceptable version of what's been submitted to it.
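To make the idea concrete, here is a minimal sketch of choosing "the most probable acceptable version" from paired texts. It is a toy illustration, not Google's actual system: the tiny corpus, the names `PAIRED_CORPUS`, `build_phrase_table` and `most_probable_translation`, and the use of raw relative frequency as the probability are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (source phrase, human translation) pairs,
# standing in for the millions of paired documents described above.
PAIRED_CORPUS = [
    ("good morning", "bonjour"),
    ("good morning", "bonjour"),
    ("good morning", "bon matin"),
    ("thank you", "merci"),
]


def build_phrase_table(pairs):
    """Count how often each source phrase was paired with each translation."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_probable_translation(table, phrase):
    """Return (translation, relative frequency), or None if never seen."""
    candidates = table.get(phrase)
    if not candidates:
        return None  # the phrase has not "been said before" in this corpus
    target, count = candidates.most_common(1)[0]
    return target, count / sum(candidates.values())


table = build_phrase_table(PAIRED_CORPUS)
print(most_probable_translation(table, "good morning"))
print(most_probable_translation(table, "good evening"))
```

The second lookup returns nothing at all, which is the point of the paragraphs that follow: with no prior human translation to draw on, a system of this kind has nothing to offer.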

Much of the time, it works. It's quite stunning. And it is largely responsible for the new mood of optimism about the prospects for "fully automated high-quality machine translation".

Google Translate could not work without a very large pre-existing corpus of translations. It is built upon the millions of hours of labour of human translators who produced the texts that GT scours.

Google's own promotional video doesn't dwell on this at all. At present GT offers two-way translation between 58 languages, that is, 3,306 separate translation services, more than have ever existed in all human history to date.
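The figure of 3,306 follows from simple counting: each of the 58 languages can be translated into any of the other 57, and each direction counts separately. A short check, with the function name `translation_directions` chosen purely for illustration:

```python
from itertools import permutations


def translation_directions(n_languages):
    """Number of ordered (source, target) pairs among n distinct languages."""
    return n_languages * (n_languages - 1)


# Cross-check the formula by enumerating every ordered pair explicitly.
enumerated = sum(1 for _ in permutations(range(58), 2))

print(translation_directions(58), enumerated)
```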

Most of these translation relations – Icelandic to Farsi, Yiddish to Vietnamese, and dozens more – are the newborn offspring of Google Translate: there is no history of translation between them, and therefore no paired texts, on the web or anywhere else. Google's presentation of its service points out that given the huge variations between languages in the amount of material its program can scan to find solutions, translation quality varies according to the language pair involved.

What it does not highlight is that GT is as much the prisoner of global flows in translation as we all are. Its admirably smart probabilistic computational system can only offer 3,306 translation directions by using the same device as has always assisted intercultural communication: pivots, or intermediary languages.

It's not because Google is based in California that English is the main pivot. If you use statistical methods to compute the most likely match between languages that have never been matched directly before, you must use the pivot that can provide matches with both target and source.
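The pivot principle can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions, not a description of GT's internals: the placeholder strings `is_1`, `fa_1` and so on stand in for real Icelandic and Farsi sentences, and the function name `pivot_table` is hypothetical. Two languages that share no paired texts are matched through the English sentences that both have been paired with.

```python
from collections import Counter, defaultdict

# Hypothetical paired texts. "is_*" and "fa_*" are placeholders standing in
# for Icelandic and Farsi sentences; only the English pivot is spelled out.
ICELANDIC_ENGLISH = [
    ("is_1", "the lawyer opened the file"),
    ("is_2", "the jury left the room"),
]
ENGLISH_FARSI = [
    ("the lawyer opened the file", "fa_1"),
    ("the lawyer opened the file", "fa_1"),
    ("the lawyer opened the file", "fa_1b"),
    ("the jury left the room", "fa_2"),
]


def pivot_table(source_to_pivot, pivot_to_target):
    """Match source and target sentences that share a pivot sentence."""
    by_pivot = defaultdict(list)
    for pivot, target in pivot_to_target:
        by_pivot[pivot].append(target)

    table = defaultdict(Counter)
    for source, pivot in source_to_pivot:
        # Every target paired with the same pivot sentence is a candidate.
        for target in by_pivot.get(pivot, []):
            table[source][target] += 1
    return table


icelandic_farsi = pivot_table(ICELANDIC_ENGLISH, ENGLISH_FARSI)
print(icelandic_farsi["is_1"].most_common(1))
```

The more often an English sentence has been translated into both languages, the more candidates accumulate, which is why widely translated English books weigh so heavily in pairs like this one.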

A good number of English-language detective novels, for example, have probably been translated into both Icelandic and Farsi. They thus provide ample material for finding matches between sentences in the two foreign languages; whereas Persian classics translated into Icelandic are surely far fewer, even including those works that have themselves made the journey by way of a pivot such as French or German. This means that John Grisham makes a bigger contribution to the quality of GT's Icelandic-Farsi translation device than Rumi or Halldór Laxness ever will. And the real wizardry of Harry Potter may well lie in his hidden power to support translation from Hebrew into Chinese.

GT-generated translations themselves go up on the web and become part of the corpus that GT scans, producing a feedback loop that reinforces the probability that the original GT translation was acceptable. But it also feeds on human translators, since it always asks users to suggest a better translation than the one it provides – a loop pulling in the opposite direction, towards greater refinement.

It's an extraordinarily clever device. I've used it myself to check I had understood a Swedish sentence more or less correctly, for example, and it is used automatically as a webpage translator whenever you use a search engine.

Of course, it may also produce nonsense. However, the kind of nonsense a translation machine produces is usually less dangerous than human-sourced bloopers. You can usually see instantly when GT has failed to get it right, because the output makes no sense, and so you disregard it. (This is why you should never use GT to translate into a language you do not know very well. Use it only to translate into a language in which you are sure you can recognise nonsense.)

Human translators, on the other hand, produce characteristically fluent and meaningful output, and you really can't tell if they are wrong unless you also understand the source – in which case you don't need the translation at all.

If you remain attached to the idea that a language really does consist of words and rules and that meaning has a computable relationship to them (a fantasy that many philosophers still cling to), then GT is not a translation device. It's just a trick performed by an electronic bulldozer allowed to steal other people's work. But if you have a more open mind, GT suggests something else.

Conference interpreters can often guess ahead of what a speaker is saying because speakers at international conferences repeatedly use the same formulaic expressions. Similarly, an experienced translator working in a familiar domain knows without thinking that certain chunks of text have standard translations that he or she can slot in.

Translators don't reinvent hot water every day. They behave more like GT – scanning their own memories in double-quick time for the most probable solution to the issue at hand. GT's basic mode of operation is much more like professional translation than is the slow descent into the "great basement" of pure meaning that early mechanical translation developers imagined.

GT is also a splendidly cheeky response to one of the great myths of modern language studies. It was claimed, and for decades it was barely disputed, that what was so special about a natural language was that its underlying structure allowed an infinite number of different sentences to be generated by a finite set of words and rules.

A few wits pointed out that this was no different from a British motor car plant, capable of producing an infinite number of vehicles each one of which had something different wrong with it – but the objection didn't make much impact outside Oxford.

GT deals with translation on the basis not that every sentence is different, but that anything submitted to it has probably been said before. Whatever a language may be in principle, in practice it is used most commonly to say the same things over and over again. There is a good reason for that. In the great basement that is the foundation of all human activities, including language behaviour, we find not anything as abstract as "pure meaning", but common human needs and desires.

All languages serve those same needs, and serve them equally well. If we do say the same things over and over again, it is because we encounter the same needs, feel the same fears, desires and sensations at every turn. The skills of translators and the basic design of GT are, in their different ways, parallel reflections of our common humanity.

This is an extract from 'Is That A Fish In Your Ear: Translation and the Meaning of Everything' by David Bellos published by Particular (£20). To order a copy for the special price of £16.50 (free P&P), call Independent Books Direct on 08430 600 030, or visit independentbooksdirect.co.uk
