In 1991, Newton wrote to Bill Gates, outlining his Big Idea. He proposed that Bloomsbury and Microsoft collaborate upon the world's first digitally compiled dictionary. It would be the first dictionary to be written simultaneously for both print and electronic production. Most importantly, it would be the first dictionary to be compiled from a global, rather than a national or cultural, perspective, so that English-speakers across the world would be able to use it with equal ease. A week later, the vice-president of Microsoft was in the Bloomsbury offices to discuss terms, and Newton began recruiting lexicographers to compile the Encarta World English Dictionary.
A dictionary can never be a wholly transparent account of a language. If you can prescribe language then you can prescribe thought - or so the theory goes. That's why Big Brother loved his Newspeak, and why Gramsci believed that the standardisation of the Italian language was the path to revolutionary victory. And from the Concise Oxford Dictionary to The Klingon Dictionary, you can see all sorts of ideological tensions at work in every lexicon. The selection and definition of words, the rendering of their pronunciation - all these reflect the tastes and prejudices of the editors.
For instance, the word chugging will appear in the Third Edition of the Oxford English Dictionary, although few outside the initiation ceremonies of Exeter College rugby team will know that it refers to the practice of drinking a pint of lager as it courses down a fellow sportsman's bum-crack.
Less disgustingly, there was much debate at the department responsible for the OED several years ago about whether the word suit should be phoneticised as syoot or soot. Switch from the former to the latter, and they would only offend a bunch of Roedean old girls. Globalise issues like this, however, and they become diplomatic disasters waiting to happen. The editors of the Oxford Dictionary of South African English had to exercise terrific delicacy when assessing the political implications of their work. One of their central problems was how to phoneticise the non-verbal clicks of Zulu speech without patronising or alienating potential Zulu readers. If you've ever seen your own dialect rendered on-page by a non-native speaker - Dickens' attempt at the Lancashire accent in Hard Times, for instance - you'll know how galling this can be.
Even dictionaries of fictional languages can exhibit some of the same processes. Marc Okrand's lexicon of the Klingon tongue - spoken only by latex-smothered actors and loons at Star Trek conventions - not only informs the reader that "Regnulus wghargh" is Klingon for Regulan bloodworm, but that "English is an officer-class language". This is presumably an attempt to explain why most of these creatures bellow in stentorian, Masterpiece Theatre American. However, as the Klingons are explicitly intended to be allegorised versions of the Russians, the book also functions as a revealing post-Cold War document.
First problems first, however. What should one call a book full of words? Until the beginning of the 18th century, a number of terms competed for the privilege of attachment to these texts. Abecedarium - reflecting the satisfying order of the alphabet; alveary (literally, beehive) - to conjure the Babel buzz of language; ortus (garden) or sylva (wood) - to plant the idea of variety, plenty, and vigorous growth. Medulla - to suggest getting to the pith and marrow of English, or manipulus, to promise to hunker words up by the fistful. By 1800, however, lexicographers had settled upon one term: dictionary, from the Latin dictionarius, meaning a repertory of words.
Today we judge a dictionary on the breadth and weight of that repertory. The Encarta World English Dictionary is very proud of its 400,000 references, 2,200 pages and 3,000 illustrations. However, the first English dictionary - Robert Cawdrey's A Table Alphabeticall, Contayning and Teaching the True Writing and Understanding of Hard Usuall English Words, borrowed from Hebrew, Greeke, Latin, or French (1604) - was only 120 pages long, and listed barely 3,000 entries. Rather than being an exhaustive description of a language, it was a variation on the etiquette book, spelling out difficult words so that the upwardly mobile wouldn't blab their inadequacies through their correspondence. If you were a woman, or were too poor to have received a grammar school education, Cawdrey's lexicon was just the thing to conceal those deficiencies.
His imitators continued to chase the same semi-educated demographic. In 1623, Henry Cockeram's dictionary offered a table of "vulgar" words, listing elegant equivalents into which the patchily educated could transpose their yahooisms. John Kersey's 1702 work was the first to include common as well as unusual words in its pages, but it wasn't until Nathaniel Bailey's Dictionarium Britannicum (1730) that the form became a popular success. The reason for this, of course, is that he was the first to include the mucky words. After Samuel Johnson's Dictionary of the English Language (1755) hit the bookstands, however, few readers took up their Bailey again.
Johnson's Dictionary is an overtly political document, full of the partialities and hobbyhorses of its compiler. Excise is defined as "a hateful tax"; a lexicographer is "a harmless drudge"; oat is "a grain, which in England is generally given to horses, but in Scotland supports the people".
More broadly, Johnson's Dictionary was a means of reducing the influence on British society of those dreadful Frenchies. Londoners were sporting Gallic wigs and beauty spots, and trying to sound sophisticated by using bons mots picked up in Continental salons. Johnson - whose only trip abroad was a quick jaunt to Paris in 1775 - was considerably more Eurosceptic. "Our language, for almost a century, has, by the concurrence of many causes, been gradually departing from its original Teutonic character, and deviating towards a Gallic structure and phraseology, from which it ought to be our endeavour to recall it."
As modern business activities have boosted the expansion of World English, so the workings of Enlightenment capitalism allowed European languages to make incursions into foreign territory. "Commerce," pronounced Johnson, "however necessary, however lucrative, as it depraves the manners, corrupts the language; they that have frequent intercourse with strangers, to whom they endeavour to accommodate themselves, must in time learn a mingled dialect."
In effect, Johnson was arguing for a form of RP (received pronunciation). He made official the language of his own class and lost little sleep over his failure to record the language of the people who steamed his puddings or lit his fire. "That many terms of art and manufacture are omitted must be frankly acknowledged," he wrote in the Preface to the Dictionary, "but for this defect I may boldly allege that it was unavoidable: I could not visit caverns to learn the miner's language ... nor visit the warehouses of merchants and shops of artificers to gain the names of wares, tools and operations not found in books."
I, for one, don't buy this excuse. Johnson went to the trouble of travelling to the Hebrides to write A Journey to the Western Islands of Scotland (1775). He didn't enjoy the trip very much, but at least he didn't write the book from his desk in London. There was nothing to stop him taking a coach to the Forest of Dean, and getting a bit of coal dust on his wig in order to collect a few nuggets of the miners' subterranean language. Even less to stop him wandering over to Bankside and passing half an hour with the warehousemen and dockers. Did he decide to mislay the language of the poor because he was once one of them? Or because he angsted all his life about Pembroke College's long refusal to award him an Oxford degree?
If Johnson had been a bit more flexible, he might have bequeathed us a lexicon of the lost languages of Enlightenment England: the forest dialect of charcoal-burners, the street-slang of sneak-thieves and cutpurses, the bitchy Polari of molly-house drag queens, the back-kitchen patois of African page boys. Instead, they were condemned to silence.
However, Johnson's one-man police force had no chance of baton-charging popular non-native words from the vocabulary. English has always been an inclusive language: 80 per cent of its words - from chocolate to banana, wigwam, outback, gorilla and tea - are of foreign origin. Unlike French - which has the Académie Française on gendarme duty - it has no xenophobic door policy. Most of the expressions that Johnson so despised have remained de rigueur. Today, British schoolchildren have absorbed swathes of Australian slang, imported into currency via Neighbours and Home and Away.
Bloomsbury's Nigel Newton observes this process every day. "We're watching American TV shows and Hollywood movies. American is the language of Internet and software. Indian movies are growing in popularity. I'm half English and half American. We're very aware of kinds of English like black American street language. My kids use that, without really even knowing why. It's permeated their sensibility."
Are other vocabularies under threat from the world's new favourite language? Encarta's editor Dr Kathy Rooney contends that the growth of World English need not silence other tongues: "Having one language that is spoken in various countries, that is an international medium of communication, should in many ways be a comfort to other languages. English is not trying to force other languages out. If you look at India or South Africa, the multiplicity of languages is preserved by having an international alternative language."
Should we remember what happened to Irish and Welsh, and treat this with scepticism? Or should we learn to stop worrying and love World English?
Oddly enough, Dr Strangelove provides an insight into the current success of World English. After the end of the Cold War, the machinery by which English was promulgated for propaganda reasons began to run down. The Voice of America mellowed, losing its lunatic edge. In turn, the forbidding radio stations of the East faded into silence. The excitingly stern announcer of Radio Tirana stopped telling the world in boastful phonetic English that the number of Albanians in higher education had risen to 1 per cent. And at the moment this battle ended, the new democracies of Eastern Europe embraced the English language with a voracious, goggle-eyed enthusiasm.
As a beneficiary of this process, this is an issue upon which I find it difficult to take a detached stance. In 1990, without a single reference or teaching qualification, I managed to get a summer's employment in a language school in Poznan, in north-west Poland. My pupils were the children of new capitalists, kids whose parents sent them to be schooled in the only commercial language that mattered. In one class I taught the son of Poland's largest lingerie manufacturer, placed there in the hope that he would, one day, be able to broker those girdle deals in perfect RP. In another, there was the son of a crisp magnate, whose completely inedible product - like hyperinflated Cheesy Wotsits without the cheese flavouring - would only gain a foothold in foreign markets if young Boleslaw (or whoever) could be trained to produce nibble hard-talk in the language of Shakespeare and Milton. This ethos was everywhere.
As capitalism spread through Eastern Europe, World English continued to grow, boosted by the Anglophone nature of the Internet. Business conducted in the English language began to produce $7,815bn annually. The number of people learning English topped a billion. So for Rupert Murdoch, Bill Gates and me, the good times rolled. When I was doing postgraduate study at university, I supplemented my British Academy funding with two World English-related jobs. The first involved teaching English to the children of Russian mafiosi in a wood-panelled crammer college. The second found me compiling articles for the Encarta CD-Rom Encyclopaedia. I did Elizabeth Barrett Browning, Wilkie Collins, Arthur Conan Doyle, WB Yeats, Dylan Thomas, DH Lawrence, HG Wells, and a handful of others. Without World English, Microsoft and the odd Muscovite Mr Big, I wouldn't have got through college.
According to the Encarta World English Dictionary, World English is "the English language in all its varieties as it is spoken and written over the world". The nearest comparison comes from the Ancient World. During the period of the Roman Empire, Latin was the language of administration, government, literature and scholarship in Europe, Asia and Africa. Its proliferation was facilitated by the technology of the day: the stylus, the codex, the dead-straight roads which allowed the Romans to transport documents as well as armies.
Now it's happening all over again, but on a scale that Julius Caesar could never have envisaged. Today, over 750 million people speak English as a native or second language. Some 85 per cent of the world's Internet webpages are in English. English is the most common language of intercontinental telephone communication, book production and broadcasting. It is the language of academia, of the military, and of air-traffic control. And it has not yet stopped growing. By 2050, over 90 per cent of the world's population may be speaking the same language as you.
EWED has been constructed with this process - and market - in mind. Its contents have been produced from a vast corpus of 50 million World English words - contributed by the Englishes of, to name a few, Liberia, Trinidad, Canada, Malaysia and Ireland - and assembled by 320 lexicographers all over the globe.
The dictionary is strong on shiny new arrivals: morning-after pill, retro, alcopop and Millennium Dome are all prominent on its pages. Net and computing terms are forcefully represented, too - and perhaps over-represented, in the same way that Empire Hobson-Jobsonisms are rather too numerous in the Oxford English Dictionary, compiled at the zenith of the Raj. Internet Service Provider, DOS, millennium bug and Y2K have their own entries, as do computer viruses like Worm and Trojan Horse. Emoticons - those bursts of e-mail shorthand such as :-) for happy, ;-) for wink etc - are also listed.
Other inclusions and omissions seem rather arbitrary. You'll find Wimpy ("a trademark for a type of hamburger"), but not McDonald's, surely one of the most widely recognised trade names on the planet. Queercore ("a gay youth movement that rejects the stereotype of the gay person as persecuted victim") is present, but not loungecore. Dalek is there, but not Tardis - despite its ubiquitous use in estate agents' literature as a way of suggesting that a flat isn't quite as poky as it looks. Tarzan is present (but without an explanation of his fictional origins). Peter Pan gets an entry, but the single epithet Sherlock stands in for Sherlock Holmes. Have you ever heard anyone say, "He's a real Sherlock"? Maybe in Sierra Leone. As for Moriarty (often used, I'd contend, to mean an arch-nemesis), he's obviously tumbled over some lexicographical equivalent of the Reichenbach Falls.
As the decisions of EWED's editor are final, all these quirks are acceptable. The biographical entries, however, reveal some serious omissions. Margaret Drabble is there, but not Beryl Bainbridge or Fay Weldon. Tom Stoppard and Alan Ayckbourn get an entry, but not Harold Pinter.
More telling, perhaps, is the way that the two main financial beneficiaries of the spread of World English are portrayed. Rupert Murdoch cuts a slightly sinister figure, a "media proprietor [who] extended his family newspaper empire to control a global network of media organisations". Bill Gates' entry, by contrast, describes him as an "entrepreneur [who] co-founded Microsoft Corporation to develop the DOS operating system, and developed a major international telecommunications corporation". Above these words there's a photograph of the man, his hands fanned out, as if in explanation of what a decent guy he is.
But is Microsoft, I wonder, any less imperial than News International? Gates has, after all, fought a court battle over his alleged attempts to monopolise access to the Internet. Is he attempting the same trick with the English language? The proof copy of EWED - which sits proudly on a shelf in Nigel Newton's office - has the words "The New Global Authority" flashed across the dust jacket. The equally tomorrow-belongs-to-me-ish "One World - One Dictionary" remains on the back cover.
To say that Newton and Rooney are prepared for the question might require EWED's definition of understatement to be redrafted. "We have no doubt that this will be a popular theme among some of your fellow tribe-members," reflects Newton. "But if you look into this story, Bill Gates is bulletproof on that point. Kathy and I created this dictionary. After doing so, we offered the electronic rights to Microsoft. At the same time we were offering similar rights to other people. Nobody is going to accuse, say, our American print partners St Martin's Press of trying to take over the English language. It simply isn't true. But it's a headline that sounds good, so I expect we'll be reading a lot of it in the next fortnight. Satisfied?"
For one measure of the quality of a dictionary, look to its definitions of taboo words. Not just the mucky ones - which, if dictionaries could speak, would elicit a groan of boredom at each consultation ("Not arse again! Try alembic or seniti!"). No, I mean the really unpleasant words which all dictionaries must include, but with which none can feel comfortable.
Here's a cautionary tale from recent lexicographical history. In October 1997, the Merriam-Webster Collegiate Dictionary was denounced by the NAACP (National Association for the Advancement of Colored People) for including this definition of the word nigger. "Nigger 1: a black person usu. taken to be offensive 2: a member of any dark-skinned race usu. taken to be offensive 3: a member of a socially disadvantaged class of persons."
Kweisi Mfume, director of the NAACP, launched a campaign against the book: "The NAACP finds it objectionable that Merriam-Webster would use black people as a definition for a racist term. A 'nigger' is not a black person or a member of a dark-skinned race as defined by Merriam-Webster. It is not a definition of a person's race, but a derogatory word."
Rather than conceding insensitivity, Merriam-Webster chose to defend their decision: "We have tried to make it clear that the use of this word as a racial slur is abhorrent, but it is nonetheless part of the language and, as such, it is our duty to report on it." Instead of amplifying the word's offensive status in its definition, they added a mealy-mouthed usage note which pointed out that "although used by writers like Twain, Conrad and Dickens, 'nigger' is perhaps the most offensive and inflammatory racial slur in English." Many American colleges have yet to lift their boycott on the book.
A global project like EWED has been forced to traverse many of these territories. Each word in it has been subjected to a "cultural edit", to ensure the definition will function without offence in all territories. "We want somebody in Milwaukee to recognise it as their language, and those in Manchester, Melbourne or Mumbai to think the same," says Kathy Rooney, with the alliterative faculty of a heavyweight lexicographer.
EWED's definition of the "n-word", for instance, contains stronger warnings of its potential offensiveness than any comparable dictionary. "A highly offensive taboo term," it reads. Below, a text panel has this to add: "Racism trap: This term is arguably the single most offensive racist slur in the English language. The fact that African-Americans and some people of colour sometimes use this word in reference to themselves does not excuse its use by members of other ethnic groups."
Nigel Newton writes in his Foreword to the dictionary, "English has become the preferred language of communication in the same way that so many propositions that have been around for a long time suddenly achieve widespread acceptance, in the same way as the idea that the Earth orbits the Sun, rather than the Sun orbiting the Earth, gained currency during the late 17th century."
Sure enough, a few pages later there's a diagram representing the extent of the language in a heliocentric pattern. At its central point, there's a disc bearing the words "WORLD ENGLISH" in authoritative capitals. Orbiting the centre like a ring of planets are its main forms: American English, Canadian English, Caribbean English, African English, South Asian English, East Asian English, Australian and New Zealand English, British and Irish English. Beyond these lies the asteroid belt: Maori English, Jamaican, Patwa, Trinidadian, Bajan, Inuit English, Sri Lankan English, Hawaiian English, Scots, Ebonics. A linguistic cosmology, with Bloomsbury's new lexicon at its burning heart.
Newton's Foreword also contains an optimistic projection of the relationship between World English and world diplomacy. "The e-mails from ordinary citizens in Belgrade that currently appear on the CNN nightly news may not have ended the war in Yugoslavia, but they certainly contribute to an understanding of the perspective of both sides in a way which could not have happened in any previous world conflict. These e-mails are written in English." EWED, it should be noted, calls the disputed territory Kosovo - the Serb name for the region - rather than Kosova, its Albanian title.
When Nato first began to bomb Belgrade, I began an e-mail correspondence with Milos, a young Serbian computer programmer who was a contributor to an Internet discussion group on the crisis. It started out as an attempt to build bridges. Three weeks into the war, however, it had become a form of textual cluster-bombing.
Milos did his best to argue rhetorical points in a language he hadn't really mastered. He suggested that CNN news footage of the refugee exodus had been doctored. "You see a group of Albanians is crossing the border," he wrote. "But you can also see snow in the report ... And on another report they are almost hot. There is no snow at all. Funny!"
I tried to pursue arguments that seemed pretty reasonable to me, but which he only found more and more galling. I was talking nonsense, he argued. The Yugoslav army were attempting to suppress a nascent gangster state: "They want to be indipendent beacuse Alabania is not country with roads and railways. Our roads and railways can help their mafia. from kosovo they can make bussnies to all countries, better than from Albania. The terrorists here don't have enimies, and in Albania they have many enimies. So their buissnis will be safer in Kosovo ... "
Now the war is over, I've tried contacting him again, but he doesn't reply. Power cuts, I hope. Or maybe he wants to stick to the Serb language, to shore up his bombed-out self-esteem. It's good to talk, I suppose, but my e-mails to Milos seem to suggest that although World English might have the power to bring us together, it also has the potential to create a world divided by a common language.
The 'Encarta World English Dictionary' is published in print form by Bloomsbury on 4 August (£30). The CD-Rom is published by Microsoft in mid-September.