Network: It's, um, English, just like we really speak it: Using an immense data base, lexicographers have taken raw language and produced a revolutionary new dictionary. Robert Nurden reports

The unexpurgated gossip of dinner ladies from Hackney, the negotiations of a company director from Newcastle, and the cries of croquet enthusiasts from Bromley have all helped to create a revolutionary dictionary.

Their spontaneous chit-chat is part of the Spoken Corpus project, devised by Longman, the publishing company. The project involved creating a data base of 10 million words taken directly from everyday situations - the largest ever compiled in English.

About 150 volunteers agreed to wear a tape recorder for up to two weeks so that all their conversations could be recorded. The result: tapes which, if joined up, would be 34 times the height of Mount Everest.

The tapes were transcribed to produce on disk the world's largest data base of spoken English. Lexicographers then turned this English in the raw - much of it extremely rich - into material for dictionaries.

The Spoken Corpus's first offspring is the Longman Language Activator, described by a leading grammarian, Professor Sir Randolph Quirk, as 'the book the world's been waiting for'. The Activator is a dictionary for advanced students of English that not only gives a word's definition - as monolingual dictionaries do - but also points the reader to related words or phrases as they are actually used. It offers a far wider range of such phrases than a standard thesaurus. For instance, the word 'lucky' leads to 'fall on your feet', 'not know you're born' and even 'keep your fingers crossed'.

Technology has not merely enabled lexicographers to work more efficiently and accurately; it also helps dictionary compilers to trace the ebb and flow of new expressions in the language - which phrases are taking root and which are disappearing. Previously, they had to guess.

Search techniques also enable lexicographers to test frequency of usage. The data base reveals the words that are favourites in speech but infrequent on the page. The word 'really', for instance, is used five times more often in speech than in writing. The search also shows that women speak in different ways from men, and reveals important details about regional and class speech patterns.

Electronic corpora - the word corpus refers to the fact that this is a collection of words - enable specialists to home in on categories of language that interest them: social science, legal terminology, physics, geography, poetic usage and so on.

Linguists have long known about the importance of phatic communion - the noises and pauses that we use to express doubt, joy, fear or aggression, to play for time or to be just plain pig-headed. Um and ah, and even suckings of teeth or intakes of breath, are highly subtle vocal devices that can now be analysed in depth.

The Spoken Corpus is part of the British National Corpus, a collaborative venture between universities and educational bodies that has produced more than 100 million words, 90 million of them written. The Corpus has already changed the way textbooks for foreign students are put together, and has helped Longman to produce the first multimedia CD-Rom dictionary. In future, monolingual dictionaries, which are devoted to helping people whose first language is English, will also contain real examples of spoken English, rather than invented ones. The days of dry academics poring over file cards in dusty research rooms have long gone: the dinner ladies from Hackney have seen to that.

Oxford University Computing Services, which is handling sales of the Spoken Corpus disks, has not yet put a price on them. But the Corpus should become available in the next few weeks to linguists, lexicographers and compilers of English teaching materials.
