Fast and easy way to cut a long story short
Summarising a text is a key human skill. But can software do it? Roderick Neil Kay looks at progress
Tuesday 29 July 1997
The ability to summarise a text is founded on core intellectual skills, so much so that summarising exercises still appear in numerous IQ and recruitment tests, including those run for the elite of the Civil Service.
Like many other products involving natural language, the new summarising technology doesn't quite live up to its billing. Summarisers have now been developed by a host of companies, including Oracle, InXight and BT, all based on similar statistical techniques. The simplest such method runs as follows. First, find the most frequently occurring words in a text, excluding trivial words. Second, locate the sentences in which groups of these words occur. Third, extract these sentences and compile them in the order in which they appear in the text. Then describe the compilation as a summary, without blinking.
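For the curious, the three steps can be sketched in a few lines of Python. This is only an illustration of the general idea, not the code of any of the products named: the stop-word list, the function names and the scoring rule (a plain count of frequent words per sentence, rather than a test for "groups" of them) are all assumptions made for the sake of the example.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative list of "trivial" words; real tools use far longer lists.
STOP_WORDS = {"a", "an", "and", "at", "in", "is", "it", "of", "the", "to", "was"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarise(text, n_keywords=5, n_sentences=2):
    """Extract the sentences richest in the text's most frequent non-trivial words."""
    sentences = split_sentences(text)

    # Step 1: find the most frequently occurring words, excluding trivial ones.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    keywords = {w for w, _ in Counter(words).most_common(n_keywords)}

    # Step 2: score each sentence by how many keyword occurrences it contains.
    scored = []
    for index, sentence in enumerate(sentences):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        score = sum(1 for t in tokens if t in keywords)
        scored.append((score, index))

    # Step 3: keep the best-scoring sentences and restore their original order.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:n_sentences]
    chosen = sorted(index for score, index in best if score > 0)
    return " ".join(sentences[i] for i in chosen)


sample = "Cats chase mice. The weather was mild. Mice fear cats. Dogs bark loudly at night."
print(summarise(sample, n_keywords=2, n_sentences=2))
# prints: Cats chase mice. Mice fear cats.
```

Nothing here understands the text. The sketch counts words and copies sentences, which is precisely the limitation at issue.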
One of the most surprising things about the statistical techniques at the centre of the new software is that the ideas have been around for a long time, almost before the dawn of AI, in fact. As early as 1958, while working for IBM, Hans Peter Luhn ran some experiments on a corpus of technical articles, using the algorithm just described. He was enthusiastic about the results: "The auto-abstract is perhaps the first example of a machine-generated equivalent of a completely intellectual task in the field of literature evaluation."
But at the time, text retrieval wasn't the hot topic it is today, and his idea was never marketed in the form of software. The view within the AI community, which has always aimed at getting computers to understand language, has been that statistical techniques are OK as far as they go, but if you consider what a human can do, that isn't very far. The emergence at this point of the new summarisers probably owes as much to bumped-up demand as it does to advancement in the field.
While the recent crop of summarisers fall reassuringly short of human performance, their wide availability should generate the kind of interest which leads to improvement. And anyway, enough of human condescension; let us allow the computer to speak - or rather to summarise - for itself. The following extract is a 20 per cent summary of this article, produced by BT's Netsumm.
"Automatic summarising has long been considered one of the most prized goals in artificial intelligence, but working summarisers have now finally appeared based on far more superficial techniques. Summarisers have now been developed by a host of companies not normally associated with text processing: Oracle, InXight and British Telecom, all based on similar statistical heuristics. The view within the AI community, which has always aimed at getting computers to understand language, has been that statistical techniques are OK as far as they go, but if you consider what a human can do, it isn't very far."