Dictating the future of personal computers: Improved voice recognition allows hands-off word processing

CAN'T TYPE? Tired of correcting spelling mistakes? Like to talk as you work? Help is now at hand, for those who don't mind speaking slowly. Last week IBM began shipping the Personal Dictation System, which allows you to enter text into a personal computer by speaking. Users dictate into either a hand-held or headset microphone, and the screen displays their words as they talk. The text can then be transferred into a number of standard word-processing packages.

The Personal Dictation System needs to be hooked up to at least a 486 personal computer. The computer also needs to be fitted with a Dictation Adapter, a speech card that converts analogue signals from the microphone into digital code. The basic system has a vocabulary of 32,000 words, with additional technical vocabularies available for journalists and various medical practitioners. Further vocabularies for lawyers and doctors are under development.

At present the system is on sale only in the US, at a cost of about $1,000, but will be available in Europe later this year in UK English, Spanish, German, French and Italian versions.

Before the dictation system will work, each user has to train the computer to understand his or her voice by reading to it for 90 minutes. The program then builds a mathematical model of the individual's voice pattern to take account of accent and speech characteristics. When the user dictates into the machine, the speech waveform is digitised and matched with a library of word models.
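
As a rough illustration of that matching step, the sketch below picks the stored word model that lies closest to the incoming sound. It is not IBM's algorithm, which the company has not detailed here: the three-number "models", the distance measure and the names recognise and WORD_MODELS are all assumptions made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognise(features, word_models):
    """Return the word whose stored model lies closest to the input features."""
    return min(word_models, key=lambda word: distance(features, word_models[word]))


# Hypothetical toy models: a real system would hold far richer statistical
# models, adjusted to each user's voice during the training session.
WORD_MODELS = {
    "bold": [0.9, 0.1, 0.3],
    "type": [0.2, 0.8, 0.5],
    "paragraph": [0.4, 0.4, 0.9],
}

# Features extracted from one digitised, isolated spoken word (made-up values).
best = recognise([0.85, 0.15, 0.35], WORD_MODELS)
print(best)
```

Because every incoming word is compared against the whole library, the work grows with the size of the vocabulary, which gives some sense of why the approach demands so much computing power.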

This pattern-matching approach was rejected by early researchers into speech input systems in the 1960s in favour of rule-based artificial intelligence systems, because it requires huge computing power. Rapid advances in computer technology mean this power is now available on the desktop.

The system can cope with no more than 70 words a minute. Each word must be distinct, with a pause between each. Talking this way is an acquired skill and seems torturously slow. However, non-professional typists rarely type accurately at this rate, and once the words are accepted there should be no spelling errors.

Other features include a Voice Action Editor, which enables users to create personal instructions. For example, a lawyer could produce a standard disclaimer. Whenever that paragraph has to be inserted in a letter, 'standard disclaimer' are the only words needed. The system can also be taught commands such as 'bold type' or 'new paragraph'. It understands all the commands in the computer's menu.
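
The 'standard disclaimer' example amounts to swapping a spoken phrase for stored text. The sketch below shows one way that could work, assuming the recogniser has already produced a list of words. The function expand, the MACROS table and the disclaimer wording are hypothetical, not taken from the Voice Action Editor itself.

```python
def expand(spoken_words, macros):
    """Replace any spoken phrase that names a macro with its stored text."""
    output = []
    i = 0
    while i < len(spoken_words):
        # Try the longest phrase first so multi-word names win over single words.
        for length in range(len(spoken_words) - i, 0, -1):
            phrase = " ".join(spoken_words[i:i + length])
            if phrase in macros:
                output.append(macros[phrase])
                i += length
                break
        else:
            # No macro starts at this word, so keep it as dictated.
            output.append(spoken_words[i])
            i += 1
    return " ".join(output)


# Hypothetical macro table: the spoken name maps to the paragraph to insert.
MACROS = {"standard disclaimer": "This letter does not constitute legal advice."}

text = expand(["yours", "faithfully", "standard", "disclaimer"], MACROS)
print(text)
```

Commands such as 'bold type' or 'new paragraph' could be handled the same way, with the table pointing to an editing action rather than a block of text.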

Speech input has been a long-term goal of computer scientists, but so far systems have been too slow and error-prone to gain widespread commercial acceptance, and have mainly been used by disabled people who could not type at all. Many other computer companies remain doubtful that speech recognition can be made accurate enough for widespread use. They also argue that the latest computer interfaces make machines so easy to use that speech input is irrelevant.

But IBM says the Personal Dictation System will be particularly useful for those who need to use their hands while working. They will be able to dictate instructions or reports at the same time. For example, a radiologist could report on a series of X-rays by speaking into the headset microphone while examining the film. The system could also be used by people who have suffered repetitive strain injuries using computer keyboards.

IBM's confidence in the Personal Dictation System is partly based on a similar system that has been available on its workstation computers for the past year. The company says that more than 70 software companies are committed to developing applications based on its speech technology.

A series of speech recognition products will be launched in the next few months. Elton Sherwin, the market development manager for speech recognition at IBM, says: 'What we can do today is already radically different from what we could do even two years ago.

'We only became comfortable with accuracy for double numbers like $14.40, $15.50 in July 1993. But perfecting the recognition of numbers will allow sophisticated financial management by phone or cable.' As a result, he believes, speech recognition systems will be available on interactive cable systems in the US within 18 months - for paying bills, playing games, ordering movies and so on.

Although these applications will require speaker independence - operating without first being trained to understand each voice, and with continuous speech capabilities - the vocabulary required will be very limited. IBM already has a continuous speech toolkit for developing applications which can be used with a 1,000-word active vocabulary chosen from a base of 20,000 words, and the next stage will be to adapt this for the commands needed to operate a cable TV system.

IBM is also testing the continuous speech system in collaboration with police forces, so that, for example, police officers can ask the computer in their car to search for a registration number while chasing a suspect, or request background information about someone they have detained. Other software companies are using the system to develop applications for casinos, court reporting, health care and financial services.
