Have you talked to your PC lately?
With the latest speech recognition packages, a computer can develop an acute ear for an individual's voice and deliver impressive results - and all for less than £100.
Tuesday 13 May 1997
Indeed, there are now three surprisingly good speech-recognition systems available for under £100, from IBM, Dragon Systems, and Kurzweil. Previously such packages cost £500 to £1,000.
Generally, dictation is carried out slowly, with a pause between each word. The initial temptation is to speak very ... slowly ... indeed, but once you trust the computer to understand your words most of the time, you find you can speak reasonably fast, with only very short pauses between the words. It is certainly a lot faster than most of us can type.
According to an IBM survey, only 15 per cent of PC users can touch-type. Even for a reasonable typist anything more than 30 words per minute is good, which is less than you might achieve from any of the speech-recognition systems straight from the box. Once user and system are used to each other, they should easily do 60 to 80wpm, climbing to 120wpm or more for regular users with a set vocabulary. The benefits for some professions who have to dictate reams of notes, such as lawyers and pathologists, are considerable, especially as there are upgraded versions of the various systems available with their specialist vocabularies.
But it takes time to get the best out of these packages. There are systems that can recognise virtually any voice without training, such as those used for automated telephone queries, but they use very limited vocabularies and need powerful computers.
Training can be frustrating at first, as the system tries to learn your speech patterns and often picks the wrong word. Each time it does, you must correct it so that it can learn. It should take only one correction for it to recognise a word from then on - provided you say the word the same way each time.
Some of the early results can be laughable. Occasionally it becomes annoying. But any stress (or laughter) in your voice makes it even more likely to misbehave. You have to remain calm and speak slowly and evenly. It eventually becomes relaxing, especially as you do not need to strain over a hot keyboard any more. Having had repetitive strain injury, I am used to aching wrists and shoulders, which is why having a listening PC is such a boon.
Getting the best from each package requires a fast Pentium PC, preferably with lots of memory, and some time spent setting up the microphone correctly (as that can make a big difference to recognition accuracy). Some soundcards (such as the Ensoniq on my Gateway 2000 PC) will not work with the headset microphones supplied, so check with the dealer in advance if you use anything other than a Creative Labs Soundblaster 16 soundcard.
Dragon Dictate Solo
Besides dictating, Solo also allows you to control applications, so you can command your PC by voice instead of using the mouse and keyboard, which makes it the best bet for hands-free use. This version will only work with one of these Microsoft applications (your choice): Word, Excel, PowerPoint, Access or Internet Explorer. But it also works with Netscape Navigator (version 3.0 plus), Adobe Acrobat Reader, Microsoft Exchange and Windows 95 accessories such as WordPad, Calculator or Solitaire. So if you need it most for Excel, you can at least dictate into WordPad and surf the Net with Navigator. Solo opens as a floating toolbar above your application, with pop-down menus as required.
It has a 120,000-word vocabulary (10,000-word active dictionary), and with very little use was delivering about 92 per cent accuracy (with most of the errors of the 2/to and 4/for variety). To reduce errors further, you can alter the settings so it changes words depending on the context, replacing "to big" with a more likely "too big", although that affects speed.
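For the curious, the "to big" example can be pictured as a simple rule that looks at the following word. The Python below is only a rough sketch under that assumption: the word list and the function name are invented for illustration, and nothing here is claimed to be how Solo itself decides.

```python
# Hypothetical list of words that make "too" more likely than "to".
# A real system would need a far richer model of context.
ADJECTIVES = {"big", "small", "late", "loud", "quiet"}


def fix_homophones(words):
    """Return a copy of `words` with "to" changed to "too" before a listed adjective."""
    fixed = list(words)
    for i in range(len(fixed) - 1):
        if fixed[i].lower() == "to" and fixed[i + 1].lower() in ADJECTIVES:
            fixed[i] = "too"
    return fixed


print(" ".join(fix_homophones("the box is to big".split())))
```

Every dictated word has to pass through a check of this kind, which gives some idea of why switching the option on costs speed.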
As the word history (or "Ooops Buffer") can remember up to 32 words, you do not have to keep stopping if you are in full flow. However, it can get frustrating trying to remember what commands you can say. There is a "what can I say" function that will show you, although it could be simpler, and a quick reference card with the most-used functions.
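A word history of this sort behaves like a fixed-size buffer: once it is full, each new word pushes out the oldest. The sketch below illustrates the idea in Python; the class and method names are hypothetical, and only the 32-word limit comes from the product.

```python
from collections import deque


class WordHistory:
    """Keep only the most recent words so that they can still be corrected."""

    def __init__(self, capacity=32):
        # A deque with maxlen silently drops the oldest entry when full.
        self._words = deque(maxlen=capacity)

    def add(self, word):
        self._words.append(word)

    def correct(self, steps_back, replacement):
        """Replace the word spoken `steps_back` words ago (1 is the most recent)."""
        if not 1 <= steps_back <= len(self._words):
            raise IndexError("word is no longer in the history")
        self._words[-steps_back] = replacement

    def words(self):
        return list(self._words)
```

With a capacity of 32, a mistake made 20 words ago can still be fixed without interrupting your flow, while one made 40 words ago has already dropped out of reach.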
As with its rivals, more documentation (including an extended tutorial) would be helpful, although the help files are generally good. It also has the best on-screen tutorial, with lots of demonstrations and a cartoon dragon roaring flames. Besides improving your familiarity with the program, the tutorial also lets it learn more about the way you speak, which makes your first words look less like baby-talk.
Conclusion: Best choice for disabled users, and good all-round choice for most users. The only choice at this price if you want to do more than dictate. Good recognition, average ease of use, good tutorial, average documentation. More features than you will probably ever use.
Dragon Dictate Solo, £79.99, on CD-Rom. Endeavour Technologies (01932 827324, www.endeavour.co.uk). System requirements: Windows 3.1 or 95, a 486 DX4/66 (Pentium recommended), at least 16Mb RAM, Soundblaster 16 or compatible soundcard.
IBM VoiceType Simply Speaking
This system offered the best instant speech recognition. Straight from the box it recognised 85 per cent of words, or more, provided they were already in its 30,000-word vocabulary (you can add a further 27,000 words of your own). Once it was more familiar with my voice, it was getting up to 96 per cent right. Like its rivals, it should improve further with use.
Its big advantage is that you can speak without having to stop to correct words as you go along; useful if you are reading, as you do not have to keep glancing from paper to screen. Instead, it records your speech, so that when you click on a word you hear what you said. This allows you to correct everything in one batch (via the keyboard), although it won't save the audio for later playback. But if it is crucial that what you say is completely accurate, this method does make you more likely to miss an error in a long document, unless you keep glancing at the monitor.
If you do look at the screen, you will notice that it often displays a couple of words before alighting on its choice. That is because it uses a certain amount of intelligence in relating each word to those around it, to try to give a good fit. Another reason its test results were so good may be that with the smallest vocabulary, there is less likelihood of it getting words mixed up, but as most people use only a few thousand words regularly, that may not matter so much.
The very basic word processor it comes with may be adequate for some users and it can save in Word 6.0 or RTF format, but if you want to integrate it with other Windows applications you have to upgrade to the £650 VoiceType Dictation version, which also gives command and control, and allows you to work with macros and to do your corrections (or have your secretary do them for you) later. It even lets you dictate on to a (digital) tape or mini-disc and plug that in to the computer, bypassing the Dictaphone typist, although you still have to speak slowly.
Conclusion: Best choice for dictators, especially if you need to input lots of text. Worst choice for disabled users or for anyone who doesn't want to use the keyboard. Very good recognition, good ease of use, average (limited) documentation, limited features, limited vocabulary.
IBM VoiceType Simply Speaking, £89, on CD-Rom. IBM (01705 492 249, www.software.ibm.com/workgroup/voicetyp). System requirements: Windows 95, a 100MHz Pentium, at least 16Mb RAM, Soundblaster 16 (or 100 per cent compatible) or IBM Mwave soundcard.
Kurzweil VoicePad Pro
The simplest of the three to get running. Its "active words" window displays what you can say in any situation, which is handy when you cannot remember a command or how it likes its punctuation marks (its use of "period" instead of "full-stop" displays its US origins, although you can add your own commands).
Word correction is relatively easy, although having to move around the screen to correct earlier words is not as simple as Solo's "Ooops" command. However, when it misunderstood a "correct-that" (commands are said as one word) and then heard me say "take-2" instead of "take-4" (the commands to choose a displayed correction), I ended up deleting a word ("delete-that") every time I wanted to correct it. Unfortunately, the training module did not offer "correct-that" as a choice (just as a subject heading). Delving further brought up the Recognition Wizard, which deals simply with such problems as "confuses a pair of words", although it should tell you that you can type in the words instead of scrolling through the 200,000 words and 7,700 commands in its vocabulary. Although only 30,000 words can be active at any time, having so many to call on means that once it is used to your voice, its recognition can be very good.
However, straight from the box it was about 73 per cent accurate. That rose quickly to about 85 per cent after initial training, and was above 90 per cent in a few days. An add-on, Talk Commands, which includes a wide range of useful (UK-oriented) macros and commands, makes it easier to use. It is included in a £92 package with a high-quality microphone, which helps to improve recognition, especially in noisy environments.
Conclusion: Best choice for occasional users. Good for disabled users. Nice, straightforward design. Good recognition, good ease of use, good documentation, good features.
Kurzweil VoicePad Pro, £93, on CD-Rom. Talking Technologies (0171 602 4107, www.talk-systems.com). System requirements: Windows 3.11 or 95, a 486 DX4/75 (Pentium for Windows 95), at least 16Mb RAM, Soundblaster 16 or Windows-compatible 16-bit soundcard.