| | #1 (permalink) |
| General

Microsoft speech recognition software is much improved compared with previous versions. It still fails, however, to meet the minimum requirements for professional dictation, including adequate speech recognition accuracy and the availability of specialized vocabularies. The following is a specific critique of Vista Speech. It is understood that this is still beta software; however, this beta (build 5384) is supposed to contain the basic feature set of the production speech software.

Audio Hardware

Weaknesses
1. NT/Windows Server 2003 device drivers are typically incompatible with Vista. Very few Vista drivers for sound converters are available at this time.

Audio Input Window

Strengths
1. Ability to select sound adapter.
2. Ability to select input type, that is: microphone, line in, or digital.
3. Option for setting audio level manually.

Weaknesses
1. It is unclear from the VU (volume unit) display where the optimum audio setting should be; for example, at the middle or upper limit of the green display.
2. What would appear to be the optimum audio setting on this display doesn't agree with that of the microphone level display under Control Panel, Advanced Speech Options, Speech Recognition.
3. The low-end sensitivity of the VU display is inadequate to show ambient electrical and acoustic noise levels.
4. There is apparently no automatic volume control.
5. There is no frequency spectrum display to provide relative indications of signal/noise amplitudes.

Recognition Engine (Microsoft Speech Recognizer 8.0)

Strengths
1. Recognition accuracy is significantly improved compared with previous Microsoft speech recognition engines.

Weaknesses
1. Recognition accuracy is still far behind that of the current leading speech recognition software, NaturallySpeaking.
2. Speech recognition processing is slow, even on a high-performance computer system.
3. Recognition accuracy is unusually sensitive to audio volume settings.
4. New words that are trained or existing words that are re-trained are frequently not then recognized correctly. This is one test of recognition accuracy. The Microsoft Speech Recognizer does poorly in this test compared with NaturallySpeaking.

Speech Recognition Training Window

Weaknesses
1. Text to be dictated is displayed in short sentences or sentence fragments rather than paragraphs. This disrupts the normal pacing of dictation.
2. There is no indication of progression of the dictation; for example, highlighting or graying out of text as successive words are recognized.
3. There is no indication of the success of the dictation. You can dictate phrases that are completely different from the displayed text and the program proceeds to the next display without any indication of there having been a recognition problem.
4. There is no ability to back up, repeat or skip mis-recognized text.
5. There is no VU display in the training window.
6. There is no user-selectable list of choices for additional training after the introductory training.
7. There are no user-selectable specialized training texts; for example, business letters or medical reports.

Dictation

Strengths
1. Full-capability dictation into many application programs.
2. Ability to pop up the correction window by speaking "correct" followed by the phrase to be corrected, or by highlighting it and commanding "correct that", meaning correct the highlighted text.
3. "Scratch that" command to delete the most recently dictated phrase.
4. Various commands for navigating through text.

Weaknesses
1. There are limitations of dictation into some "Windows standard" textbox controls.
2. There is no control key selection of command or dictation modes.
3. No microphone on/off by control key press.
4. No control key press for selection of post-dictation spelling and grammar checks.
5. No option for vocabulary switching.
6. No user-selectable, context-sensitive control of abbreviations and number formatting.

Correction Window

Strengths
1. Correction window can be displayed by highlighting or voice selecting the text to be corrected.
2. Errant phrases are numbered if more than one instance appears in the text, facilitating selection of a specific phrase to be corrected.
3. The lists of alternate phrases contain generally appropriate possibilities.
4. Additional alternates can be displayed by re-dictating the errant phrase.
5. Voice spelling of a new term is well designed and usually works properly.

Weaknesses
1. No user option to re-train a mis-recognized word that is in the main vocabulary.
2. No way to type in a new word; it must be voice spelled.
3. No way to re-check the accuracy of a mis-recognized word that has just been re-trained.
4. No way to train a phrase (as opposed to a single word).
5. No way to re-train both the corrected word or phrase and the original mis-recognized phrase.
6. No way to specify and train both actual spelling and "spoken as" representations of words or phrases.

Vocabulary

Weaknesses
1. Entries are limited to single words.
2. No way to specify and train both actual spelling and "spoken as" representations of words or phrases (as above).
3. No capability to display, search, sort, edit, add, delete and train any word or phrase in the main vocabularies. There is limited editing capability for the user vocabulary only.
4. No current availability of specialized vocabularies; for example, legal or medical.
5. No user option for adding specialized vocabularies.

Utilities

Weaknesses
1. No option for backup and restore of user (training, options, etc.) and vocabulary files.
2. No option to add, delete, edit and execute user-developed macros.
3. The option for processing "typical" user documents is limited to files stored in My Documents. The user has no control over the choice of directories or the specific files to be screened.
4. Typical document screening lacks the important functions of identifying and listing by frequency of occurrence the new words that are located in the documents. There are no user options for adding and training the new words.
5. Testing of the document screening from a custom SDK function did not result in any improvement in recognition accuracy.

SDK/SAPI 5.3

Strengths
1. Extensive set of APIs.

Weaknesses
1. No backward compatibility with SAPI 4.
2. SAPI 5.3 is only available for the Vista platform. Microsoft has decided, at least as of this date, not to supply an NT/Windows Server 2003 SAPI 5.3 based SDK.
3. The automation versions of many of the SAPI 5.3 functions are still not available. Some may never be implemented.
4. There are still significant bugs in multiple Microsoft-provided SAPI 5.3 based automation functions and utilities.
| Guest
|
|
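The document-screening gap described under Utilities (item 4) can be illustrated with a short sketch. This is not how Vista Speech or SAPI 5.3 screens documents; it is only a hypothetical Python outline of the missing function, assuming the documents are already available as plain-text strings and the recognizer's vocabulary as a simple set of words. The function name `new_words_by_frequency` and both parameters are made up for illustration.

```python
import re
from collections import Counter


def new_words_by_frequency(documents, known_vocabulary):
    """List words absent from known_vocabulary, most frequent first.

    documents: iterable of plain-text strings (assumed input format).
    known_vocabulary: iterable of words the recognizer already knows.
    Returns a list of (word, count) tuples.
    """
    known = {word.lower() for word in known_vocabulary}
    counts = Counter()
    for text in documents:
        # Treat runs of letters (with inner apostrophes or hyphens) as words.
        for word in re.findall(r"[A-Za-z][A-Za-z'-]*", text):
            lowered = word.lower()
            if lowered not in known:
                counts[lowered] += 1
    # Sort by descending count, then alphabetically for a stable listing.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

A list like this is what a user would need to see in order to pick which new words to add and train, which is the option the critique says is missing.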
| | #2 (permalink) | |
| You know what to do then, turn those results into reports:
For Feedback: http://go.microsoft.com/fwlink/?linkid=55160
Feedback reporting tool: http://go.microsoft.com/fwlink/?linkid=43655
--
Andre
Windows Connected | http://www.windowsconnected.com
Extended64 | http://www.extended64.com
Blog | http://www.extended64.com/blogs/andre
http://spaces.msn.com/members/adacosta
"Robert Robinson" <robbiex@bellsouth.net> wrote in message news:erzHPaZjGHA.4776@TK2MSFTNGP05.phx.gbl...
Quote:
| Guest
| |
|