Other Problems
Sometimes callers say too much. They forget that they are talking to a machine and speak as if to a human who can understand their every word. Although dictation systems can transcribe long sentences into text, the system may not "understand" what the text means. Callers may need to be encouraged to answer the question and not to volunteer additional information. For example:
System: "Color?" (pause) "Say the color you want." (pause) "Green, red, or blue?"
Caller: "I think I like green better than red or blue."
This phrasing confuses the speech-recognition system, which hears the words "green," "red," and "blue" and cannot tell which color the caller wants. Encourage the caller to answer the question simply and directly:
System: "I'm sorry, I didn't understand you. Just say the color you want: green, red, or blue."
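The exchange above can be illustrated with a minimal sketch. It assumes the recognizer hands the application a plain text transcript, and it treats any utterance that names zero or several of the allowed colors as not understood. The function names, the word-matching approach, and the confirmation wording are hypothetical and chosen only for illustration; a real telephony platform would normally handle this through its grammar and no-match events rather than string matching.

```python
import re

# Allowed answers and the re-prompt wording, taken from the dialog above.
OPTIONS = ("green", "red", "blue")
REPROMPT = ("I'm sorry, I didn't understand you. "
            "Just say the color you want: green, red, or blue.")


def interpret(utterance, options=OPTIONS):
    """Return the one option named in the utterance, or None.

    None means the caller named no option or named more than one,
    so the application cannot tell which one was intended.
    """
    words = re.findall(r"[a-z]+", utterance.lower())
    mentioned = [opt for opt in options if opt in words]
    return mentioned[0] if len(mentioned) == 1 else None


def respond(utterance):
    """Accept an unambiguous answer, otherwise re-prompt the caller."""
    choice = interpret(utterance)
    if choice is None:
        return REPROMPT
    # Hypothetical confirmation wording, not taken from the article.
    return f"You chose {choice}."


if __name__ == "__main__":
    print(respond("I think I like green better than red or blue."))
    print(respond("Green"))
```

In this sketch the long answer triggers the re-prompt because all three color words appear in it, while the one-word answer is accepted.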
The key to successful speech data entry programs is careful dialog design and iterative usability testing. Best-practice dialog design techniques can both speed up data entry and make entering data by speaking into a phone more enjoyable. But there are no guarantees. Developers must test with a vengeance, iteratively modifying the dialog and verifying that each modification does indeed improve the process.
NOTE
For additional suggestions on how to improve the quality and experience of using telephony applications, see the author's book, VoiceXML: An Introduction to Building Voice Applications (Prentice Hall, 2002, ISBN: 0130092622).