Two Approaches: Which Is Best?
It depends...
Programmers endlessly debate the advantages and disadvantages of declarative and procedural languages. But the key fact is that programmers now have a choice. They may choose between the declarative and procedural approaches based on the needs of the application and on their own skill set.
For traditional telephony applications, VoiceXML enables developers to create system-directed dialogs, in which the application prompts the caller for each required parameter value in turn. This style of interface is well-suited for telephone callers, who have no visual screen from which to select options. With some clever programming techniques, system-directed dialogs can also be used as mixed-initiative dialogs, in which the user can deviate from the dialog structure imposed by the system. A variety of VoiceXML tools are available, including GUI tools that generate VoiceXML code, VoiceXML editors, debuggers, and test result generators. There is also a well-established infrastructure for hosting and deploying VoiceXML applications.
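The difference between the two dialog styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: run_dialog, scripted, and the field names are hypothetical, the speech recognizer is replaced by a scripted stub, and the loop is a simplified stand-in for what a VoiceXML interpreter does, not the algorithm defined by the VoiceXML specification.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical recognizer type: given the field being prompted for,
# return whatever field values the caller actually said.
Recognizer = Callable[[str], Dict[str, str]]


def run_dialog(
    fields: List[str], recognize: Recognizer, max_turns: int = 10
) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Prompt for unfilled fields in order; return the slots and the prompts played."""
    slots: Dict[str, Optional[str]] = {name: None for name in fields}
    prompts: List[str] = []
    for _ in range(max_turns):  # guard against a caller who never answers
        unfilled = [name for name in fields if slots[name] is None]
        if not unfilled:
            break
        # System-directed: the application chooses the next field to ask for.
        current = unfilled[0]
        prompts.append(f"Please say the {current}.")
        heard = recognize(current)
        # Mixed-initiative: accept any known field the caller supplied,
        # not only the one that was asked for.
        for name, value in heard.items():
            if name in slots and slots[name] is None:
                slots[name] = value
    return slots, prompts


def scripted(replies: List[Dict[str, str]]) -> Recognizer:
    """Stub recognizer that plays back canned replies, one per turn."""
    it = iter(replies)
    return lambda field: next(it, {})


# The caller answers the first prompt with two values at once.
slots, prompts = run_dialog(
    ["origin", "destination", "date"],
    scripted([{"origin": "Boston", "destination": "Denver"}, {"date": "Friday"}]),
)
print(slots)
print(prompts)
```

Because the caller volunteers the destination while answering the first prompt, the sketch never asks for it. A strictly system-directed version would ignore the extra value and ask for each field separately.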
For complex telephony applications, developers may find it necessary to override the Form Interpretation Algorithm. In these situations, it may be better for developers to bypass the standard Form Interpretation Algorithm and write their own control and synchronization code. Until the VoiceXML language provides a programming interface, developers should use SALT tags embedded into a host language. A version of VoiceXML after 2.0 may have a programming interface, perhaps one that resembles the SALT tags.
SALT really shines when developers need to voice-enable GUI applications. Developers specify the graphical user interfaces using a host language such as HTML or XHTML, and add SALT tags to enable the application to speak and listen to the user.
CAUTION
Simply adding voice to an existing Web application may not improve the user interface. However, multimodal user interfaces may add value to new applications, especially applications involving handhelds, cell phones, and other portable devices. (See article 4 in this series, "Should You Build a Multimodal Interface for Your Web Site?", which presents suggestions for when to add new input modes to an application.)
SALT will be especially useful when developing point-and-speak applications. These applications are ideal for small devices that do not have a QWERTY keyboard for alphabetic data entry. The user points to a labeled slot on the screen and speaks a word or phrase, which is converted into text and placed into that slot. These applications can be downloaded into a PDA or executed on a server wirelessly connected to a cell phone with a screen and stylus.
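The point-and-speak interaction amounts to routing recognized text into whichever slot the user last pointed at. The following Python sketch models only that routing. The PointAndSpeakForm class and its methods are hypothetical names chosen for illustration, and speech recognition is assumed to have already produced the text. In a real SALT page this behavior would be expressed with SALT tags in the host markup rather than in Python.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PointAndSpeakForm:
    """Hypothetical model of a screen with labeled slots."""

    slots: Dict[str, str]
    focused: Optional[str] = None

    def point(self, label: str) -> None:
        """The user taps a labeled slot with the stylus."""
        if label not in self.slots:
            raise KeyError(f"no slot labeled {label!r}")
        self.focused = label

    def speak(self, recognized_text: str) -> None:
        """Place already-recognized speech into the slot that was pointed at."""
        if self.focused is None:
            raise RuntimeError("point to a slot before speaking")
        self.slots[self.focused] = recognized_text


form = PointAndSpeakForm({"City": "", "Date": ""})
form.point("City")
form.speak("Boston")
form.point("Date")
form.speak("next Friday")
print(form.slots)
```

Pointing first tells the application which slot the utterance belongs to, so it can listen for only the words that make sense there. The sketch does not model that grammar selection.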
Will SALT replace VoiceXML? No. VoiceXML is a mature technology being standardized by the W3C. It is widely deployed and used by thousands of programmers. Almost every VoiceXML vendor has a collection of dialog modules or speech objects that enable programmers to reuse existing VoiceXML code for common tasks such as soliciting a date, credit card number, or dollar amount. VoiceXML is here to stay.
However, SALT will become the new hot technology for developing multimodal user interfaces, especially for devices that are so physically small that there is no room for a traditional keyboard. Developers will also insert SALT tags into a host programming language to develop complex telephony applications that go beyond today's typical "ask a question and listen for the user's response" cycle of dialog.