Introduction to Voice XML Part 4: Grammars, Scope, and Event Handlers
Welcome back! The previous article in this series looked at how grammars can be used with forms to build simple voice applications. To summarize our explorations so far, you’ve learned the following:
- Forms and menus are the basic building blocks of a Voice XML application.
- Grammars can be associated with the fields that make up a form.
- Input that matches a grammar is used to populate form variables.
- The <nomatch> element can be used to define an application-specific response to input that does not match a grammar, overriding the default response.
- The <noinput> element can be used to define an application-specific response to a lack of input, overriding the default response.
Understanding these basics enables you to build useful and functional voice applications. However, there are times when you need to go beyond these basics to deliver more complex, robust, and responsive voice applications.
For example, you may want to provide general as well as context-specific help messages or allow experienced users to interrupt dialogs and jump to specific options without being prompted. To accomplish these things, we need to look at several techniques that will enhance the user experience.
Barging in with bargein
By default, Voice XML applications allow a user to barge in, that is, to interrupt a prompt with input that advances them past one or more prompts. This behavior is controlled by the bargein property of an application. Barge-in is great for experienced users, who can move rapidly through prompts to get the information they want, bypassing the tedium of long messages and complex key sequences.
However, developers can selectively disable bargein to ensure that a user listens to a complete informative message or advertisement. Bargein can also be disabled to prevent spurious background conversation from triggering an active grammar.
Listing 1 illustrates how to disable bargein (line 4) and force the user to listen to an advertisement. Because the property is set inside a form in Listing 1, it applies only to that form; any other dialogs in the document still allow barge-in. Setting the property at the document level, as a child of the vxml element, turns bargein off for all dialogs in the document.
Listing 1 Disabling bargein for a form
1  <vxml version="2.1">
2    <form id="main">
3      <!--inside the form we disable bargein for this dialog -->
4      <property name="bargein" value="false"/>
5      <!-- play an ad -->
6      <block>
7        <audio src="http://www.zorko.com/ads/paynow.wav"/>
8      </block>
9      <!-- dialog -->
10     <field name="city">
11       <prompt>What city are you calling from?</prompt>
12       <grammar src="city.grammar"/>
13     </field>
14   </form>
15 </vxml>
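The document-level case mentioned above can be sketched in the same way. This is only an illustration, not part of the original listings: the form ids (notice, survey) and the prompt wording are made up for the example. It assumes the property is placed as a direct child of the vxml element, and it shows one form overriding the document-wide setting.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Document scope: bargein is off for every dialog below
       unless a dialog overrides it -->
  <property name="bargein" value="false"/>

  <!-- Hypothetical form: the caller must hear this notice in full -->
  <form id="notice">
    <block>
      <prompt>Please listen to the following important notice.</prompt>
      <goto next="#survey"/>
    </block>
  </form>

  <!-- Hypothetical form: re-enables bargein for this dialog only -->
  <form id="survey">
    <property name="bargein" value="true"/>
    <field name="answer" type="boolean">
      <prompt>Would you like to take a short survey?</prompt>
    </field>
  </form>

</vxml>
```

A property set closer to the prompt takes precedence, so the survey form allows barge-in even though the document as a whole does not.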
When bargein is allowed, a user can bail completely out of one dialog and move rapidly to a completely different dialog. To make this possible, multiple grammars must be active at the same time so that the application can take action whenever spoken or DTMF input matches any one of them.
To write applications that allow these kinds of jumps across dialogs, Voice XML supports scoping rules similar to those found in programming languages. In Voice XML, grammars can be positioned at the field, form, or document level. Additionally, we can react to input across a whole collection of documents by organizing multiple documents as an application.
Let’s first look at a simple example of how to accomplish this within a single document and then look at multiple documents.
Listing 2 illustrates grammar scope with a document that contains multiple dialogs. The example code makes use of the Voice XML link element, which enables us to define a target dialog and a grammar that defines the input that will trigger the dialog. The structure of a link element is the following, where next specifies the dialog to transition to:
<link next="#mydialog">
  <grammar mode="voice">
    <!--grammar goes here -->
  </grammar>
</link>
In Listing 2, the link element appears as a child of the vxml element, giving it document scope. This means that within any menu or form contained in the document, a match against one of the link grammars will pass control to the target dialog.
The example contains two link elements: one that gives us a quick transition to the baseball menu (line 10) and another that takes us to the form with id="mets" (line 18). The grammar inside the first link matches the single word baseball (line 13).
Also note that the link on line 18 contains two grammars: one voice and the other DTMF. Thus, if the user says "mets" or "New York mets" or presses 9 while the corresponding grammar is active, they will be transitioned to the form with id="mets".
Listing 2 Voice XML document with link grammars
1   <?xml version="1.0" encoding="UTF-8"?>
2   <!DOCTYPE vxml SYSTEM "http://www.w3.org/TR/voicexml20/vxml.dtd">
3
4   <vxml version="2.0" xmlns='http://www.w3.org/2001/vxml'
5     xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
6     xsi:schemaLocation='http://www.w3.org/2001/vxml
7     http://www.w3.org/TR/voicexml20/vxml.xsd'>
8
9
10    <link next="#baseball">
11      <grammar mode="voice">
12        <rule id="linkbaseball" scope="public">
13          baseball
14        </rule>
15      </grammar>
16    </link>
17
18    <link next="#mets">
19      <grammar>
20        <rule id="linkmets" scope="public">
21          <one-of>
22            <item>mets</item>
23            <item>new york mets</item>
24          </one-of>
25        </rule>
26      </grammar>
27
28      <grammar mode="dtmf">
29        <rule id="linkmets2" scope="public">9</rule>
30      </grammar>
31
32    </link>
33
34
35    <menu id="mainmenu" dtmf="true">
36      <prompt>
37        Welcome to the info hotline. If you know the category you want,
38        you may say it at any time.
39        <enumerate>
40          For <value expr="_prompt"/>, press <value expr="_dtmf"/>
41        </enumerate>
42      </prompt>
43
44
45      <choice next="#sports">sports</choice>
46      <choice next="#weather">weather</choice>
47    </menu>
48
49    <menu id="sports">
50      <property name="inputmodes" value="dtmf"/>
51      <prompt>
52        For baseball press 1, For football press 2, For soccer
53        press 3.
54      </prompt>
55      <choice dtmf="1" next="#baseball"/>
56      <choice dtmf="2" next="#football"/>
57      <choice dtmf="3" next="#soccer"/>
58    </menu>
59
60
61    <form id="weather">
62      <block> You have reached the weather line. Whether it’s cold,
63        or whether it’s hot, we’re going to have weather,
64        whether or not.
65      </block>
66    </form>
67
68    <menu id="baseball">
69      <property name="inputmodes" value="dtmf"/>
70      <prompt>
71        For yankees press 1, For mets press 2, For giants press 3.
72      </prompt>
73      <choice dtmf="1" next="#yankees"/>
74      <choice dtmf="2" next="#mets"/>
75      <choice dtmf="3" next="#giants"/>
76    </menu>
77
78    <form id="soccer">
79      <block> you have reached the soccer hotline.
80      </block>
81    </form>
82
83    <form id="football">
84      <block> you have reached the football hotline.
85      </block>
86    </form>
87
88    <form id="mets">
89      <block> The mets are contenders for the world series.
90      </block>
91    </form>
92
93    <form id="yankees">
94      <block> George Stein Brenner rules his roost.
95      </block>
96    </form>
97
98    <form id="giants">
99      <block> Barry Bonds has now hit more home runs than Hank Aaron.
100     </block>
101   </form>
102
103 </vxml>
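For contrast with the document-scoped links in Listing 2, a link can also be placed inside a form or a field to narrow where its grammar is active. The following is a minimal sketch of the three levels, not code from the original article: the dialog ids, the spoken phrases, and the team.grammar file are all hypothetical, and the inline grammars assume the platform accepts XML grammars with version and root attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Document scope: active in every dialog of this document -->
  <link next="#mainmenu">
    <grammar mode="voice" version="1.0" root="gomain">
      <rule id="gomain" scope="public">main menu</rule>
    </grammar>
  </link>

  <menu id="mainmenu">
    <prompt>Say tickets or scores.</prompt>
    <choice next="#tickets">tickets</choice>
    <choice next="#scores">scores</choice>
  </menu>

  <form id="tickets">
    <!-- Form scope: active only while the caller is in this form -->
    <link next="#scores">
      <grammar mode="voice" version="1.0" root="goscores">
        <rule id="goscores" scope="public">scores</rule>
      </grammar>
    </link>

    <field name="team">
      <!-- Field scope: active only while this field collects input -->
      <link next="#mainmenu">
        <grammar mode="dtmf" version="1.0" root="zero">
          <rule id="zero" scope="public">0</rule>
        </grammar>
      </link>
      <prompt>Which team do you want tickets for?</prompt>
      <!-- team.grammar is a hypothetical external grammar file -->
      <grammar src="team.grammar"/>
    </field>
  </form>

  <form id="scores">
    <block>Here are today's scores.</block>
  </form>

</vxml>
```

In this sketch, saying "main menu" works anywhere in the document, saying "scores" as a shortcut works only inside the tickets form, and pressing 0 works only while the team field is listening.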
In the Listing 2 example, all the menus and forms are contained in the same document, so our positioning of the link element (with document scope) ensures that the link grammar is operational across all dialogs.
However, if our menus and forms were set up as separate documents, the document-scoped link grammars that provide these shortcuts would no longer be available once we transitioned to another document.
To get around this problem so that grammars, variables, and properties remain visible across a range of separate documents, Voice XML supports the concept of an application.
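As a rough preview of how this looks in markup, an application is formed when several documents name the same application root document through the application attribute of the vxml element. The sketch below assumes two hypothetical files, app-root.vxml and weather.vxml, plus a hypothetical baseball.vxml as the link target; none of these file names come from the original article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- app-root.vxml: hypothetical application root document -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- A link in the root document stays active in every
       document that belongs to the application -->
  <link next="baseball.vxml">
    <grammar mode="voice" version="1.0" root="linkbaseball">
      <rule id="linkbaseball" scope="public">baseball</rule>
    </grammar>
  </link>

</vxml>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- weather.vxml: hypothetical leaf document that joins the
     application by pointing at the root document -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="app-root.vxml">

  <form id="weather">
    <field name="city">
      <prompt>What city would you like the weather for?</prompt>
      <grammar src="city.grammar"/>
    </field>
  </form>

</vxml>
```

With this arrangement, a caller answering the city prompt in weather.vxml could still say "baseball" and be taken to baseball.vxml, because the root document's link grammar remains active.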