- The Microsoft Speech Application SDK (SASDK)
- Business Benefits of Speech
- How the Speech Engine Works
- Installing the SASDK
- Creating a Speech Application
- Debugging and Tuning a Speech Application
- Setting Up a Telephony Server
- Summary
Debugging and Tuning a Speech Application
The SASDK provides several tools for debugging and fine-tuning your application. These tools let developers simulate the user’s environment, which is important not only while developing the application but also during testing and deployment. The SASDK also includes logging and reporting tools for evaluating the impact and effectiveness of the application.
Speech Debugging Console
The Speech Debugging Console is an essential tool for building voice-only applications. It lets you simulate the user’s experience while displaying important diagnostic information. Figure 2.12 shows a screenshot of the tool.
Figure 2.12 Screenshot of the Speech Debugging Console after it has processed a QA control. The SML tab contains the SML created by the speech recognition engine. Because text was used instead of real speech in this situation, a confidence score of 1.0, or 100 percent, was assigned.
The Options menu lets you toggle the following on and off: Break on Listen, Break on DTMF, Play prompts, Edit SML, and Show/Hide other windows. When building your application, you will typically leave all of these options turned on. Break on Listen and Break on DTMF are especially important because they give you time to inspect results within the tabs without being interrupted; otherwise, the pauses you take while inspecting would be interpreted as silence and might trigger unintended events, such as silence reprompts.
The Speech Debugging Console offers the developer useful debugging information. The Output tab is shown by default and contains a stream of messages returned from the Web server. These messages are critical when you encounter an error or an unexpected result, because you can trace through the output to understand the application’s dialog flow.
Whenever the RunSpeech engine tries to activate a control, an entry is added to the Activations tab. You can expand each node to determine the state of the semantic items associated with that activation step.
The SML tab shows the SML for the last semantic item processed, which is useful when you are trying to determine why the speech engine did not recognize the grammar correctly. It also shows the confidence level at which the item was recognized; for the example in Figure 2.12, the confidence score was 1.000, or 100 percent. In addition, the SML tab lets the developer edit the SML output, which makes it possible to simulate different recognition outcomes.
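As a rough illustration, the fragment below shows the general shape of an SML result you might see in the SML tab. The utterance, the child element names (which come from the grammar’s semantic interpretation script), and the confidence values are all hypothetical; your grammar will produce different names, and typed text, as in Figure 2.12, yields a confidence of 1.000.

<!-- Hypothetical SML result; the child element names are defined by the grammar. -->
<SML text="fly from Seattle to Boston" utteranceConfidence="1.000">
  <DepartureCity confidence="1.000">Seattle</DepartureCity>
  <ArrivalCity confidence="1.000">Boston</ArrivalCity>
</SML>

Editing this output in the SML tab, for example lowering a confidence attribute, lets you observe how the dialog behaves when recognition is less certain.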
Telephony Application Simulator (TASim)
Available with the SASDK, TASim lets you simulate the client experience for telephony applications. Where the Speech Debugging Console is used to design and debug your application, TASim is used to test and deploy it. You access TASim from the Debugging Tools submenu beneath the Microsoft Speech Application SDK 1.0 programs menu item; from the File menu, choose Open URL and specify the HTTP path to your application. Figure 2.13 is a screenshot of the Telephony Application Simulator.
Figure 2.13 Screenshot of the Telephony Application Simulator used to simulate the client experience.
When debugging telephony applications, TASim is the only way to enter DTMF input (numerical touch-tone digits); the DTMF tab is not available when using the Internet Explorer Add-in client. To execute within TASim, a Web page must contain an AnswerCall control to initiate the call, as in the sketch that follows. With TASim, it is not necessary to install Speech Server in order to build, debug, and test telephony applications.
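As a rough sketch (not taken from a working project), the markup below shows the general shape of a voice-only page that TASim can run: an AnswerCall control to pick up the simulated call, a semantic item to hold the recognized value, and a QA control that prompts the caller. The tag prefix, property names, and grammar path are illustrative; rely on the markup Visual Studio generates when you drag the SASDK controls onto the page.

<!-- Illustrative fragment of a voice-only .aspx page. Assumes the SASDK
     project template has already registered the speech control prefix. -->
<form id="Form1" runat="server">

  <!-- TASim requires an AnswerCall control to connect the simulated call. -->
  <speech:AnswerCall ID="AnswerCall1" runat="server" />

  <!-- Semantic items hold recognized values; their state is what you see
       in the Speech Debugging Console's Activations tab. -->
  <speech:SemanticMap ID="SemanticMap1" runat="server">
    <speech:SemanticItem ID="siAccountNumber" runat="server" />
  </speech:SemanticMap>

  <!-- RunSpeech activates this QA after the call is answered. It plays a
       prompt, listens against a grammar (hypothetical path), and stores
       the matching SML node in the semantic item above. -->
  <speech:QA ID="AccountQA" runat="server">
    <Prompt InlinePrompt="Please say or key in your account number." />
    <Reco>
      <Grammars>
        <speech:Grammar Src="Grammars/AccountNumber.grxml" runat="server" />
      </Grammars>
    </Reco>
    <Answers>
      <speech:Answer SemanticItem="siAccountNumber"
                     XpathTrigger="/SML/AccountNumber" runat="server" />
    </Answers>
  </speech:QA>

</form>

With a page shaped like this loaded in TASim, answering the simulated call lets RunSpeech activate AccountQA, and the value selected by the XpathTrigger is written to siAccountNumber.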
Analysis and Reporting
MSS provides two primary means for analysis and reporting:
Call Viewer—Allows the developer to analyze the results of one or more calls. Developers generally use this tool to identify problem areas, such as a problematic grammar or a poorly chosen confidence threshold.
Speech Application Reports—Built on Microsoft SQL Server Reporting Services, these reports are used to analyze data across multiple calls. Used by both developers and IT decision-makers, they include a few predesigned reports that anticipate common analysis needs.
Each call is logged using Windows event tracing and stored in a log file with an .etl (Event Trace Log) extension. This file is then imported into a SQL Server database using a prebuilt SQL Server Data Transformation Services (DTS) package. Both the Call Viewer application and the Speech Application Reports access this database to analyze the imported data.
Several utilities provided with the SASDK allow the developer to extract data from .etl files. In addition, the developer can install the speech application log analysis tools, which by default are located in the C:\SpeechSDK\Setup\Redistributable Installers\Microsoft Log Analysis Tools for Speech Application directory. This directory should already exist if you followed the instructions in the section titled "Installing the SASDK." After running Setup.exe, you can access the Call Viewer application by browsing to Microsoft Speech Application SDK 1.0 and then Log Analysis Tools.