VoiceXML

 

Application

VoiceXML has many applications, such as voice portals, voice-enabled interaction and contact centers, notification services, and innovative telephony services. Among these, the voice portal is one of the most popular. It is a telephone service in which callers dial a phone number to retrieve information such as weather, sports, stocks, and entertainment.

The following diagram shows how the architecture of a voice portal works.

Figure 2.2: Components of a VoiceXML application (from voiceXMLreview.org)

A voice portal is an interface between a caller and the information source, which is a web server. It can host a wide variety of information by integrating an interactive voice service or speech recognition system, web-based data, and VoiceXML. Users can access content from different sources through a single PSTN number.

The voice portal requires the following components to perform its function.

1. Text-to-speech (TTS) and speech recognition, to navigate between content sources and to let the caller provide input.

2. A web server platform that accesses pages by URL or data delivered over HTTP.

A voice server sits between the phone and the HTTP server. It interprets the VoiceXML documents and acts as a middleware processor between the two. The VoiceXML interpreter contains the voice recognition and synthesis engines used to automate the conversation between the site and the caller. Any website can be a VoiceXML content server; no special hardware or software is required.

Here's how it works:

A caller places a call to a designated phone number. A computer on the voice site answers the call and retrieves the initial VoiceXML script from a VoiceXML content server, which can be located anywhere on the web. An interpreter on the voice site parses and executes the script by playing the prompts, capturing the responses, and passing the responses to a speech recognition engine on the voice system.
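The parse-and-execute step can be illustrated with a small sketch. This is not a real VoiceXML interpreter: the script content, the `run_form` helper, and the scripted recognizer standing in for the speech recognition engine are all assumptions made for illustration, and "playing" a prompt is reduced to recording its text.

```python
import xml.etree.ElementTree as ET

# VoiceXML 2.0 namespace, needed to look up elements by tag name.
NS = {"v": "http://www.w3.org/2001/vxml"}

# Hypothetical initial script, as it might be retrieved from a content server.
INITIAL_SCRIPT = """\
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="weather">
    <field name="city">
      <prompt>Which city would you like the weather for?</prompt>
    </field>
    <field name="day">
      <prompt>Today or tomorrow?</prompt>
    </field>
    <block>
      <submit next="http://example.com/weather" namelist="city day"/>
    </block>
  </form>
</vxml>
"""


def run_form(vxml_text, recognize):
    """Visit each <field> in document order: play its prompt, capture a response.

    `recognize` is a stand-in for the speech recognition engine: it takes the
    prompt text and returns the recognized caller response as a string.
    """
    root = ET.fromstring(vxml_text)
    played, responses = [], {}
    for field in root.iterfind(".//v:field", NS):
        prompt = field.findtext("v:prompt", default="", namespaces=NS).strip()
        played.append(prompt)                      # "play" the prompt
        responses[field.get("name")] = recognize(prompt)  # capture the answer
    return played, responses


def scripted_recognizer(utterances):
    """Return a recognizer that replays a fixed list of caller utterances."""
    answers = iter(utterances)
    return lambda prompt: next(answers)


played, responses = run_form(
    INITIAL_SCRIPT, scripted_recognizer(["Boston", "tomorrow"])
)
print(played)
print(responses)
```

A real interpreter would also handle grammars, no-input and no-match events, and retries; the sketch only shows the prompt-then-capture loop described above.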

Once the script has collected all the necessary responses from the caller, the interpreter translates them into a request to the VoiceXML content server. When the server receives the request, it returns a VoiceXML page, either a canned response or a dynamically generated VoiceXML script, containing the information requested by the caller. Responses are passed from the web server to the voice site via HTTP. The process can continue, simulating a natural-language conversation between the caller and the VoiceXML server.
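The request and response exchange can be sketched in the same spirit. The example below assumes the responses are sent as query parameters of a GET request, and it replaces the real web server with an in-process function; the URL, the `FORECASTS` table, and the helper names `build_request` and `content_server` are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from xml.sax.saxutils import escape

# Hypothetical data the content server draws on.
FORECASTS = {("boston", "tomorrow"): "Sunny with a high of 20 degrees."}


def build_request(next_url, namelist, responses):
    """Turn the collected responses into a GET request URL.

    `namelist` is a space-separated list of field names to send.
    """
    query = urlencode({name: responses[name] for name in namelist.split()})
    return f"{next_url}?{query}"


def content_server(request_url):
    """Stand-in for the web server: return a dynamically generated VoiceXML page."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(request_url).query).items()}
    key = (params.get("city", "").lower(), params.get("day", "").lower())
    text = FORECASTS.get(key, "Sorry, no forecast is available.")
    return (
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">'
        f"<form><block><prompt>{escape(text)}</prompt></block></form>"
        "</vxml>"
    )


url = build_request(
    "http://example.com/weather",
    "city day",
    {"city": "Boston", "day": "tomorrow"},
)
page = content_server(url)
print(url)
print(page)
```

The returned page is itself VoiceXML, so the interpreter can execute it like the initial script, which is what lets the exchange continue as a conversation.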