Certified Tech Tip: Alpha-Numeric voice recognition grammars - part two

May 20th, 2008 by matt

In our last entry on the tech-tips blog, we detailed the challenges inherent in capturing alphabetical or alpha-numeric entries from our callers, and outlined several ways to minimize the chance of mis-recognition when implementing input fields based on these two categories of voice recognition. The long and short of that posting was that IVR developers should avoid attempting this wherever possible, and should instead try these alternatives:

* Pre-compiled Statistical Language Model grammars
* Leveraging TargusInfo services for advanced recognition accuracy

However, IVR project requirements dictate what we can and can’t do as developers, so in some cases we have to whip up a user grammar that takes alpha or alpha-numeric input. As mentioned in our last blog entry, there are a few things we can do to stack the deck and squeeze more accuracy out of these grammars so that we don’t end up with frustrated callers, but the plain truth is that, using today’s recognition technology, we will never, ever be able to write a grammar that accepts alphabetical characters with 100% accuracy. What we will do today is twofold:

(1) Craft an SRGS+SISR subgrammar for alphabetical and numeric characters

(2) Plug this grammar into a mixed-initiative form dialog that will minimize (but not fully eliminate!) the possibility of mis-recognitions.

Those developers who have the need for such a grammar and dialog within their production-grade applications are advised to take this basic framework as a starting point, and then expand on it by:

(a) Testing carefully with a broad range of users, and fully fleshing out alternate utterance values for alphabetic characters

(b) Applying item weighting to specific characters based on the probability of a given character versus another like-sounding character - this will depend greatly on the specific usage of the grammar

(c) Tracking results by using W3C-compliant utterance recording, and logging all shadow variables, so that these results can be used to further tune and tweak the grammar for maximum accuracy

(d) Considering n-best post-processing as an additional confirmation step to ensure that the results we receive are indeed accurate

For today’s entry, let’s assume that we need to capture the first three characters of a Canadian postal code (the Canadian counterpart of a zip code). Our predefined format for utterance values is “Alpha Digit Alpha”, and luckily, not all alpha characters are applicable: Canadian postal codes never use D, F, I, O, Q or U, and W and Z never appear in the first position. Instead of trying to recognize 26 letters accurately, we only need to recognize 18 in the first slot and 20 in the last, which helps a lot!
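To make the “Alpha Digit Alpha” format concrete, here is a minimal sketch in Python (our choice for illustration only; it is not part of the VoiceXML application). It checks a recognized three-character string against that shape. The two letter sets and the function name are assumptions made for this example, so swap in whatever letters your own project allows.

```python
import re

# Assumed letter sets, for illustration only: adjust them to your project.
FIRST_LETTERS = "ABCEGHJKLMNPRSTVXY"      # letters allowed in the first slot
THIRD_LETTERS = "ABCEGHJKLMNPRSTVWXYZ"    # letters allowed in the third slot

_PATTERN = re.compile(rf"[{FIRST_LETTERS}][0-9][{THIRD_LETTERS}]")


def is_valid_code(code: str) -> bool:
    """Return True if code is 'Alpha Digit Alpha' using only the allowed letters."""
    return _PATTERN.fullmatch(code.strip().upper()) is not None
```

A check like this is handy as a last line of defence after recognition: anything that fails it can be sent straight to a re-prompt.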

We won’t dig into the specifics of a mixed-initiative form dialog, as we have already done so in our mixed-initiative tutorial, but the gist is that this feature of VoiceXML allows us to fill multiple fields with a single utterance. Breaking up each alpha and numeric character into its own recognition field greatly cuts down on the ambiguity problems that can occur.

For the purposes of brevity, what we have below is a stripped-down version of our fully fleshed-out grammar, but you may download the full grammar, and the mixed-initiative dialog right here, which contains lots more inline notations.

<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US"
         version="1.0" root="canadianZip" tag-format="semantics/1.0">

  <rule id="canadianZip" scope="public">
    <one-of>

      <!-- ALL THREE FIELDS FILLED -->
      <item>
        <item>
          <ruleref uri="#alphaRule1"/>
          <tag>out.alphaSlot1=rules.alphaRule1.alphaSlot1;</tag>
        </item>
        <item>
          <ruleref uri="#numRule"/>
          <tag>out.numSlot=rules.numRule.numSlot;</tag>
        </item>
        <item>
          <ruleref uri="#alphaRule2"/>
          <tag>out.alphaSlot2=rules.alphaRule2.alphaSlot2;</tag>
        </item>
      </item>

      <!-- ONLY TWO FIELDS FILLED -->
      <item>
        <item>
          <ruleref uri="#alphaRule1"/>
          <tag>out.alphaSlot1=rules.alphaRule1.alphaSlot1;</tag>
        </item>
        <item>
          <ruleref uri="#numRule"/>
          <tag>out.numSlot=rules.numRule.numSlot;</tag>
        </item>
      </item>
      <item>
        <item>
          <ruleref uri="#numRule"/>
          <tag>out.numSlot=rules.numRule.numSlot;</tag>
        </item>
        <item>
          <ruleref uri="#alphaRule2"/>
          <tag>out.alphaSlot2=rules.alphaRule2.alphaSlot2;</tag>
        </item>
      </item>
      <item>
        <item>
          <ruleref uri="#alphaRule1"/>
          <tag>out.alphaSlot1=rules.alphaRule1.alphaSlot1;</tag>
        </item>
        <item>
          <ruleref uri="#alphaRule2"/>
          <tag>out.alphaSlot2=rules.alphaRule2.alphaSlot2;</tag>
        </item>
      </item>

      <!-- ONLY ONE FIELD FILLED -->
      <!-- A lone letter is assumed to be the first character -->
      <item>
        <ruleref uri="#alphaRule1"/>
        <tag>out.alphaSlot1=rules.alphaRule1.alphaSlot1;</tag>
      </item>
      <item>
        <ruleref uri="#numRule"/>
        <tag>out.numSlot=rules.numRule.numSlot;</tag>
      </item>

    </one-of>
  </rule>

  <rule id="alphaRule1" scope="public">
    <one-of>
      <item weight="1.0">
        <!-- Like-sounding utterances that all return "X" -->
        <one-of>
          <item> ex </item>
          <item> ax </item>
          <item> x </item>
        </one-of>
        <tag>out.alphaSlot1="X";</tag>
      </item>
    </one-of>
  </rule>

  <rule id="numRule" scope="public">
    <one-of>
      <item> one <tag>out.numSlot="1";</tag> </item>
    </one-of>
  </rule>

  <rule id="alphaRule2" scope="public">
    <one-of>
      <item weight="1.0">
        <one-of>
          <item> ay </item>
        </one-of>
        <tag>out.alphaSlot2="A";</tag>
      </item>
    </one-of>
  </rule>

</grammar>

In brief, our top-level rule assumes that we can have any of the following entries:

"X1A"

"X"

"X1"

"XA"

"1"

"1A"

And in the event that we get one or two characters matched in our utterance, the VoiceXML mixed-initiative logic will then take over, and prompt the caller to fill in any “blanks” remaining.
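The “fill in the blanks” behaviour can be pictured with a small Python sketch. This is only an illustration of the bookkeeping, not how a VoiceXML interpreter is implemented: the slot names come from the grammar, while the function names and the dictionary shape are assumptions made for this example.

```python
# Slot names as returned by the grammar, in spoken order.
SLOT_ORDER = ("alphaSlot1", "numSlot", "alphaSlot2")


def missing_slots(interpretation: dict) -> list:
    """Return the slots the dialog still has to prompt for, in order."""
    return [slot for slot in SLOT_ORDER if not interpretation.get(slot)]


def assemble(interpretation: dict):
    """Join the three slots into one string, or return None if any is missing."""
    if missing_slots(interpretation):
        return None
    return "".join(interpretation[slot] for slot in SLOT_ORDER)
```

For a caller who only got “X1” recognized, `missing_slots({"alphaSlot1": "X", "numSlot": "1"})` leaves just `alphaSlot2` to collect, which is the one follow-up prompt the dialog needs to play.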

A few things of note about the grammar defined above: in the event that we receive only a single alpha utterance, we assume that it is the first character, not the last. Additionally, when we construct a grammar that contains multiple slot returns, we must explicitly define the slot values all the way up the chain: if we didn’t define “out.[slotname]=rules.[rulename].[subslot]” within the top-level rule, the last slot value would overwrite all others, meaning that we would only get a value for “alphaSlot2” within the VoiceXML dialog. To illustrate, the snippet below, used as a top-level return, would cause exactly this problem:

<item>
  <ruleref uri="#alphaRule1"/>
  <ruleref uri="#numRule"/>
  <ruleref uri="#alphaRule2"/>
</item>

You’ll also see that each possibility for character recognition is specified within the top-level rule, so in the event that we get 1, 2 or 3 character strings, we can pipe the return value back to the VoiceXML, and let the mixed-initiative dialogs then access the sub-rules (alphaRule1/2 and numRule), individually as needed.

We also illustrated in brief how one can define multiple like-sounding utterance values that return the same interpretation value, and defined a set of alternates (“ex”, “ax”, “x”) for our alphaRule1 entry simply to show how this can be done. The task of taking this framework and turning it into a grammar that satisfies any given project rests in the hands of you, the capable IVR developer.

=^)

Till next time,

Matthew Henry
Director of Customer Support
Voxeo Corporation


What is a “State Machine”? And how does it apply to CCXML (and James Bond)?

May 12th, 2008 by Dan York

When people come to CCXML from other languages, one concept that is sometimes difficult to understand is the whole notion of a “state machine”. Once you are comfortable with that idea, CCXML becomes rather easy to work with. While the Wikipedia page on state machines gets quite complex, let’s reduce the concept to some basics.

First, some vocabulary. In a “state machine”, there are a series of “states”, such as being “on” or “off”. There are then “events” that cause a “transition” between states.

Now to illustrate this, let’s take the typical action hero film (such as any one of the James Bond movies) and describe it as a series of “states”:

  1. Hero is relaxing on a beach with a drink in some exotic locale.

  2. Hero is preparing for mission (getting briefing, gadgets, etc.)

  3. Hero is hunting for evil villain.

  4. Hero is fighting evil villain and his minions.

  5. Hero is relaxing on a beach in some other exotic locale in the company of rescued beautiful woman.

There are obviously other “states” that occur in an action movie, many of which involve beds, but in an effort to: a) keep this blog “safe for work”; and b) keep our example simple, we’ll reduce it to this list. Graphically, we could depict this as something like:

[Figure: state machine diagram of the action hero film]

The hero can remain in any one of these “states” indefinitely (and in some films it seems like the hero does!) until there is some event that triggers a “transition” between the states. Let’s look at our list again and add in some events:

  1. Hero is relaxing on a beach with a drink in some exotic locale.

    • Receives visit from courier who says his assistance is needed immediately. (Alternatively and more exciting for a movie, a team of assassins attempts to kill him.)
  2. Hero is preparing for mission (getting briefing, gadgets, etc.)

    • Receives final briefing, heads to airport, etc.
  3. Hero is hunting for evil villain.

    • Finds evil villain (or is found by evil villain).
  4. Hero is fighting evil villain and his minions.

    • Blows up villain and his lair, saves world, rescues beautiful woman.
  5. Hero is relaxing on a beach in some other exotic locale in the company of rescued beautiful woman.

At a 10,000 foot level, you can describe most action movies in this pattern: a series of states where events trigger transitions between those states.

So what does this have to do with CCXML, eh?

Well, you could take the basic successful inbound call to a CCXML application and describe it in the following states and events:

  1. CCXML application is waiting for connections.

    • Call is received.
  2. Application enters “Alerting” state to decide what to do with the call.

    • Application accepts call.
  3. Application is connected to incoming call and performs actions such as playing dialogs, accepting input, etc.

    • Application finishes - or caller hangs up.
  4. Application is disconnected from call and performs any final actions.

    • Application finishes post-call activity and exits.
  5. Application returns to waiting for connections.

Graphically, we could illustrate it like this:

[Figure: simplified state machine diagram of an inbound CCXML call]

Conceptually, this is what it looks like. I should note, though, that the “waiting” state I’ve shown here is not typically a part of the actual CCXML application but rather is part of the application platform on which your CCXML application is housed. For instance, the “platform” could be our Evolution hosted platform or a copy of our Prophecy premise platform running on your network. When a call is received by either Evolution or Prophecy, your CCXML application is loaded and (in this example) the “connection.alerting” event is sent, which triggers the transition into the “Alerting” state.

You can think of this in a similar fashion to a web server. Your Apache (or other) web server is sitting there waiting for connections. When it receives a connection, it loads the appropriate page which may contain an application which is then executed. CCXML works in a similar fashion (and yes, there are exceptions… remember, I’m trying to keep this tutorial simple!).

In any event, let’s see what this looks like in CCXML code:

<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0">
  <eventprocessor>
    <transition event="connection.alerting">
      <log expr="'*** Incoming call from Caller ID: ' + event$.connection.remote"/>
      <accept/>
    </transition>
    <transition event="connection.connected">
      <log expr="'*** Call was accepted ***'"/>
      <disconnect/>
    </transition>
    <transition event="connection.disconnected">
      <log expr="'*** Call was disconnected ***'"/>
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>

That’s it. The <transition> tag indicates a new state and the “event” attribute indicates the event that will trigger the transition into this new state. So when the event “connection.alerting” is received by this application the code in the first <transition> is executed. At the end of that block you can see the <accept/> command which is the action that causes a new event to occur. Likewise, <disconnect/> and <exit/> in later states signal that a new event has occurred and a transition needs to occur.

Your task, then, is to write the actions that occur in each state, since the code above really does nothing except accept and then hang up a call (and generate log entries). During the “alerting” state, for instance, maybe there are some phone numbers from which you do not want to accept calls. You may have some conditional logic there that rejects calls from some numbers and then accepts calls from all others. The “connected” state, obviously, is where the meat of your application goes. What are you going to do with the caller?

With this framework in mind, you can now dive into the “Learning CCXML” section of our CCXML documentation and see the examples there that flesh out the very simple outline I’ve given here.

I should note, of course, that my simple example doesn’t come close to illustrating all the “states” in CCXML. What happens if a call fails in some way other than just a disconnect? What if there are errors in your application? How about outbound calls where the “alerting” concept doesn’t make sense? We go into the different states in our documentation, and the actual CCXML specification from the W3C also has this nice diagram:

[Figure: connection state diagram from the W3C CCXML specification]

(Hint: For an outbound call, the initial state equivalent to “alerting” is “progressing”.) This, too, does not show all the states, but does provide a richer view of the flow of a typical CCXML application. As you’ll see in the documentation, there’s a lot more you can do with states in CCXML. You can create your own events and use the <send> command to trigger a transition to a new state. There are a range of pre-defined states as well, which both our documentation and the W3C CCXML specification describe in more detail.

Again, it is all about the CCXML application describing a series of “states” and what actions occur during that state. Events trigger transitions between states.
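If it helps to see the idea outside of XML, here is a minimal sketch of the same simplified call flow as a transition table in Python. It is an illustration of the concept only, not how a CCXML platform is built. The state names follow the list above, and the “exit” event name is an assumption invented for this example (the connection.* event names are the ones used in the CCXML sample).

```python
# (current state, event) -> next state, mirroring the simplified inbound call flow.
TRANSITIONS = {
    ("waiting", "connection.alerting"): "alerting",
    ("alerting", "connection.connected"): "connected",
    ("connected", "connection.disconnected"): "disconnected",
    ("disconnected", "exit"): "waiting",  # "exit" is a made-up event name
}


def next_state(state: str, event: str) -> str:
    """Return the new state, or stay put if the event means nothing in this state."""
    return TRANSITIONS.get((state, event), state)


def run(events, state: str = "waiting") -> str:
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state
```

Note that an event which has no entry for the current state simply leaves the machine where it is, just as our hero stays on the beach until the courier shows up.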

Got it? Ready to start actually building applications? If so just head over to www.voxeo.com/free and sign up for either a free developer account on our Evolution hosted platform or download our free Prophecy premise platform to run on your own server. Figure out your states and away you go…

P.S. If you are really intrigued by all the theory around state machines, you might want to check out the “Semantics” section of another W3C draft language called SCXML (”State Chart XML”) which dives into the theory around Harel State Tables and much more. The aim of the (draft) SCXML effort is to create a more generic state machine language which is not tied to telephony as CCXML is. If all you want to do is write voice applications, feel free to skip these links entirely! :-)


Accessing Web Services From VoiceXML

May 8th, 2008 by Mark Headd

This is a guest post from Mark Headd, a voice application developer who was one of the first 10,000 users of our platform, and was originally published on his Vox Populi blog on May 6, 2008.


A few weeks ago, I posted about accessing web services from CCXML using PHP. This post will demonstrate how to do the same thing, only from VoiceXML. We’ll be using Voxeo Prophecy and PHP for this example. We’ll also be referring to the GreenPhone project — available free for download — for the sample code.

Before we dive in, it’s important to keep in mind that there are a number of different techniques for getting information from web services into a VoiceXML dialog. This is just one method; there are many others. Voxeo even has its own platform-specific way of accessing SOAP web services via JavaScript. Ultimately, the method you employ needs to be a good fit for the environment you’re working in and the requirements of your project.

Using the greenSoapClient Class

In the last post on this topic, I demonstrated how to use a simple PHP class as a way to access multiple SOAP-based web services from CCXML. This class forms the basis of our method for accessing web services from VoiceXML as well. However, in this instance, instead of using the CCXML <send/> element, we’ll use a VoiceXML subdialog.

Subdialogs in VoiceXML are typically used to create reusable dialog components for capturing common types of input, like a series of digits (e.g., credit card numbers, account numbers, etc). They can also be used to compartmentalize complex interactions with a caller and provide a simple interface for accessing results. By way of example, this is how the OSDMs from Nuance work, as well as the Targus service from Voxeo. We’ll borrow this approach to access a web service from StrikeIron that will send the details of an E85 or bio-diesel station to a cell phone via SMS.

Setting up our Subdialog

In order to send an SMS message with details on an E85 or bio-diesel station, we’ll need two things: the station details, and a cell phone number to send them to.

In order to send the details on a station from VoiceXML to PHP, we’ll pack it up in a pipe-delimited string called “detailsToSend” (I won’t go into too much detail about how this is done in this post — to learn more, refer to the GreenPhone Project code). The cell phone number we are sending to is obtained from the caller ID of the calling party, stored in a variable named “ani”. Details on how to access caller ID are given in a previous post.
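For readers who haven’t worked with delimited strings before, here is a rough Python sketch of the packing and unpacking idea. The field names and their order are hypothetical and chosen only for this example; the actual layout used by the GreenPhone Project is not shown in this post.

```python
# Hypothetical field order, for illustration only.
FIELDS = ("name", "street", "city", "state")


def pack_details(station: dict) -> str:
    """Join the station fields into one pipe-delimited string."""
    # Replace any stray pipes in the data so they cannot be mistaken for delimiters.
    return "|".join(str(station[field]).replace("|", " ") for field in FIELDS)


def unpack_details(packed: str) -> dict:
    """Split a pipe-delimited string back into named fields."""
    return dict(zip(FIELDS, packed.split("|")))
```

The receiving script only needs to agree on the field order to rebuild the record, which is what makes a single string variable convenient to pass through a subdialog call.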

Our subdialog call will look like this:

<form id="sendDetails">
  <catch event="error.badfetch">
    <prompt>
      There was a problem sending the station details to your phone.
      <break strength="weak"/>
    </prompt>
    <goto next="#goodbye"/>
  </catch>

  <subdialog name="sendSMS" src="../php/sendStationDetails.php" namelist="ani detailsToSend">
    <prompt>
      Sending the station details to
      <say-as interpret-as="telephone"><value expr="ani"/></say-as>
    </prompt>
    <filled>
      <if cond="sendSMS.result==0">
        <prompt>Your message has been sent.<break strength="weak"/></prompt>
      <else/>
        <prompt>
          There was a problem sending the station details to your phone.
          <break strength="weak"/>
        </prompt>
      </if>
      <goto next="#goodbye"/>
    </filled>
  </subdialog>
</form>

We use the attributes on the <subdialog> element to give our subdialog a name (which we’ll use to access the results sent back from PHP), to specify where to send our variables, and to specify which variables to send. (With no method attribute, VoiceXML submits them with an HTTP GET.)

You’ll also notice that we have set up a handler here for an “error.badfetch” event. This is a good habit to get into whenever you set up a request to an external resource (like a PHP script). If the script isn’t there or has problems, an “error.badfetch” event will get returned and unless you specified a handler for this event, your day will not end well.

Additionally, we’ve set up logic in our filled block to inspect the result of the subdialog call. We access the result as a property of the subdialog, using the name we set up in the <subdialog> element and the dot notation (”.”) familiar to JavaScript.

<if cond="sendSMS.result==0">

… code logic goes here …

</if>

With this in mind, our PHP script needs to send back a variable called “result”. How do we do this? Let’s take a look at the PHP script:

A Simple Subdialog using PHP

The subdialog that we want to render is extremely simple — we only need to render enough VoiceXML to declare a variable called “result” and return it to the parent dialog. We’ll do this after we make our web service call to send the SMS message.

There are two pieces of information returned from the StrikeIron web service that we are interested in: a string that holds the response message from the service (i.e., “success”, “failure”, etc.) and a number indicating the outcome of the web service call.

We’ll take these two bits of information and assign them to PHP variables:

$result = $xml->soapHeader->ResponseInfo->ResponseCode;
$message = $xml->soapHeader->ResponseInfo->Response;

Now, we want to write out these variables in a simple VoiceXML subdialog:

<?xml version="1.0" encoding="utf-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="F_1">
    <block>
      <!-- <log> is executable content, so it belongs inside the <block> -->
      <log>*** SMS response message was: <?php echo $message; ?>. ***</log>
      <var name="result" expr="<?php echo $result ?>"/>
      <return namelist="result"/>
    </block>
  </form>
</vxml>

As discussed above, this creates just enough VoiceXML to instantiate a variable and return it to the parent dialog. For good measure, we’ll write out the web service string (contained in the PHP variable $message) as a log statement, in case it contains information we want to look at later.

Why This Approach?

Using this technique for accessing web services from VoiceXML provides a couple of advantages. First, it allows us to completely separate the presentation layer (the VoiceXML) from the logic used to invoke the web service. This is a fairly standard design practice that makes creating the dialog much easier for a developer that does not necessarily know a whole lot about web services. With this approach, they don’t really need to — they only need to know that the subdialog call will return a variable called “result” whose value can be inspected to determine what to do next.

Additionally, because the parent dialog is just static VoiceXML, it can be cached for fast access, while the subdialog, which must be dynamic, is the only component sent from the web server to the VoiceXML platform each time a caller accesses the application. Careful design can yield additional caching opportunities that can make your applications more efficient and less bandwidth intensive.

In the next post, we’ll explore one additional method for accessing web services from VoiceXML. Stay tuned…


CCXML and SIP, Part 1: Accessing SIP headers

May 6th, 2008 by Dan York

If you use SIP to connect to your voice application, one of the very nice things about CCXML is that you have full access to the underlying SIP headers that were sent as part of the SIP connection. With access to the SIP headers, you can then record information or make decisions in your code based on the contents of those headers.

First, let’s take a look at a typical SIP INVITE message that begins a call between two parties:

INVITE sip:1234@company2.com SIP/2.0
Via: SIP/2.0/UDP proxy.company2.com:5060;branch=z9hG5bK21ghi7ab34
Via: SIP/2.0/UDP sip.company1.com:5060;branch=z9hG4bKnashds7
To: sip:1234@company2.com
From: sip:dan@company1.com;tag=451248
Call-ID: 324817637683475998ababcc10
CSeq: 1 INVITE
Contact: sip:dan@company1.com
Max-Forwards: 50
P-Asserted-Identity: "Dan York" <sip:dan@company2.com>

In CCXML, any and all of those headers are available to you using the following syntax:

event$.connection.protocol.sip.headers['To']

You can use this information in a conditional statement, in a variable, or in a log statement such as this:

<log expr="'*** The SIP To header is ' + event$.connection.protocol.sip.headers['To']"/>

Note that for the headers whose names do not include a dash in them, there is also a shorter style:

<log expr="'*** The SIP To header is ' + event$.connection.protocol.sip.headers.to"/>

If the header names do include a dash in them, then they do need to be enclosed in brackets and single quotes. Here are some more examples of accessing the SIP headers from a connection object:

event$.connection.protocol.sip.headers['To']
event$.connection.protocol.sip.headers['P-Asserted-Identity']
event$.connection.protocol.sip.headers.from
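As a rough mental model of what that headers object holds, here is a Python sketch that turns a raw INVITE like the one above into a name-to-value lookup. This is only an illustration and makes no claim about how the platform parses SIP: the function name is hypothetical, names are lower-cased as a simplification, and where a header repeats (such as Via) only the first value is kept.

```python
SAMPLE_INVITE = """INVITE sip:1234@company2.com SIP/2.0
Via: SIP/2.0/UDP proxy.company2.com:5060;branch=z9hG5bK21ghi7ab34
Via: SIP/2.0/UDP sip.company1.com:5060;branch=z9hG4bKnashds7
To: sip:1234@company2.com
From: sip:dan@company1.com;tag=451248
P-Asserted-Identity: "Dan York" <sip:dan@company2.com>
"""


def parse_sip_headers(raw: str) -> dict:
    """Map lower-cased header names to values, keeping the first of any repeats."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:  # skip the request line
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")  # split on the first colon only
        headers.setdefault(name.strip().lower(), value.strip())
    return headers
```

Looking up `parse_sip_headers(SAMPLE_INVITE)["p-asserted-identity"]` is the Python counterpart of the bracket-and-quote style shown above for dashed header names.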

Let’s take a look at where you might see this code appear (the <log> lines in the first transition) within an (admittedly VERY basic and not very useful) CCXML file:

<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0">
  <eventprocessor>
    <transition event="connection.alerting">
      <log expr="'*** The called ID is ' + event$.connection.local"/>
      <log expr="'*** The caller ID is ' + event$.connection.remote"/>
      <log expr="'*** The SIP From header is ' + event$.connection.protocol.sip.headers.from"/>
      <accept/>
    </transition>
    <transition event="connection.connected">
      <log expr="'*** Call was accepted ***'"/>
      <disconnect/>
    </transition>
    <transition event="connection.disconnected">
      <log expr="'*** Call was disconnected ***'"/>
      <exit/>
    </transition>
    <transition event="connection.failed">
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>

Now here all we did was access the SIP header and then log one piece of information. Next time we’ll take a look at a more involved example where we use the SIP headers to change the actions inside the CCXML application.


If you would like to try out this code in a working environment head on over to www.voxeo.com/free and either join our (free) hosted development platform or download our (free) Prophecy software.


Certified Tech Tip: Alpha-Numeric voice recognition grammars - Part One

May 5th, 2008 by matt


Quite often, the question of how a developer should construct alphabetical “spell-out” grammars, or how one can best create an alpha-numeric recognition grammar, is posed to the support team at Voxeo. Many a posting to our VoiceXML developer forums has touched on this subject, but until now we haven’t really explained in precise detail why this is such a challenge.

“Alphabetical recognition is a challenge?”, you ask? You bet it is, if you want to get any semblance of accurate recognition results. And when we throw alpha characters, and maybe some numeric characters, into the same utterance string, then we are really looking at a difficult grammar to get tuned to a point where it is usable.

So what’s the big deal, anyhow?
The inherent problems with spelled input recognition are best illustrated by a simple anecdote:

Imagine that you are at a restaurant on a busy Friday evening, waiting for your table. While in the lobby, there are people chatting, children cavorting about, and harried workers trying to seat the flood of diners. At the same time, your friends who are joining you for dinner call to say that they are lost, and ask for directions to the restaurant. Amidst all the background chatter, glasses clinking and the rest of the noisy distractions, how many times do you have to repeat “From I-95, get off on exit 76B, and then take a left at Montana street” before your buddy is able to accurately understand what you are saying? In this worst case scenario, at best you may have to repeat yourself only once. Even if the restaurant was dead empty, and as silent as a tomb, your pal mishearing “exit 76B” as “exit 763” or something similar is not only quite plausible, but highly likely.

The root of the problem with alpha grammars, and even more so with alpha-numeric grammars, is the staggering chance of confusion between like-sounding matches: “B” sounds like “C”, sounds like “Z”, sounds like “E”, sounds like “three”, and “M” sounds like “N” sounds like “ten”……you get the picture. And this is for a *single character match* only. To further illustrate the challenges that we face, consider the fact that a 1-character alphabet grammar has only 26 possible results, but a 7-character grammar would have over eight BILLION possibilities. As you can imagine, the number of possible results for an alphagram of arbitrary length is simply staggering.
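The arithmetic behind that figure is simple enough to check in a couple of lines of Python (used here purely as a calculator; the function name is our own): a string of n characters drawn from an alphabet of k symbols has k to the power n possible values.

```python
def spelled_string_count(length: int, alphabet_size: int = 26) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length


print(spelled_string_count(1))      # 26
print(spelled_string_count(7))      # 8031810176, i.e. just over eight billion
print(spelled_string_count(7, 36))  # 78364164096 once the ten digits are allowed too
```

Every character added multiplies the search space again, which is why fixing the length and the format of the input pays off so handsomely.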

Suggestions for alphabetical voice reco: Alternative Options
Firstly, constructing a user-defined alphabet grammar is something that we don’t really recommend attempting for “spell anything” applications, as the plain, unvarnished truth is that today’s voice recognition technology is simply not up to the task. To be certain, ASR technology has improved dramatically over the past few years, but not so much as to allow us to spell, or say, just any old utterance and expect accurate match results. In a lot of cases, a Statistical Language Model grammar will do the job, assuming that you expect your callers to provide certain types of input, such as a first name, a city name, or a state name.

While this isn’t the time or place to cover SLM grammars in depth, a brief summary should explain the strengths of these pre-compiled, pre-tuned grammars. SLM grammars in the context of spelling are essentially designed to fill in the blanks when we have partial input, using predetermined logic that is tailored to the input context/category. For instance, assume that we have an SLM firstname grammar active (note that these are available when using the Prophecy + Nuance platform on the evolution.voxeo.com portal), and our spelled utterance from the caller reads like what we have below, where unrecognizable utterance fragments are represented by a question mark:

“C O R ? E L I ? S”

Using the pre-tuned logic that is part of the SLM grammar, the ASR will determine that there are no firstname matches that read as “Coraelias”, or “Corbelibs”, etc: It will make the decision that the only first name that matches this pattern where some fragments of the utterance are missing would be “Cornelius”: This is the gist of how SLM grammars work, and if your project allows you to use somewhat narrower categories for any utterance you want to recognize, then using a predefined SLM grammar, or even crafting your own SLM grammar is a better way to go than trying to make a flat-file alphabetical SRGS file.
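To picture the “fill in the blanks” idea, here is a deliberately simplified Python sketch. Please note that this is NOT how an SLM grammar works internally: a real SLM is statistical and pre-compiled, whereas this toy just matches a spelled pattern with unknown letters against a tiny name list. The list of names and the function name are assumptions made up for the example.

```python
import re

# Toy vocabulary, for illustration only.
FIRST_NAMES = ["Cornelius", "Cordelia", "Cornelia", "Constance"]


def candidates(spelled: str, names=FIRST_NAMES) -> list:
    """Return the names that fit a spelled pattern, where '?' is an unheard letter."""
    letters = spelled.split()
    pattern = re.compile(
        "".join("." if ch == "?" else re.escape(ch) for ch in letters),
        re.IGNORECASE,
    )
    return [name for name in names if pattern.fullmatch(name)]
```

With this toy list, `candidates("C O R ? E L I ? S")` leaves only “Cornelius”, because the narrow category rules out everything else. That narrowing is the point the SLM approach exploits.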

One of the common tasks for alphabetical grammars seems to be the capture of names, or street addresses, and if this is the case, there is a very accurate add-on service that can handle this task rather nicely. The TargusInfo feature allows developers to access one, or both of these two services:

* Name & Address lookups based on Caller ID
* Pre-tuned name & full address grammars

These services are remarkable in terms of Caller ID-to-Address accuracy, and the name/address grammars are top-notch, and quite acceptable for full scale, enterprise deployments as well. The only caveats are that this service is limited to the United States, and that there is a per-transaction fee to use it in a production capacity. However, we can honor developer requests to test drive this service by allowing 30 or so hits at no charge. Developers interested in this service can log in to their evolution.voxeo.com accounts and create an account ticket requesting access to see just how good it is. And trust me on this one: You’ll be mightily impressed, and more importantly, so will your callers.

If you gotta do it…
In the event that neither the SLM grammar option nor the TargusInfo option fits the bill for your IVR project, then you may well be forced to try and craft a flat-file Alpha Grammar using W3C-compliant SRGS/SISR syntaxes. If you do fall into this category, we can give you some advice on doing so, with the full disclaimer that Results May Vary, and that 100% recognition accuracy using this methodology is Science Fiction, at least for the time being.

* Start small by testing one-character strings so that you can tune and tweak utterance values in the grammar.

* Track user utterance, and confidence scores via “lastresult$” shadow variables for post testing analysis, and as a basis for what needs to be tuned.

* Leverage the VXML 2.1 utterance recording via the “recordutterance” setting, and save off all user recording data for post-call analysis.

* Flesh out utterance values by phonetically sounding them out: For instance, “a” could be represented by:

a
ay
eh

* Try to get as broad a user base as possible for testing, else you run the risk of tuning your grammar to a small subset of user speech patterns. If you have but a single grammar tester who happens to have a Deep South accent, then the tuned grammar will likely not be much good to callers in New York, or our friends in the UK.

* After each round of changes that you apply to your grammars, test them thoroughly, analyze the results, and then test them again. Then test once more just to be sure of your results.

* Careful use of grammar weighting can really save the day for like-sounding characters. The chance of a user uttering “E” is much higher than that of “Z”, but be very careful when applying weights: it is possible to go overboard and weight your grammar too hard in favor of one particular letter, which will then skew your recognition results and accuracy.

* Consider using n-best post-processing when overall recognition confidence scores are below a certain threshold: It’s much better to take the extra step to get confirmed accuracy than to assume wrongly.

* For utterance strings that are static in length, implementing a mixed initiative dialog can be an excellent tactic to cut down on the disambiguation factor that skyrockets when the string length grows in size. This can be a tricky project to get right, but it is one that is well worth the effort in development.
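To make the n-best suggestion a little more concrete, here is a minimal ECMAScript sketch of the decision logic. It assumes the recognizer has been asked for more than one hypothesis (the VoiceXML “maxnbest” property) and that the array passed in has the same shape as “application.lastresult$”: ordered best-first, each entry carrying an utterance and a confidence between 0.0 and 1.0. The function name, the threshold value and the returned object are purely illustrative, not part of any platform API.

```javascript
// Illustrative threshold only; tune it against your own test-call data.
var CONFIRM_THRESHOLD = 0.6;

// nbest mimics application.lastresult$: best hypothesis first,
// each entry shaped like { utterance: 'b', confidence: 0.45 }.
function chooseNextStep(nbest, threshold) {
  if (!nbest || nbest.length === 0) {
    return { action: 'reprompt', candidates: [] };
  }
  var best = nbest[0];
  if (best.confidence >= threshold) {
    // Confident enough: take the top hypothesis as-is.
    return { action: 'accept', candidates: [best.utterance] };
  }
  // Low confidence: collect the distinct hypotheses, in order,
  // so the dialog can read them back for confirmation.
  var candidates = [];
  for (var i = 0; i < nbest.length; i++) {
    if (candidates.indexOf(nbest[i].utterance) === -1) {
      candidates.push(nbest[i].utterance);
    }
  }
  return { action: 'confirm', candidates: candidates };
}
```

A dialog could call something like chooseNextStep(application.lastresult$, CONFIRM_THRESHOLD) in a filled block and branch on the returned action; logging the same array on every call also gives you the raw material for the post-test analysis suggested above.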

Next TechTip: In the next certified tech tip from the Voxeo support team, we will illustrate our last suggestion in detail. That’s right, we will take on the task of posting and dissecting a mixed initiative dialog, and the associated alphanumeric grammar that could accept Canadian postal code input. As we stated before in no-nonsense terms, this is possibly one of the hardest, if not *the* hardest, things that a developer can attempt to do reliably, but as you are well aware, the Voxeo team is quite fearless, and doesn’t respect the concept of “impossible”.

Till next time,

Matthew Henry
Director of Customer Support
Voxeo Corporation

Useful Links
Statistical Language Model Grammars
Nuance Grammar Developers Guide
Mixed-Initiative dialog tutorial
SRGS Grammar Specification: Grammar weighting
SISR Grammar Specification
VXML 2.0 specification: The LastResult array
VXML 2.1 specification: Utterance Recording

JavaScript Trick for Voice Applications

April 28th, 2008 by Mark Headd

This is a guest post from Mark Headd, a voice application developer who was one of the first 10,000 users of our platform, and was originally published on his Vox Populi blog on April 25, 2008.


There are times when it is desirable to change the behavior of a VoiceXML application based on a specific setting. For example, the GreenPhone application that I have mentioned in several previous posts has a setting that can be used to control whether special audio files are played. I personally find these audio files funny and somewhat endearing; others may not. To control whether they are played, there is a variable in the application root document called (cleverly) playAudio.

<var name="playAudio" expr="true"/>

Its default setting is true, and this can be changed to false to prevent these files from playing. The typical method for checking a variable like this one to determine if an audio file should be played looks something like this:

<if cond="playAudio">
  <audio src="myFile.wav"/>
</if>

There isn’t anything wrong with this, and since there isn’t a “cond” attribute on the <audio/> tag there aren’t very many good alternatives. There is one alternative method that I rather fancy that uses the JavaScript conditional operator to distill this to a single line of code:

<audio expr="playAudio ? 'myRealAudioFile.wav' : 'myFakeAudioFile.wav'"/>

This shortcut allows us to assign a value to the audio file reference via the “expr” attribute, instead of using an explicit URI to the location of an audio file. The way the operator behaves is to first evaluate the condition on the far left side — if it evaluates to true then the first expression is assigned as the URI of the audio file. If it evaluates to false, then the second expression is used.

The trick here is that the second expression resolves to a bogus audio file that doesn’t exist. This will not cause a fatal error in your application; it will simply cause Prophecy not to play an audio file (it can’t, because the file doesn’t exist).
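If you want to see the operator on its own, outside of any VoiceXML, here is a tiny sketch you can run in any JavaScript console. The function name is made up for illustration; the file names are the ones used above.

```javascript
// Hypothetical helper wrapping the same expression used in the <audio> tag.
function audioFor(playAudio) {
  return playAudio ? 'myRealAudioFile.wav' : 'myFakeAudioFile.wav';
}

console.log(audioFor(true));   // condition is true, first expression is used
console.log(audioFor(false));  // condition is false, second expression is used

// Careful: the STRING 'false' is truthy in JavaScript, so this also picks
// the real file. Keep playAudio a boolean, as in expr="true" above.
console.log(audioFor('false'));
```

That last line is the one gotcha worth remembering: the condition is evaluated for truthiness, so a setting stored as a quoted string will not behave the way a boolean does.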

The JavaScript conditional operator can come in very handy in CCXML as well. For example, there are times in CCXML where I want to use <dialogterminate/> to end a call, but I may not be certain which dialog a caller is in — the JavaScript conditional operator can come in handy here:

<dialogterminate dialogid="loggedIn ? voiceMailDialog : loginDialog"/>

Since the “dialogid” attribute is an expression, we can use the JavaScript conditional operator to check and see if a caller has logged into a voice mail system to retrieve their voicemail. If their loggedIn status is true, we assume that they are in the voiceMailDialog and yank them from that. Otherwise, we assume they are in the first dialog and yank from there.
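One detail that is easy to miss: unlike the audio example, the two branches here are not quoted, so they are variables that must already hold dialog IDs. A rough sketch of what the expression evaluates to, with made-up ID values standing in for whatever the platform actually assigns when each dialog starts:

```javascript
// Assumed values for illustration; in a real session these would be
// captured from the events fired when each dialog was started.
var loginDialog = 'dialog-001';
var voiceMailDialog = 'dialog-002';

// Same expression as the dialogid attribute above, wrapped in a function.
function dialogToTerminate(loggedIn) {
  return loggedIn ? voiceMailDialog : loginDialog;
}

console.log(dialogToTerminate(true));   // the voice mail dialog's ID
console.log(dialogToTerminate(false));  // the login dialog's ID
```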

There are surely other ways to do these things, but in my humble opinion the JavaScript conditional operator deserves some attention as a powerful shortcut for doing things in CCXML or VoiceXML using the Voxeo Prophecy platform.


Accessing Web Services From CCXML

April 28th, 2008 by Mark Headd

This is a guest post from Mark Headd, a voice application developer who was one of the first 10,000 users of our platform, and was originally published on his Vox Populi blog on April 21, 2008.


This is the first in a series of posts that will highlight how to accomplish specific things using the Voxeo Prophecy platform. All of the examples that will be discussed draw directly from the GreenPhone project discussed in a previous post.

The first issue that will be discussed – accessing web services from CCXML using PHP.

One of the very cool things about Prophecy is that it comes bundled with the PHP scripting language. In fact, I have on occasion referred to PHP as Prophecy’s “embedded scripting language.” PHP 5 comes with an abundance of features that will be of interest to IVR developers – chief among them, the ability to create SOAP clients to interact with web services, and the ability to easily work with data in XML format using the SimpleXML extension.

If you’ve read my previous post on the GreenPhone Project, you will know that I am using a collection of web services from StrikeIron that includes a web service to provide information on U.S. area codes. If we pass this web service an area code, it will return the U.S. state that area code is in. Ultimately, what the GreenPhone application will do is look up E85 and bio-diesel stations by state. So when a person calls the application, we want to use their area code to look up what state they are in – thereby saving them the trouble of entering this information manually.

In CCXML, we can access the caller’s ANI via the Connection Object:

<transition state="initial" event="connection.alerting">
 <log expr="'*** Call is coming in.  Lookup area code information. ***'"/>
 <assign name="ani" expr="event$.connection.remote"/>
 <assign name="areacode" expr="ani.substring(0,3)"/>
 <send target="'php/areaCodeLookup.php'" name="'lookupEvent'" targettype="'basichttp'" namelist="areacode"/>
</transition>

This block of code shows how to set up a transition to access the ANI on an incoming call. When an incoming call is detected by Prophecy, the “connection.alerting” event is delivered and we have access to the Connection Object’s “remote” property – this property exposes the telephone URL for the device that is calling into the platform. Note – in my previous post, I explained the process of setting up the Prophecy SIP phone to deliver a specific ANI. This is how we access the value that is set in the Prophecy SIP phone.

We assign the ANI value to a variable we have previously declared and (very cleverly) called “ani”, and then we grab the first 3 characters of this string (using the ECMAScript substring method) and assign them to another variable called “areacode”. We then pass the area code value to a PHP script that will interact with the StrikeIron area code web service.
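The plain substring call works when the ANI arrives as a bare ten-digit number, which is what the SIP phone setup delivers here. If your calls might arrive with a “tel:” prefix or a leading country code, a slightly more defensive version could look like the sketch below. The function name and the normalization rules are my own assumptions for North American numbers, not something the platform provides.

```javascript
// Sketch only: assumes a North American number in some common format.
function areaCodeFromAni(ani) {
  // Strip everything that is not a digit ("tel:", "+", dashes, spaces).
  var digits = String(ani).replace(/\D/g, '');
  // Drop a leading country code of 1 if present.
  if (digits.length === 11 && digits.charAt(0) === '1') {
    digits = digits.substring(1);
  }
  // Only trust the result if we are left with a ten-digit number.
  return digits.length === 10 ? digits.substring(0, 3) : null;
}

console.log(areaCodeFromAni('2125551234'));
console.log(areaCodeFromAni('tel:+16105551234'));
```

Returning null for anything unexpected gives the CCXML a simple condition to test before bothering the web service with a bad area code.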

Using the CCXML <send/> element in this fashion is identical to an HTTP GET with the areacode variable appended to the URL of the PHP script, like this:

http://myserver/php/areaCodeLookup.php?areacode=123
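To illustrate how the variables named in “namelist” end up on the URL, here is a small sketch that builds the same kind of query string by hand. The helper name is hypothetical and this is only an approximation of what the platform does for you.

```javascript
// Hypothetical helper: appends each name/value pair to the base URL,
// URL-encoding both sides the way a browser form submission would.
function buildLookupUrl(base, params) {
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      pairs.push(encodeURIComponent(name) + '=' + encodeURIComponent(params[name]));
    }
  }
  return base + '?' + pairs.join('&');
}

console.log(buildLookupUrl('http://myserver/php/areaCodeLookup.php', { areacode: '123' }));
```

With more than one name in the namelist, each pair would simply be joined with an ampersand.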

There are several possible outcomes of this HTTP request:

  1. Our PHP script was able to successfully interact with the StrikeIron web service and lookup the U.S. state information for the submitted area code;
  2. Our PHP script was able to successfully interact with the StrikeIron web service but was not able to lookup the state information for the submitted area code (bad area code);
  3. Something went wrong (an exception occurred) while trying to interact with the web service; or,
  4. Something really went wrong and our HTTP request resulted in a bad response from the server.

We need to set up handlers for each possible outcome – we won’t discuss them in detail until after we look more closely at the PHP components that are interacting with the StrikeIron web service, but to summarize what we’ll need, here they are:

<transition state="lookup" event="areaCodeLookupSuccess">
</transition>
<transition state="lookup" event="areaCodeLookupFailure">
</transition>
<transition state="lookup" event="error.send.failed">
</transition>

The first two handlers react to custom events that we will toss into the CCXML event stream (more on that shortly), and the last will take care of instances where we get an invalid response back from the server (e.g., a 404 response). Now let’s look at the PHP components that interact with the StrikeIron web services.

When the HTTP request from Prophecy that holds our area code information is received in PHP, we can access the submitted value by using the PHP $_REQUEST superglobal:

$areacode = (int) $_REQUEST['areacode'];

You’ll notice that we also typecast the value as a way of cleansing the input – as with any other kind of web application, never trust user input. Even though we’re not using the submitted information in a SQL query, this is a really good habit to get into. There are certainly other ways to achieve this, but type casting is simple and effective for our purposes.

The PHP version that comes bundled with Prophecy has support for PHP’s SOAP extension right out of the box. Since we’re going to be accessing several different web services over the course of one telephone call, I decided to set up a very simple class to handle all of the interactions with the StrikeIron web services.

class greenSoapClient {
  private $client;
  private $headers;
  // $type selects a WSDL URL from the global $WSDL array (e.g. "areacode").
  function __construct($type) {
    global $WSDL, $USER, $PSWD;
    $this->client = new SoapClient($WSDL[$type], array('trace' => 1,
                                          'exceptions' => 0));
    $headerArray = array("RegisteredUser" => array("UserID" => $USER,
                                          "Password" => $PSWD));
    $this->headers = new SoapHeader("http://ws.strikeiron.com",
                                          "LicenseInfo", $headerArray);
  }
  // Invokes the named SOAP method and returns the raw XML of the response.
  function makeSoapCall($name, $params) {
    // __soapCall() is the supported name for the deprecated __call() alias.
    $this->client->__soapCall($name, array($params), NULL, $this->headers);
    // 'trace' => 1 in the constructor is what makes __getLastResponse() work.
    return $this->client->__getLastResponse();
  }
  function __destruct() {
    unset($this->client);
  }
}

This class has only three functions – a constructor, a destructor and a function to make the call to the SOAP method we want information from.

When we instantiate the greenSoapClient class, we pass in a reference to a WSDL file for the service we want to invoke. In this case, we will pass in a reference to the WSDL file for the U.S. Area Code Information Web Service. (Actually, the string “areacode” is used to access the WSDL reference from a pre-established associative array holding the URL references for all of the WSDL files used by the GreenPhone application.)

$mySoapClient = new greenSoapClient("areacode");

Now that we have our area code information, and a shiny new greenSoapClient object to work with, we can make our SOAP call:

$param = array('AreaCode' => $areacode);
$response = $mySoapClient->makeSoapCall('GetAreaCode', $param);

The variable $response now holds the XML response that was returned from the web service. We’ll need to process this response in order to properly format the information we want to return to CCXML.

One of the very cool things about the Voxeo implementation of CCXML is that developers can toss custom events into the CCXML event stream using simple HTTP responses. Prophecy lets us send back a custom event, as well as any data that we want to access in CCXML as properties of that event. We do this by formatting our response as follows:

First line of body of HTTP response = custom event name.
Data to be returned to CCXML = name value pair appearing on successive lines of the HTTP body, one pair per line.

The U.S. Area Code Information Web Service returns two pieces of information that we want to access in CCXML – a count of the number of locations identified for each area code (typically 1), and the name of the U.S. state that area code belongs to. A snippet of the raw response returned from the web service might look something like this (for the 610 area code):

<ServiceResult>
 <Count>1</Count>
 <AreaCodes>
  <AreaCodeInfo>
   <AreaCode>610</AreaCode>
   <Location>Pennsylvania</Location>
  </AreaCodeInfo>
 </AreaCodes>
</ServiceResult>

We want to format our raw XML response like so:

areaCodeLookupSuccess
count=1
location=Pennsylvania
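Since the format is so simple, it can help to see it as a pair of functions: one that builds the body and one that reads it back. This is only a sketch of the convention described above, written in JavaScript for illustration; both function names are made up, and the parsing half just mirrors what the platform appears to do on your behalf when it turns the response into an event.

```javascript
// Builds a body whose first line is the event name, followed by
// one name=value pair per line.
function formatEventResponse(eventName, data) {
  var lines = [eventName];
  for (var key in data) {
    if (Object.prototype.hasOwnProperty.call(data, key)) {
      lines.push(key + '=' + data[key]);
    }
  }
  return lines.join('\n') + '\n';
}

// Reads the same format back into an event name plus properties.
// All property values come back as strings.
function parseEventResponse(body) {
  var lines = body.split(/\r?\n/).filter(function (line) {
    return line.length > 0;
  });
  var parsed = { name: lines[0], properties: {} };
  for (var i = 1; i < lines.length; i++) {
    var eq = lines[i].indexOf('=');
    if (eq > 0) {
      parsed.properties[lines[i].substring(0, eq)] = lines[i].substring(eq + 1);
    }
  }
  return parsed;
}

var body = formatEventResponse('areaCodeLookupSuccess', { count: 1, location: 'Pennsylvania' });
console.log(body);
console.log(parseEventResponse(body).properties.location);
```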

The easiest way to do this in PHP is to use the SimpleXML extension:

$xml = new SimpleXmlElement($response);
$result = $xml->soapBody->GetAreaCodeResponse->GetAreaCodeResult;
$output = "areaCodeLookupSuccess\n";
$output .= "count=".$result->ServiceResult->Count."\n";
$output .= "location=".$result->ServiceResult->AreaCodes->AreaCodeInfo->Location."\n\n";

We take the response from the StrikeIron web service and use it to create a new SimpleXML object. We can then access the values we want and build our HTTP response.

How do we deliver our response once we’re done constructing it? We simply use the PHP “echo” language construct to write it out:

echo $output;

Now that we’ve returned our values to CCXML, how do we access them? For the answer to that, we need to go back to the handlers we set up previously, most importantly the handler for the custom “areaCodeLookupSuccess” event:

<transition state="lookup" event="areaCodeLookupSuccess">
 <assign name="count" expr="event$.count"/>
 <if cond="count == 1">
  <assign name="location" expr="event$.location"/>
  <assign name="stateCode" expr="getStateCode(event$.location)"/>
  <assign name="myState" expr="'accepting'"/>
  <accept connectionid="connection_id"/>
 <else/>
  <log expr="'*** Could not look up area code. ***'"/>
  <reject/>
 </if>
</transition>

When we write out our web service response in PHP, we can cause a custom event to drop into the CCXML event stream – the name of this event is the first line of the HTTP response we just constructed – areaCodeLookupSuccess.

We access the values we just returned to CCXML as properties of the areaCodeLookupSuccess event using the “event$.” vernacular. This allows us to assign these values to ECMAScript variables that we have previously declared. It also lets us decide how we want our application to react, based on certain conditions (e.g., if count = 0).
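The handler above calls a getStateCode() function that isn’t shown in this post. Purely as a hypothetical stand-in (the real helper may look quite different), a function like that could be a simple lookup from state name to two-letter code, defined in a script block in the CCXML document:

```javascript
// Hypothetical stand-in for getStateCode(); the table is deliberately
// truncated to three entries for illustration.
var STATE_CODES = {
  'California': 'CA',
  'New York': 'NY',
  'Pennsylvania': 'PA'
};

function getStateCode(location) {
  // Trim stray whitespace that may come back with the web service value.
  var name = String(location).replace(/^\s+|\s+$/g, '');
  return Object.prototype.hasOwnProperty.call(STATE_CODES, name) ? STATE_CODES[name] : null;
}

console.log(getStateCode('Pennsylvania'));
```

Returning null for an unknown name would let the transition reject the call, or fall back to asking the caller for their state, rather than searching with a bad code.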

Similarly, our other event handlers can be used if we get an unexpected response from the web service – we could send back an “areaCodeLookupFailure” event. If something really bad happens – like an invalid response from the web server – we will get an “error.send.failed” event, so we’ll want to have a handler ready for that as well.

Now that you have a flavor for how to access web services using CCXML and PHP, we’ll look at two different techniques for returning information from a web service to VoiceXML. We’ll cover these two techniques in the next two posts. Stay tuned…


Earth Day Special Project: Project Green Phone

April 27th, 2008 by Mark Headd

This is a guest post from Mark Headd, a voice application developer who was one of the first 10,000 users of our platform, and was originally published on his Vox Populi blog on April 17, 2008.


Earth Day 2008 is fast approaching, so I wanted to try and build something that would help the environment and also be a cool demonstration of telephone applications generally, and the Voxeo Prophecy platform in particular.

I decided to whip up a simple application that would allow a caller to search for E85 and Bio-diesel fuel stations in their state. Some of the specific goals that I had in mind when I got started were:

  • To make use of the Voxeo Prophecy platform, the premier VoiceXML/CCXML platform for building voice applications (at least in my opinion).
  • To code the application entirely in VoiceXML, CCXML, ECMAScript and PHP (that’s right, no database!).
  • To integrate with SOAP-based web services to obtain data on E85 and Bio-Diesel station locations, and to do other cool stuff like send an SMS message from VoiceXML.
  • To make use of interesting and unique audio files for prompts and to signal specific types of outcomes.

The fruits of one weekend of labor can be downloaded here. To set up and test this application, you will need the following:

  • An account with StrikeIron to use the web services that drive the GreenPhone application.
  • A copy of Voxeo Prophecy.
  • A good headset and microphone (to place test calls using Prophecy).
  • A cell phone (preferably one with a liberal text messaging contract).

Sign Up With StrikeIron:

Create an account with StrikeIron and sign up for the Super Data Pack Web Service. This is a collection of web services that allow for up to 10,000 hits / month at no charge (where are you going to get a better deal than that?). You’ll also want to sign up for the Global SMS Pro Web Service – this is the service that is used to send SMS messages from the GreenPhone application. Note – this service is priced quite differently than the Super Data Pack Web Service – only 10 free hits before you start paying. If you want to use this service for anything more than just testing out how to send an SMS message from Voxeo Prophecy, you’ll need to get your wallet out.

Make note of the user ID (email address) and password used to create your StrikeIron account – these will be needed momentarily.

Download and install Voxeo Prophecy:

Download and install the Voxeo Prophecy software. Follow all of the instructions for installing and obtaining a license – a two-port license (which will support 2 concurrent phone calls) is free. Right now, Prophecy only runs on Windows, but a Linux version is in the pipeline.

Download and Configure GreenPhone:

Download the GreenPhone application and extract it to a new directory under c:\{Prophecy install path}\www\. (For example, on my Windows machine I’ve extracted to c:\Program Files\Voxeo\www\GreenPhone\). You don’t have to run the GreenPhone application on the same machine as Prophecy – if you decide to deploy it on another machine, it must support PHP 5 – GreenPhone makes use of the PHP SOAP and SimpleXML extensions.

Once this is complete, navigate to the directory where you just extracted the GreenPhone application files. Go to the directory called “php”, and open the file called common.php. At the top of this file, enter the credentials from your StrikeIron account. Save and close the file.

Creating a Call Route for GreenPhone:

Open the Prophecy Management Console in your web browser (http://127.0.0.1:9995/mc.php) – the default user ID and password are admin/admin. Click on the “Call Routing” option on the left hand menu – this is where you will set up a call route to the GreenPhone application.

Pick one of the numbered route IDs (e.g., Route 1 ID) and make the following changes:

  • Change the route ID to green
  • Change the Route Type to CCXML W3C
  • Change the URL to http://127.0.0.1:9990/{ GreenPhone Install Directory}/greenPhoneStart.xml
  • Scroll to the bottom of the page and click “Save Changes”

Making a test call:

Now that Prophecy is installed, fire up the SIP Phone that it is bundled with – you should see the Prophecy icon in your system tray. Click on it, and select “SIP Phone” from the menu. When the SIP Phone launches, select Options. In the SIP Proxy / Registrar Options section, enter your cell phone number in the Local Username field (e.g., 2125551234). Click OK, and restart your SIP Phone. This last step allows your cell phone number to be delivered as the caller ID (or ANI) on the test call you are about to make, even though you’re initiating the call from a SIP phone.

GreenPhone is built to use ANI to look up E85 and Bio-Diesel stations in the caller’s home state. We do this by invoking the U.S. Area Code Information Web Service that is part of the StrikeIron Super Data Pack to determine which state a caller is calling from. There are additional web services in the StrikeIron Super Data Pack that we can invoke to locate Bio-Diesel stations and E85 Stations — the methods invoked on these last two services require us to identify the state we want a listing of stations for.

The caller’s ANI is also used to send the details on a particular E85 or Bio-Diesel station via text message to the caller’s phone – so if you enter your cell phone number in the Voxeo SIP Phone as described above, you can get details on a station that may be near you sent directly to your cell phone.

As an aside, you’ll notice that a single phone call can result in up to 4 web service invocations — not really sure if that’s “too many” but there are probably some opportunities for caching that I’ll be discussing in the next couple of posts on this, as I describe in more detail how to interact with web services via Voxeo Prophecy.

Now you are ready to place a test call. When your SIP Phone restarts, go to the field called Dial String and enter “sip:green@127.0.0.1” (without the quotes). Click dial and you are now interacting with the GreenPhone application!

You’ll notice (and hopefully enjoy) the unique sounds I’ve tried to use throughout the application. All of them were obtained from the FreeSound Project and modified with Audacity to conform to the Prophecy standard for audio files.

There are some obvious limitations to how this application currently works, and the VUI clearly needs some refinement (DTMF only at this point).

In the next several posts, I’ll point to this application to discuss examples on how to accomplish things in VoiceXML and CCXML using the Voxeo Prophecy Platform.

Have a happy Earth Day on 4/22!!

A new series of guest posts/tutorials coming to this blog…

April 27th, 2008 by Dan York

You’ll soon see a new series of tutorials coming to this weblog as guest posts from Mark Headd. Mark is a voice application developer and member of our developer community who is very proud of the fact that he’s been using our platform long enough to have a developer ID under 10,000! He writes over on his own weblog, Vox Populi, (more info on Mark on his About page) and recently kicked off a series of posts around how to use CCXML and VoiceXML with PHP and other languages. I asked Mark if he would grant permission for us to cross-post his articles here with attribution and he gladly granted that. Mark has a great writing style and some great tutorials and so we are thrilled to be bringing his writing to you. Stay tuned…


Voice Mashups with Twitter, part 2: Sending telephony presence to Twitter

April 21st, 2008 by Dan York

What if you wanted to share your telephony “presence” information with another application? That is, you wanted to let the application know whether or not you were on the phone? For instance, when someone called you, a message that you were “on the phone” could then be displayed in the other application… perhaps a web page with a directory of staff, showing who’s on the phone… perhaps an instant messaging client…

Well, out at eComm 2008 in March, our CTO, RJ Auburn, demonstrated exactly that kind of integration using just CCXML and web services. In his talk he showed a quick application in CCXML that would send out your presence information on the current web 2.0 darling Twitter. Essentially, what happens is this:

  • Someone calls a phone number (presumably because you gave it to them)

  • Call is connected to your actual phone
  • Call presence information is sent out in your Twitter stream.

For instance, if you call one of these numbers (please do so only if you actually want to talk to me, and please only from 9am-5pm Eastern US time- thanks!):

You’ll reach me (or my voicemail) and the corresponding status updates will appear in my Twitter stream (shown in reverse chronological order):
[Screenshot: phone presence status updates in the Twitter stream]

Now I don’t know that I would really personally want to send out this information in my Twitter stream every time someone called me (although in all honesty I don’t talk on the phone as much as I used to), but you get the idea. Your “telephony presence” can be sent out to another application. To me, it’s a very cool example of how you can easily mash up voice with web services. While Twitter is used here for this example, the code could basically be used to send this presence information to any type of service that lets you communicate using simple web services. Let’s dive in a bit further…

The eComm Slides

First, though, I should mention that this example was part of RJ’s talk at eComm 2008 and you can see it in his slide deck starting at slide 27:

As soon as audio is available for the presentation, we’ll provide a link here to actually listen to the presentation.

The Web Service

Now to jump into the actual code, RJ was able to do this so easily largely because Twitter’s API is so incredibly simple to use, as I discussed in a previous post about Twitter. The full CCXML code is below, but here’s the key part where RJ defined the URL to use to update Twitter:

  <var name="tURL"
       expr="'http://zscgeek:password@twitter.com/statuses/

That’s it. (Note that while RJ is on Twitter as zscgeek, you can rest assured that his real password is NOT “password”!)

After creating this variable “tURL” (as in “target URL”), RJ proceeds to simply assign some text to a variable “status” and then call the target URL with that “status” variable as an argument. For example:

      <var name="status" expr="'RJ is on the phone'"/>
      <send targettype="'basichttp'" name="'update'"
            target="tURL" namelist="status"/>

Here “RJ is on the phone” is assigned to “status” and then the Twitter API is called. As shown in the code below, this same block of code is re-used with each different telephony state (and obviously with a different status message).

The Code

So here’s the code… nice and short and sweet… just enough to fit on a Keynote slide without straining eyesight (yes, it would probably fit on a PowerPoint slide, too, but remember that we’re Mac fans here). I’m not going to walk through each step of the code, but if you scan down you can see that basically the code is:

  • Upon connection of the call:
    1. connecting the call to RJ’s cell phone (not his real number)
    2. sending the “RJ is on the phone” status update to Twitter
  • Upon entering one of the other states (no answer, call disconnected), sending the appropriate Twitter status update.
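Boiled down, the application is a small lookup from (event, state) to a status message. As a rough sketch of that mapping only, not code that runs inside the CCXML itself, and with a function name invented for illustration:

```javascript
// Illustrative only: mirrors the event/state pairs and status messages
// used by the CCXML transitions in this post.
function statusFor(eventName, state) {
  if (eventName === 'connection.connected' && state === 'calling') {
    return 'RJ is on the phone';
  }
  if (eventName === 'connection.failed' && state === 'calling') {
    return 'RJ is not answering his phone';
  }
  if (eventName === 'connection.disconnected' && state === 'connected') {
    return 'RJ is off the phone';
  }
  // Any other combination sends nothing to Twitter.
  return null;
}

console.log(statusFor('connection.connected', 'calling'));
```

Note that the inbound leg’s own connection.connected event arrives in the init state, so it produces no status update; only the outbound leg connecting does.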

Now if you aren’t familiar with the power of CCXML, you might want to look at our documentation and tutorial on CCXML or view one of the video tutorials on CCXML that we recently posted.

With that, here’s the code:


<?xml version="1.0" encoding="UTF-8"?>
<ccxml xmlns="http://www.w3.org/2002/09/ccxml" version="1.0">
  <var name="state" expr="'init'"/>
  <var name="incomingcall"/>
  <var name="tURL"
       expr="'http://zscgeek:password@twitter.com/statuses/update.xml'"/>
  <eventprocessor statevariable="state">
    <transition event="connection.alerting" state="init">
      <accept/>
    </transition>
    <transition event="connection.connected" state="init">
      <assign name="state" expr="'calling'"/>
      <assign name="incomingcall" expr="event$.connectionid"/>
      <createcall dest="'tel:+18315551111'"/>
    </transition>
    <transition event="connection.connected" state="calling">
      <assign name="state" expr="'connected'"/>
      <join id1="event$.connectionid" id2="incomingcall"/>
      <var name="status" expr="'RJ is on the phone'"/>
      <send targettype="'basichttp'" name="'update'"
            target="tURL" namelist="status"/>
    </transition>
    <transition event="connection.failed" state="calling">
      <assign name="state" expr="'done'"/>
      <var name="status" expr="'RJ is not answering his phone'"/>
      <send targettype="'basichttp'" name="'update'"
            target="tURL" namelist="status"/>
    </transition>
    <transition event="connection.disconnected" state="connected">
      <assign name="state" expr="'done'"/>
      <var name="status" expr="'RJ is off the phone'"/>
      <send targettype="'basichttp'" name="'update'"
            target="tURL" namelist="status"/>
    </transition>
    <transition event="send.successful" state="done">
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>

Feel free to use it, modify it, etc., etc. (And if you do something cool with it, please do let us know, either as a reply to this post or via email.) While this is with Twitter, we’d love to hear where else you can think of sending telephony presence info…

P.S. If you’d like to experiment with this but are not sure of how to get started, head on over to www.voxeo.com/free and either sign up for a free developer account on our Evolution portal or download our free Prophecy software to run it on your own server.
