Posts Tagged ‘Voxeo’

Never forget to make a call with scheduled conferencing

Thursday, July 10th, 2008

We all forget to make important calls from time-to-time. With this tutorial, you will be able to schedule a call ahead of time, so that Voxeo’s IVR system calls you at that time in the future, and then links you with the party you intended to call.

While you’re utilizing your free Voxeo developer account, you might as well keep the whole shabang free, right? Head on over to x10hosting.com or forwardhosting.com, and register for an account. These are two of the very few hosting companies that will allow you to run cron jobs for free. Now that you are all setup, let’s get into the design aspect, cron job first.

While cron web interfaces will certainly be different, the underlying principal is the same: cron will wait until the system time matches your job time, and then will execute an action. Most web portals to cron jobs will allow you specify minute, hour, day, month, and weekday.

Let’s assume you have to call your wife every Friday night at 5 pm to let her know that you’re coming straight home from work. We’ll set minute to “00″, the hour to “17″, the weekday to “4″, and the rest to “*”. This will make the cron job execute on Fridays at 1700. You may need to adjust the time on your cron job to account for differences between system time and your time. For example, the x10hosting box on which my cron job runs is set to US central time. For the cron job command, use the Unix command “curl” like so:

curl http://api.voxeo.net/SessionControl/CCXML10.start?tokenid=dd5a1d7f44e97f49856eb6e894c9c669d152e89a571f6201eb3b265045b7a1d2bb52ff8d9856fbbbbbbbbbba\&numdial=5551231234

This command sends an http request to api.voxeo.net for our CCXML 1.0 token. The request also contains the variable “numdial.”

( Please note that the backslash is used in cron to escape the ampersand. This request will not work properly from a browser window )

Cron job interface

Now for the XML part:

<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0" xmlns:voxeo="http://community.voxeo.com/xmlns/ccxml">

<var name="state0" expr="'init'"/>
<var name="callid_out1"/>
<var name="callid_out2"/>
<var name="pin"/>
<var name="holdMusicDlg"/>

<eventprocessor statevariable="state0">

  <transition state="init" event="ccxml.loaded">
    <createcall dest="'tel:+15555555555'" connectionid="callid_out1" callerid="'1112223333'" timeout="'30s'"/>
  </transition>

  <transition state="init" event="connection.connected">
    <assign name="callid_out1" expr="event$.connectionid"/>
    <assign name="state0" expr="'enterpin'"/>
    <dialogstart src="'null://?termdigits=#&text=Press 1 and then pound if you want to dial' + session.values.numdial"&
    type="'application/x-fetchdigits'"/>
  </transition>

  <transition state="enterpin" event="dialog.exit">
    <log expr="'PIN = [' + event$.values.digits + ']‘”/>
    <if cond=”‘1′ != event$.values.digits”>
      <exit/>
    <else/>
      <assign name=”pin” expr=”event$.values.digits”/>
    </if>

    <assign name=”state0″ expr=”‘calling’”/>
    <dialogstart src=”‘holdingPattern.vxml’” type=”‘application/xml+vxml’” namelist=”pin” dialogid=”holdMusicDlg”/>
    <createcall dest=”‘tel:+1′ + session.values.numdial” connectionid=”callid_out2″ callerid=”‘5555555555′”/>

  </transition>

  <transition state=”calling” event=”connection.failed”>
    <assign name=”state0″ expr=”‘callfailed’”/>
    <dialogterminate dialogid=”holdMusicDlg”/>
  </transition>

  <transition state=”callfailed” event=”dialog.exit”>
    <assign name=”state0″ expr=”‘playingCallFailed’”/>
    <dialogstart src=”‘callFailure.vxml’” type=”‘application/xml+vxml’” connectionid=”callid_out1″/>
  </transition>

  <transition state=”playingCallFailed” event=”dialog.exit”>
    <disconnect/>
  </transition>

  <transition state=”calling” event=”connection.connected”>
    <if cond=”event$.connectionid == callid_out1″>
      <exit/>
    <else/>
      <assign name=”state0″ expr=”‘beforeBridging’”/>
      <dialogterminate dialogid=”holdMusicDlg”/>
    </if>
  </transition>

  <transition state=”beforeBridging” event=”dialog.exit”>
    <send name=”‘pause’” target=”session.id” delay=”‘200ms’”/>
  </transition>

  <transition state=”beforeBridging” event=”pause”>
    <assign name=”state0″ expr=”‘bridged’”/>
    <join id1=”callid_out1″ id2=”callid_out2″/>
  </transition>

  <transition event=”error.conference.join”>
    <log expr=”‘*** ERROR DURING JOIN ***’”/>
    <exit/>
  </transition>

  <transition event=”error.*”>
    <log expr=”‘an error has occured (’ + event$.reason + ‘)’”/>

    <voxeo:sendemail to=”‘yourEmail@there.com’”
      from=”‘myApp@here.com’”
      type=”‘debug’”
      body=” ‘generic error detected ! ‘ “/>
    <exit/>
  </transition>
</eventprocessor>
</ccxml>

This application has a fairly simple flow. It calls 1-555-555-5555, and then asks the callee to press 1 and then # to connect to whatever number was passed in via the http request (in this case numdial=5551231234).

That’s it. To make this work for you, you need to change four values:

1. token ID in http request
2. numdial variable in http request
3. “5555555555″ is found twice in the CCXML file - both instances should be changed to your number
4. sendemail “to” and “from” in CCXML file

Good luck with your development - mix in a little MySQL and PHP action to make adding more cron jobs easier.

Till next time,

Jeremy McCall
Voxeo Network Operations

Certified Tech Tip: Alpha-Numeric voice recognition grammars - part two

Tuesday, May 20th, 2008

voicexmlcertifieddeveloper.gif
In our last entry to the tech-tips blog, we detailed the challenges inherent in capturing alphabetical, or alpha-numeric entries from our callers, and detailed several paths for minimizing the chance of mis-recognition when implementing input fields based on these two categories of voice recognition. The long and short of this posting was that IVR developers should refrain from attempting this wherever possible, and to instead try these alternatives:

* Pre-compiled Statistical Language Model grammars
* Leveraging TargusInfo services for advanced recognition accuracy

However, the IVR project requirements dictate what we can, and can’t do as developers, so in some cases, we have to try and whip out a user grammar that takes alpha, or alpha-numeric input. As mentioned in our last blog entry, there are a few things we can do to stack the deck to try and squeeze more accuracy out of these grammars so that we don’t end up with frustrated callers, but the plain truth is that we will never, ever be able to write a grammar that accepts alphabetical characters to be 100% accurate using todays recognition technology. What we will do today is twofold:

(1) Craft an SRGS+SISR subgrammar for alphabetical, and numeric characters

(2) Plug this grammar into a mixed-initiative form dialog that will minimize (but not fully address!), the possibility for mis-recognitions.

Those developers who have the need for such a grammar and dialog within their production-grade applications are advised to take this basic framework as a starting point, and then expand on it by:

(a) Test carefully with a broad range of users, and to fully flesh out alternate utterance values for alphabetic characters

(b) Apply item weighting to specific characters based on the probability of a given character versus another like-sounding character - this will depend greatly on the specific usage of the grammar

(c) Track results by using w3c-compliant utterance recording, and logging all shadow variables, so that these results can be used to further tune and tweak our grammar for maximum accuracy

(d) Consider using n-best post-processing as an additional confirmation step to ensure that the results we receive are indeed accurate

For today’s entry, lets assume that we need to track a three digit zip code, which are prevalent in Canadian locales. Our predefined format for utterance values are “Alpha Digit Alpha”, and luckily, not all alpha characters are applicable: Instead of trying to recognize 26 letters accurately, we only need to recognize 16, which helps a lot!

We won’t dig into the specifics of a mixed-initiative form dialog, as we have already done so in our mixed-initiatve tutorial, but the gist is that this feature of VoiceXML allows us to fill multiple fields with a single utterance, and breaking up each alpha and numeric character into it’s own recognition field greatly cuts down on disambiguation problems that can occur.

For the purposes of brevity, what we have below is a stripped-down version of our fully fleshed-out grammar, but you may download the full grammar, and the mixed-initiative dialog right here, which contains lots more inline notations.

<?xml version= "1.0"?><grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US">

<rule id="canadianZip" scope="public">

<one-of>

<!-- ALL THREE FIELDS FILLED -->

<item>

<item>

<ruleref uri="#alphaRule1"/>

<tag>out.alphaSlot1=rules.alphaRule1.alphaSlot1;</tag>

</item>

<item>

<ruleref uri="#numRule"/>

<tag>out.numSlot=rules.numRule.numSlot;</tag>

</item>

<item>

<ruleref uri="#alphaRule2"/>

<tag>out.alphaSlot2=rules.alphaRule2.alphaSlot2;</tag>

</item>

</item><!-- ONLY TWO FIELDS FILLED -->

<item>

<item>

<ruleref uri="#alphaRule1"/>

<tag>out.alphaSlot1=rules.alphaRule1.alphaSlot1;</tag>

</item>

<item>

<ruleref uri="#numRule"/>

<tag>out.numSlot=rules.numRule.numSlot;</tag>

</item>

</item>

<item>

<item>

<ruleref uri="#numRule"/>

<tag>out.numSlot=rules.numRule.numSlot;</tag>

</item>

<item>

<ruleref uri="#alphaRule2"/>

<tag>out.alphaSlot2=rules.alphaRule2.alphaSlot2;</tag>

</item>

</item>

<item>

<item>

<ruleref uri="#alphaRule1"/>

<tag>out.alphaSlot1=rules.alphaRule1.alphaSlot1;</tag>

</item>

<item>

<ruleref uri="#alphaRule2"/>

<tag>out.alphaSlot2=rules.alphaRule2.alphaSlot2;</tag>

</item>

</item>

<!-- ONLY ONE FIELD FILLED  -->

<item>

<ruleref uri="#alphaRule1"/>

<tag></tag>

</item>

<item>

<ruleref uri="#numRule"/>

<tag>out.numSlot=rules.numRule.numSlot;</tag>

</item>

</one-of>

</rule>

<rule id="alphaRule1" scope="public">

<one-of>

<item weight="1.0">

<one-of>

<item> ex</item>

<item> ax</item>

<item> x </item>

</one-of>

<tag>out.alphaSlot1="X"; </tag>

</item>

</one-of>

</rule>

<rule id="numRule" scope="public">

<one-of>

<item> one <tag>out.numSlot="1"; </tag>  </item>

</one-of>

</rule>

<rule id="alphaRule2" scope="public">

<one-of>

<item weight="1.0">

<one-of>

<item> ay</item>

</one-of>

<tag>out.alphaSlot2="A"; </tag>

</item>

</one-of>

</rule>

</grammar>

In brief, our top-level rule assumes that we can have any of the following entries:

"X1A""X"

"X1"

"XA"

"1"

"1A"

And in the event that we get one or two characters matched in our utterance, the VoiceXML mixed-initiative logic will then take over, and prompt the caller to fill in any “blanks” remaining.

A few things of note about the grammar defined below is that in the event that we receive only a single alpha utterance, we will assume that it is the first character, not the last. Additionally, when we construct a grammar that contains multiple slot returns, it is required that we explicitly define the slot values all the way up the chain: if we didn’t define the “out.[slotname]=rules.[rulename].[subslot]” within the context of the top-level rule, the last slot value would overwrite all others, meaning that we would only get a value for “alphaSlot2″ within the VoiceXML dialog. To illustrate even further, the below snippet for a top-level return would make this a reality:

<item>
<ruleref uri="#alphaRule1"/>

<ruleref uri="#numRule"/>

<ruleref uri="#alphaRule2"/>

</item>

You’ll also see that each possibility for character recognition is specified within the top-level rule, so in the event that we get 1, 2 or 3 character strings, we can pipe the return value back to the VoiceXML, and let the mixed-initiative dialogs then access the sub-rules (alphaRule1/2 and numRule), individually as needed.

We also illustrated in brief how one can define multiple like-sounding utterance values that return the same interpretation value, and defined an for our alphaRule1 entry simply to show how this can be done: The task of taking this framework, and turning it into a grammar that satisfies any given project rests in the hands of you, the capable IVR developer.

=^)

Till next time,

Matthew Henry
Director of Customer Support
Voxeo Corporation

Useful Links

Technorati Tags:
, , , , , , ,

Earth Day Special Project: Project Green Phone

Sunday, April 27th, 2008

This is a guest post from Mark Headd, a voice application developer who was one of the first 10,000 users of our platform, and was originally published on his Vox Populi blog on April 17, 2008.


Earth Day 2008 is fast approaching, so I wanted to try and build something that would help the environment and also be a cool demonstration of telephone applications generally, and the Voxeo Prophecy platform in particular.

I decided to whip up a simple application that would allow a caller to search for E85 and Bio-diesel fuel stations in their state. Some of the specific goals that I had in mind when I got started were:

  • To make use of the Voxeo Prophecy platform, the premiere VoiceXML/CCXML platform for building voice applications (at least in my opinion).
  • To code the application entirely in VoiceXML, CCXML, ECMAScript and PHP (that’s right, no database!).
  • To integrate with SOAP-based web services to obtain data on E85 and Bi-Diesel station locations, and to do other cool stuff like send an SMS message from VoiceXML.
  • To make use of interesting and unique audio files for prompts and to signal specific types of outcomes.

The fruits of one weekend of labor can be downloaded here. To set up and test this application, you will need the following:

  • An account with StrikeIron to use the web services that drive the GreenPhone application.
  • A copy of Voxeo Prophecy.
  • A good headset and microphone (to place test calls using Prophecy).
  • A cell phone (preferably one with a liberal text messaging contract).

Sign Up With StrikeIron:

Create an account with StrikeIron and sign up for the Super Data Pack Web Service. This is a collection of web services that allow for up to 10,000 hits / month at no charge (where are you going to get a better deal than that?). You’ll also want to sign up for the Global SMS Pro Web Service – this is the service that is used to send SMS messages from the GreenPhone application. Note – this service is priced quite differently than the Super Data Pack Web Service – only 10 free hits before you start paying. If you want to use this service for anything more than just testing out how to send an SMS message from Voxeo Prophecy, you’ll need to get your wallet out.

Make note of the user ID (email address) and password used to create your StrikeIron account – these will be needed momentarily.

Download and install Voxeo Prophecy:

Download and install the Voxeo Prophecy software. Follow all of the instructions for installing and obtaining a license – a two-port license (which will support 2 concurrent phone calls) is free. Right now, prophecy only runs on Windows, but a Linux version is in the pipeline.

Download and Configure GreenPhone:

Download the GreenPhone application and extract it to a new directory under c:\{Prophecy install path}\www\. (For example, on my Windows machine I’ve extracted to c:\Program Files\Voxeo\www\GreenPhone\). You don’t have to run the GreenPhone application on the same machine as Prophecy – if you decide to deploy it on another machine, it must support PHP 5 – GreenPhone makes use of the PHP SOAP and SimpleXML extensions.

Once this is complete, navigate to the directory where you just extracted the GreenPhone application files. Go to the directory called “php”, and open the file called common.php. At the top of this file, enter the credentials from your StrikeIron account. Save and close the file.

Creating a Call Route for GreenPhone:

Open the Prophecy Management Console in your web browser (http://127.0.0.1:9995/mc.php) – the default user ID and password are admin/admin. Click on the “Call Routing” option on the left hand menu – this is where you will set up a call route to the GreenPhone application.

Pick one of the numbered route Ids (e.g., Route 1 ID) and make the following changes:

  • Change the route ID to green
  • Change the Route Type to CCXML W3C
  • Change the URL to http://127.0.0.1:9990/{ GreenPhone Install Directory}/greenPhoneStart.xml
  • Scroll to the bottom of the page and click “Save Changes”

Making a test call:

Now that Prophecy is installed, fire up the SIP Phone that it is bundled with – you should see the Prophecy icon in your system tray. Click on it, and select “SIP Phone” from the menu. When the SIP Phone launches, select Options. In the SIP Proxy / Registrar Options section, enter your cell phone number in the Local Username field (e.g., 2125551234). Click OK, and restart your SIP Phone. This last step allows your cell phone number to be delivered as the caller ID (or ANI) on the test call you are about the make, even though your initiating the call from a SIP phone.

GreenPhone is built to use ANI to look up E85 and Bio-Diesel stations in the caller’s home state. We do this by invoking the U.S. Area Code Information Web Service that is part of the StrikeIron Super Data Pack to determine which state a caller is calling from. There are additional web services in the StrikeIron Super Data Pack that we can invoke to locate Bio-Diesel stations and E85 Stations — the methods invoked on these last two services require us to identify the state we want a listing of stations for.

The caller’s ANI is also used to send the details on a particular E85 or Bio-Diesel station via text message to the caller’s phone – so if you enter your cell phone number in the Voxeo SIP Phone as described above, you can get details on a station that may be near you sent directly to your cell phone.

As an aside, you’ll notice that a single phone call can result in up to 4 web service invocations — not really sure if that’s “too many” but there are probably some opportunities for caching that I’ll be discussing in the next couple of posts on this, as I describe in more detail how to interact with web services via Voxeo Prophecy.

Now you are ready to place a test call. When your SIP Phone restarts, go to the field called Dial String and enter “sip:green@127.0.0.1” (without the quotes). Click dial and you are now interacting with the GreenPhone application!

You’ll notice (and hopefully enjoy) the unique sounds I’ve tried to used throughout the application. All of them were obtained from the FreeSound Project and modified to conform to the Prophecy standard for audio files with Audacity.

There are some obvious limitations to how this application currently works, and the VUI clearly needs some refinement (DTMF only at this point).

In the next several posts, I’ll point to this application to discuss examples on how to accomplish things in VoiceXML and CCXML using the Voxeo Prophecy Platform.

Have a happy Earth Day on 4/22!!