Archive for February, 2008

Certified Tech Tip: Prophecy 8 SSML in CallXML

Friday, February 22nd, 2008

Well, it’s about that time again, so here we are with another certified tech tip. This week, we’re going to show you how to use some fancy SSML within your CallXML scripts. It’s not something I would consider difficult, but it is certainly useful. Plus, the ladies love a man with extensive knowledge of SSML.

I would like to start off by introducing myself to the readers of the Voxeo Blog. My name is Mike Thompson, and I work as one of the senior Supporteons at Voxeo. I have been working full-time with Voxeo Corporation since January 2006, and have been VoiceXML certified since March of that year.

For those of you who don’t know what SSML is, it stands for Speech Synthesis Markup Language. SSML gives you finer-grained control over how the engine reads your TTS (Text-to-Speech): pronunciation, pitch, emphasis, pauses, and speaking rate. For a thorough breakdown of exactly what SSML can do, I suggest you check out the W3C spec here:

SSML Spec

Let’s get started, shall we?

The rule of thumb when integrating SSML with CallXML on the Prophecy 8 IVR platform is to wrap it within CDATA, since SSML is not part of the CallXML spec. If you don’t, the browser will start throwing cabbage and various old vegetables at you. Renaissance jokes aside, let’s take a look at our sample script so you can get a grasp on the use of SSML. We know all the good boys and girls declare their XML at the head of the document like so:

<?xml version="1.0"?>
<callxml version="3.0">
<do label="start">
   <wait value="2s"/>
   <say>
     <![CDATA[
     <speak>

      The sub element is employed to indicate that the text in the alias attribute value replaces the contained text for pronunciation. This allows a document to contain both a spoken and written form. The required alias attribute specifies the string to be spoken instead of the enclosed string.

      For example, we can write out C X M L in our script, but using sub, it will be pronounced <sub alias="CallXML">CXML</sub>. Let's move on to prosody pitch now. <break/>

      <prosody pitch="high"> I hate what happens to my voice when I drink fifteen cups of coffee. I must admit, I feel like I could run a marathon right now. </prosody>

      <prosody pitch="low"> Let's check out emphasis shall we? </prosody>

      Want to hear a secret? <emphasis level="reduced"> The New England Patriots choked in the Super Bowl. </emphasis>

      <break strength="x-strong"/>

      I will always be a loyal Dolphins fan. <emphasis level="strong">Go Dolphins!</emphasis>. Now, who wants to hear some prosody rate goodness?

      <prosody rate="+70%"> We'll start the bidding at 45 dollars. 45 do I hear 45 dollars? 45 50 do we have 50? 50 dollars anyone? 50 55 dollars do I hear 55 dollars? Anyone? Still standing at 55 dollars. </prosody>

      <prosody rate="-20%"> 55 dollars going once. going twice. </prosody>

      <prosody rate="+10%"> Sold to the gentleman in the red sweater. </prosody>

      Goodbye!

      <break/>

     </speak>
     ]]>
   </say>
</do>
</callxml>

A couple of things to notice with this example…

1) Make sure to wrap your SSML with the speak element, as well as CDATA.  Again, this is because SSML is not part of the CallXML spec.

2) When using prosody rate, I used the percentage syntax with + or -.  Per the SSML spec, you can also use the following values when setting rate:  “x-slow”, “slow”, “medium”, “fast”, “x-fast”, or “default” 
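As a quick illustration of both points, here is a minimal sketch that uses the named rate values instead of percentages. It follows the same CDATA-plus-speak pattern as the script above; the prompt text is made up for this example and the script has not been run against a Prophecy install.

```xml
<?xml version="1.0"?>
<callxml version="3.0">
  <do label="start">
    <say>
      <!-- SSML is not CallXML, so it travels inside CDATA -->
      <![CDATA[
      <speak>
        <!-- named rate values from the SSML spec, in place of +/- percentages -->
        <prosody rate="slow"> This sentence should be read slowly. </prosody>
        <prosody rate="x-fast"> This one should be read very quickly. </prosody>
      </speak>
      ]]>
    </say>
  </do>
</callxml>
```

How much faster "x-fast" actually sounds depends on the TTS engine, so the percentage form is the better choice when you need a specific amount of change.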

That about wraps it up this week.  Join us next week for a certified tech tip from Jeff Menkel.  Jeff will be showing everyone a new feature in Prophecy 8 which allows the developer to record an entire call at the CCXML 1.0 scope.

Regards, Mike Thompson Voxeo Corporation


Want to learn how Voxeo can help unlock your communications and deliver a better customer experience? Please contact us!

If you found this post interesting or helpful, please consider either subscribing via RSS, becoming a fan on Facebook, or following us on Twitter.


Certified Tech Tip: Multi-slot SISR subgrammars with Prophecy 8

Monday, February 11th, 2008

For our last certified tech tip, we explored the older SISR-formatted returns that one can use when designing a recognition grammar for a VoiceXML application. For this week, we will tackle two different things related to SISR grammars when using the Prophecy 8 software:

1 – How to use the newer SISR “.out” grammar return syntax

2 – How we can craft a subgrammar returning multiple slot values

It would seem as if the second item is pretty elementary, but it does bear a little bit of illustration, especially in terms of how the values in the sub-rules will bubble up to the top-level return. For the sake of simplicity, we will do a much-simplified month/day grammar that contains a single entry. Once you grasp the syntax, you can easily flesh this out more fully to include all possible months & days, or even overhaul it into a first-name & last-name grammar.

Let’s take a peek at the grammar file itself, and then look at the relevant working parts:

<?xml version="1.0"?>

<!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN" "http://www.w3.org/TR/speech-grammar/grammar.dtd">
<grammar mode="voice" xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" version="1.0" root="TOPLEVEL" tag-format="semantics/1.0">
 <rule id="TOPLEVEL">
  <one-of>
   <item>
    <item>
     <ruleref uri="#MONTH"/> <tag>out.monthslot=rules.MONTH.monthsubslot;</tag>
    </item>
    <item>
     <ruleref uri="#DAY"/> <tag>out.dayslot=rules.DAY.daysubslot;</tag>
    </item>
    <tag>out.yearslot="2008";</tag>
   </item>
  </one-of>
 </rule>
 <rule id="MONTH">
  <one-of>
   <item> january <tag> out.monthsubslot="January ";</tag> </item>
  </one-of>
 </rule>
 <rule id="DAY">
  <one-of>
   <item> first <tag> out.daysubslot="first";</tag> </item>
  </one-of>
 </rule>

</grammar>

The grammar above consists of two sub-rules titled “MONTH” and “DAY”, and a single top-level rule titled “TOPLEVEL”. The sub-rules specify a month slot and a day slot respectively, and these slots bubble up to the top level when we use the syntax below:

<ruleref uri="#SUBRULENAME"/> <tag>out.slotname=rules.SUBRULENAME.subslot;</tag>

And we then reference these various slots within the VoiceXML as follows:

<log expr="'*** SLOT RESULT = ' + lastresult$.interpretation.slotname"/>
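Fleshing out a sub-rule follows the same pattern: each additional item sets the same sub-slot, and the top-level rule does not need to change. A minimal sketch of an expanded MONTH rule, intended as a drop-in replacement for the single-entry rule in the grammar above (the extra months are illustrative, and the remaining nine are left for you to add):

```xml
<rule id="MONTH">
  <one-of>
    <!-- every alternative writes to the same sub-slot, monthsubslot -->
    <item> january <tag>out.monthsubslot="January";</tag> </item>
    <item> february <tag>out.monthsubslot="February";</tag> </item>
    <item> march <tag>out.monthsubslot="March";</tag> </item>
  </one-of>
</rule>
```

Because TOPLEVEL only reads rules.MONTH.monthsubslot, whichever month the caller says ends up in monthslot.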

We also threw in a quick example of using the “out” syntax in a more generic manner for our “yearslot” value. In this case, it simply allows us to return a year value back to the VoiceXML from the top-level rule, as opposed to having to reference the sub-rule values. This is added to show how a “flat-file” grammar with no subgrammars can return a slot value using the newer SISR syntax, which follows this format:

<tag>out.slotname="value";</tag>
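For comparison, here is a sketch of a complete flat grammar that uses only this form. The rule name ANSWER and the slot name answerslot are invented for this example; the header attributes are copied from the month/day grammar above.

```xml
<?xml version="1.0"?>
<grammar mode="voice" xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" version="1.0" root="ANSWER" tag-format="semantics/1.0">
  <!-- a single rule with no sub-rules: each item sets the slot directly -->
  <rule id="ANSWER">
    <one-of>
      <item> yes <tag>out.answerslot="yes";</tag> </item>
      <item> no <tag>out.answerslot="no";</tag> </item>
    </one-of>
  </rule>
</grammar>
```

With no ruleref involved, there is nothing to bubble up: the value assigned in the tag is what the VoiceXML sees as lastresult$.interpretation.answerslot.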

So when using our month/day grammar above, one might still be unclear on how we get at all these slot values within the VoiceXML dialog. Once recognition has occurred, we would specify something like this:

<log expr="'*** YEAR RESULT = ' + lastresult$.interpretation.yearslot"/>

<log expr="'*** MONTH RESULT = ' + lastresult$.interpretation.monthslot"/>

<log expr="'*** DAY RESULT = ' + lastresult$.interpretation.dayslot"/>
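To put those log statements in context, here is a minimal sketch of a VoiceXML dialog that loads the grammar and logs the slots once the field is filled. The file name monthday.grxml, the form and field names, and the prompt wording are assumptions made for this example, not part of the original application.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="getdate">
    <field name="date">
      <!-- monthday.grxml is a hypothetical file name for the month/day grammar above -->
      <grammar src="monthday.grxml" type="application/srgs+xml"/>
      <prompt> Please say a date. </prompt>
      <filled>
        <!-- runs after a successful recognition; slots come from the top-level rule -->
        <log expr="'*** YEAR RESULT = ' + lastresult$.interpretation.yearslot"/>
        <log expr="'*** MONTH RESULT = ' + lastresult$.interpretation.monthslot"/>
        <log expr="'*** DAY RESULT = ' + lastresult$.interpretation.dayslot"/>
      </filled>
    </field>
  </form>
</vxml>
```

With the single-entry grammar, the only utterance that matches is "january first", so that is the phrase to try when testing.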

Eventually, we will get around to creating some fully fleshed-out additions to our VoiceXML documentation on the subject of SISR grammars, but until then, we will post any cool tricks and tips here for the edification of our developers. As always, if there’s anyone who’d like to see a posting or tech tip on a particular subject, just drop us a line, and we would be happy to accommodate.

Next TechTip: Using SSML markup within a CallXML 3.0 application. Stay tuned to the blog; this next one is really cool.

~Matthew Henry




Voice Mashups with Twitter, part 1: Who will win the 2008 SuperBowl? (A mashup in CallXML.)

Friday, February 1st, 2008

Who will win the 2008 SuperBowl this weekend? The Patriots or the Giants? And by how many points will they win?

With those two questions in mind, please call now into one of these numbers using either your regular phone, Skype or a SIP phone:

Once you have answered those two questions, check out:

http://twitter.com/superbowlguess

Ta da… you have just participated in a voice “mashup” between our services and the Twitter micro-blogging platform!

So what’s going on here? At a high level, you called into our servers where an XML document outlined a series of prompts, collected some information and then sent the result over to a Twitter account for posting. (If you aren’t familiar with Twitter, here’s a post I wrote about it.)

There’s obviously a bit more to it (see our Quick Start guide) and over the next few weeks I’m going to explain both what I did and perhaps more importantly how it can be improved! In this part 1 you are seeing a basic call flow with Text-To-Speech prompts. While TTS is fast and, as you will see, extremely easy to create, the voice does sound… well… computer-generated! (Duh!) So in the next installment in this series I’ll talk about adding recorded prompts (and how incredibly simple it is to do through our platform). Future parts to the series will cover such things as better error handling, moving the XML over into VoiceXML/CCXML, adding other content to the output to Twitter, etc.

Naturally by the time this series is done the SuperBowl will long be over and the <deleted> will of course be victorious! However, we’ll just have to come up with some other event for the final script, eh?

Now, if you are impatient and don’t want to wait, your best bet is to head on over to our excellent CallXML documentation to learn what you need to do. Of course, you should also sign up for a free developer account on our Evolution developer site where you can create all of these applications and whatever other apps you can think of related to voice.

Before I start walking through the code, let’s talk about the version of XML I am using. Here at Voxeo we support three different kinds of XML for developing voice applications (as outlined on our Choosing a Platform page): our own CallXML and then the VoiceXML and CCXML standards of the W3C. We created our CallXML first and then have been and continue to be very involved with the development of VoiceXML and CCXML. (For instance, our CTO RJ Auburn chairs the CCXML Working Group within the W3C and Dan Burnett is very active within those groups as well.) We are huge supporters of open standards and are extremely pleased to be leading the industry in terms of compliance with VoiceXML and CCXML standards. As part of this series, I’ll take you on a tour of how those specifications work.

To keep this first post simple, though, I’ve written this first mashup example in CallXML because the code is easier to understand as we walk through it. In a future part, you’ll see how this looks in VoiceXML.

But enough already, let’s dive into the code…


