Certified Tech Tip: Prophecy 8 SSML in CallXML
Friday, February 22nd, 2008
Well, it’s about that time again, so here we are with another certified tech tip. This week, we’re going to show you how to use some fancy SSML within your CallXML scripts. It’s not something I would consider difficult, but it is certainly useful. Plus, the ladies love a man with extensive knowledge of SSML.
I would like to start off by introducing myself to the readers of the Voxeo Blog. My name is Mike Thompson, and I work as one of the senior Supporteons at Voxeo. I have been working full-time with Voxeo Corporation since January 2006, and have been VoiceXML certified since March of that year.
For those of you who don’t know what SSML is, it stands for Speech Synthesis Markup Language. SSML gives you finer-grained control over how the engine reads your TTS (Text-to-Speech) prompts. For a thorough breakdown of exactly what SSML can do, I suggest you check out the W3C spec here:
Let’s get started, shall we?
The rule of thumb when integrating SSML with CallXML on the Prophecy 8 IVR platform is to wrap it within CDATA (since SSML is not part of the CallXML spec). If you don’t, the browser will start throwing cabbage and various old vegetables at you. Renaissance jokes aside, let’s take a look at our sample script so you can get a grasp on the use of SSML. We know all the good boys and girls declare their XML at the head of the document like so:
<?xml version="1.0"?>
<callxml version="3.0">
<do label="start">
<wait value="2s"/>
<say>
<![CDATA[
<speak>
The sub element is employed to indicate that the text in the alias attribute value replaces the contained text for pronunciation. This allows a document to contain both a spoken and written form. The required alias attribute specifies the string to be spoken instead of the enclosed string.
For example, we can write out C X M L in our script, but using sub, it will be pronounced <sub alias="CallXML">CXML</sub>. Let's move on to prosody pitch now. <break/>
<prosody pitch="high">
I hate what happens to my voice when I drink fifteen cups of coffee. I must admit, I feel like I could run a marathon right now.
</prosody>
<prosody pitch="low">
Let's check out emphasis shall we?
</prosody>
Want to hear a secret? <emphasis level="reduced"> The New England Patriots choked in the Super Bowl. </emphasis>
<break strength="x-strong"/>
I will always be a loyal Dolphins fan. <emphasis level="strong">Go Dolphins!</emphasis> Now, who wants to hear some prosody rate goodness?
<prosody rate="+70%">
We'll start the bidding at 45 dollars. 45 do I hear 45 dollars? 45 50 do we have 50? 50 dollars anyone? 50 55 dollars do I hear 55 dollars? Anyone? Still standing at 55 dollars.
</prosody>
<prosody rate="-20%">
55 dollars going once. Going twice.
</prosody>
<prosody rate="+10%">
Sold to the gentleman in the red sweater.
</prosody>
Goodbye!
<break/>
</speak>
]]>
</say>
</do>
</callxml>
A couple of things to notice with this example…
1) Make sure to wrap your SSML with the speak element, as well as CDATA. Again, this is because SSML is not part of the CallXML spec.
2) For prosody rate, I used the percentage syntax with + or -. Per the SSML spec, you can also set rate to one of the following named values: “x-slow”, “slow”, “medium”, “fast”, “x-fast”, or “default”.
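To make both points concrete, here is a small sketch that swaps the percentage rates for named values. Treat it as an untested illustration rather than a drop-in script: the wrapper elements (say, CDATA, speak) are copied from the sample above, the prompt text is made up for this example, and how fast or slow each named value actually sounds will depend on your TTS engine.

```xml
<?xml version="1.0"?>
<callxml version="3.0">
  <do label="start">
    <!-- Point 1: the SSML sits inside speak, and speak sits inside CDATA -->
    <!-- Point 2: named rate values are used in place of percentages -->
    <say>
      <![CDATA[
      <speak>
        <prosody rate="x-fast">
          This sentence should race by.
        </prosody>
        <break/>
        <prosody rate="medium">
          This sentence should sound like the normal speaking rate.
        </prosody>
        <break/>
        <prosody rate="x-slow">
          This sentence should take its time.
        </prosody>
      </speak>
      ]]>
    </say>
  </do>
</callxml>
```

Named values are handy when you just want "faster" or "slower" without tuning a number. The percentage syntax is still the better choice when you need a specific amount of change.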
That about wraps it up this week. Join us next week for a certified tech tip from Jeff Menkel. Jeff will be showing everyone a new feature in Prophecy 8 which allows the developer to record an entire call at the CCXML 1.0 scope.
Regards,
Mike Thompson
Voxeo Corporation