Certified Tech Tip: Using SISR-formatted grammar returns with Prophecy 8
January 28th, 2008 by matt
I am happy to announce a new semi-regular addition to the Voxeo blog, where the Voxeo Support team will be adding VoiceXML, CallXML, and CCXML tips, tricks, and best practices for our developers, which we will christen “Certified Tech Tips”. The name has a nice ring to it and all, but this isn’t just for show: 100% of the technical support team are certified VoiceXML developers, and we are pretty proud of being the only provider that holds itself to this standard.
As we devise some really inventive means of achieving project goals & cool functionality when coding in the framework of these various IVR markups, we thought that we might share some of these tips with the readers of the Voxeo Blog.
For those who haven’t interacted with the support team yet, a bit of introduction is in order. My name is Matthew Henry, and I serve as the Director of Customer Support here at Voxeo. I have been with the company since its inception (way back in the 20th Century), and have been lucky enough to work with a sizable number of really talented IVR developers and engineers, which has allowed me to learn a lot, and has also allowed me to build up a respectable code library for all things IVR. And now, it’s time for some payback.
=^)
As our maiden posting to the Voxeo blog, we will cover the topic of Semantic Interpretation for Speech Recognition (SISR)-formatted grammar returns when using the Prophecy 8 software. A lot of folks are used to using plain-old Nuance GSL grammars due to its ease of use and concise markup, but the drawback of this approach is pretty fundamental: as GSL is Nuance-specific, it isn’t guaranteed that every provider will support it. And those of us who have written complex grammars know that porting a grammar can be a tedious job to take on. For this reason, we always suggest that folks stick with a W3C standard when writing grammars, namely the SRGS XML-based grammar format, which uses the SISR syntax to return grammar interpretations to the VoiceXML dialog. Most of the documentation on our site references the Nuance-specific return formatting, so today we will show you what a 100% W3C-compliant grammar looks like.
To start things off, let’s take a look at some GSL, and some SRGS with Nuance-specific returns for the sake of comparison:
Simple GSL
MYRULENAME [
    [utterance] {<mySlotName "my return value">}
]
Simple SRGS with Nuance-returns
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US"
         root="MYRULENAME">
  <rule id="MYRULENAME">
    <one-of>
      <item>
        utterance
        <tag> <![CDATA[ <mySlotName "my return value"> ]]> </tag>
      </item>
    </one-of>
  </rule>
</grammar>
Simple SRGS with SISR returns
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US"
         root="MYRULENAME">
  <rule id="MYRULENAME">
    <one-of>
      <item>
        utterance
        <tag>$.mySlotName = "my return value"</tag>
      </item>
    </one-of>
  </rule>
</grammar>
The differences in syntax are fairly self-evident in these cases. In the case of SISR, the “$.” prefix allows us to specify any slotname that we will return to our VoiceXML dialog, and specifying a quoted interpretation value preceded by an ‘equals’ sign links the value to this slot.
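To show where that slot ends up, here is a minimal sketch of a VoiceXML 2.0 dialog that loads the SISR grammar above. The file name mygrammar.grxml, the form id, and the prompt wording are assumptions made for illustration; the one thing that matters is that the field name matches the slot name set in the grammar, so the interpreter can fill the field from the returned slot.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="demo">
    <!-- The field name matches the slot name assigned in the grammar's <tag> -->
    <field name="mySlotName">
      <prompt>Please say utterance.</prompt>
      <!-- mygrammar.grxml is a hypothetical file holding the SISR grammar above -->
      <grammar src="mygrammar.grxml" type="application/srgs+xml"/>
      <filled>
        <!-- Reads back the interpretation value returned for the slot -->
        <prompt>The grammar returned <value expr="mySlotName"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```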
In addition, we can also specify a “generic” return where no slotname is specified (which comes in handy for subgrammars) by putting $ = "my return value" within the <tag> element. If we want to get really fancy, we can even specify multiple slots to return back to the dialog by inserting a “;” delimiter between the slot/interpretation pairings. A sample multislot return with an “anonymous” slot also defined might look something like this:
<item>
  utterance
  <tag>
    $ = "my anonymous slot value";
    $.mySlotName1 = "my slot 1 return"; $.mySlotName2 = "my slot 2 return";
  </tag>
</item>
As you can see, the SISR returns are more concise, easier to read, and much more lightweight than Nuance-specific returns. And once you write a grammar using SRGS and SISR, then any Certified Compliant VoiceXML platform will run these grammars without any porting at all being required.
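For the multislot return shown earlier, a dialog could pick up both named slots with a form-level grammar. This is only a sketch under a few assumptions: multislot.grxml is a hypothetical file containing a grammar with that multislot item, and the form id and prompts are invented for the example. How the “anonymous” value is surfaced alongside named slots can vary by platform, so it is left out here.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="multislot">
    <!-- Form-level grammar: one utterance can fill several fields at once -->
    <grammar src="multislot.grxml" type="application/srgs+xml"/>
    <initial name="start">
      <prompt>Please say utterance.</prompt>
    </initial>
    <!-- Field names match the slot names assigned in the grammar's <tag> -->
    <field name="mySlotName1"/>
    <field name="mySlotName2"/>
    <!-- Runs once both fields have been filled from the returned slots -->
    <filled mode="all">
      <prompt>
        Slot one is <value expr="mySlotName1"/>,
        and slot two is <value expr="mySlotName2"/>.
      </prompt>
    </filled>
  </form>
</vxml>
```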
If you found this posting useful, then let us know! Mayhap we will dig deeper into this the next time, and whip out some more complex subgrammars to better illustrate the usage of SISR formatting within your IVR applications.
Till next time!
~Matthew Henry
January 29th, 2008 at 10:41 pm
Matthew,
Great post, I think compliance with W3C’s standards is a must. That said, you use the old SISR syntax for specifying return values and accessing the values of referenced rules.
Also, is it really true that grammars written using SISR will run on all Certified Compliant VoiceXML platforms? I may be mistaken, but SISR is not in the scope of the VoiceXML Forum’s certification program.
In my experience, the biggest challenge in porting an application from one platform to another is porting the grammars, tweaking the confidence/rejection thresholds, phonetic dictionaries, etc. There are other issues, as well, but from a performance point of view, grammars are the most important.
What do you think?
January 30th, 2008 at 11:15 am
Dominique,
You raise some very valid points here, specifically in regards to certified-compliant platforms. You are correct in that 100% compliance with the SISR specification isn’t required to be a certified-compliant platform, yet partial compliance with the SRGS specification is at least hinted at. The 800+ individual tests for compliance use SRGS-formatted grammars, but specific adherence to the SISR specification isn’t really tested.
Click any of the links in Voxeo’s CTR report to get a feel for exactly what is, and isn’t, tested in terms of grammar formats; the test numbers listed will allow you to view the VXML and grammars that are used in the compliance test suite:
http://www.voicexml.org/platform_certification/voxeo2/voxeo_prophecy_ctr.html
Lastly, I wholeheartedly agree with you in regards to grammar porting being a very important part of any IVR application (dialog design is the only thing more important, in my estimation), and choosing the right format before you start crafting the utterances & interpretations and subgrammar rules is of the utmost importance.
New Tip/Trick: The Prophecy platform natively translates all grammars into JSGF format before interpretation occurs. If you want a fast and effective translation utility to switch your GSL/SRGS into the JSGF format, just run the application with the debugger open, and then check the logs carefully, as the translated JSGF grammar is listed right there in the logstream!
=^)
~Matt
January 30th, 2008 at 7:57 pm
Thanks for these clarifications!
From your tip/trick, I infer that the platform translates all grammars to JSGF format for your own ASR engine. Do you do the same for Nuance’s one? I guess not, right? Can one use SRGS grammars with SISR tags when using the Nuance ASR? Converting all GSL semantic tags to SISR is certainly feasible, but the converse is not.
Also, do you intend to support the new version of the SISR spec in the near future?
By the way, I installed Prophecy 7.0 last year (the free version) and it worked out of the box. I can’t say the same for most of the other platforms I use on a daily basis. Nice job, guys!
January 30th, 2008 at 10:50 pm
Dominique,
You are indeed correct in that we convert to JSGF as the native format for our own ASR engine. If you are running on Nuance, we will convert to Nuance’s native format as needed, allowing you to use SISR there as well without worrying about which ASR engine is under the covers. This also allows the use of things like GSL on Nuance 9, where they no longer support the “legacy” grammar format.
As for the version of SISR, we actually support both the older $ syntax as shown in Matt’s posting as well as the newer “out” syntax from the W3C Recommendation version. You should be able to use either version in your code and we will attempt to process it correctly. If you have problems with this, please let us know, as it’s a bug and we would love to fix it.
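For comparison, here is a sketch of how the simple grammar from the post might look in the newer “out” syntax. It assumes the SISR 1.0 Recommendation conventions: the tag-format="semantics/1.0" attribute selects script-style tags, version="1.0" is the SRGS version attribute, and “out” stands in for the rule's return value where the older syntax uses “$”. Rule and slot names are carried over from the post; whether a given platform needs every attribute exactly as shown is worth checking against its own documentation.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US"
         version="1.0" root="MYRULENAME" tag-format="semantics/1.0">
  <rule id="MYRULENAME">
    <one-of>
      <item>
        utterance
        <!-- "out" is the rule's result object; this sets one named slot on it -->
        <tag>out.mySlotName = "my return value";</tag>
      </item>
    </one-of>
  </rule>
</grammar>
```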
Thanks for the kind comments and be sure to let us know if there is anything else we can do to help out!
Best regards,
RJ
February 1st, 2008 at 3:41 pm
[...] blog is off to a great start already with Matt Henry’s initial post, “Certified Tech Tip: Using SISR-formatted grammars with Prophecy 8“, where he demonstrates what SISR-formatted grammars are all about. The post attracted some [...]