Back in the summer of 2008, I gave a presentation at O’Reilly’s Open Source Convention (OSCON) titled “Mashing Up Voice and the Web Using Open Source and XML”, in which I talked primarily about integrating voice with the Identi.ca microblogging service. While I made the slides from that talk available previously, I only made one of my mashup demos available here in this Voxeo Developer’s Corner blog (Demo #1: Is Twitter Down?). I want to change that and start making some more of the demos available.
In my Demo #2, Listening to Identi.ca, I created a VoiceXML application that does the following:
- Asks the caller if they want to hear:
  - the latest Identi.ca message from the people they follow
  - the latest reply to them
  - the latest public Identi.ca message
- Uses speech recognition to interpret the caller’s response
- Retrieves the requested information from Identi.ca
- Relays the information to the caller
Now there is the caveat that this demo is hardcoded to a single identi.ca user (namely me – identi.ca/danyork). You can try it out yourself by calling any of these numbers:
If you would like to try out the code below with your own Identi.ca account, all you need to do is create a free developer account on our Evolution developer portal, create the VoiceXML file and assign it a phone number. (Step-by-step instructions are available.) You can also download a free copy of our Prophecy software, install it on a local system and set up this code there.
With that, let’s jump into the code. The full VoiceXML source code is available further down for you to copy and paste, but first I’m going to walk through it piece by piece.
First we have the standard start of a VoiceXML file and the definition of a variable that is going to be used to store the results of the request to Identi.ca:
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
  <var name="MyData"/>
Next we start the <form> element and begin by defining a JavaScript function that retrieves the text we want from the XML data sent back to us from Identi.ca:
  <form id="F1">
    <script>
      <![CDATA[
        function GetData(d,t,n) {
          return (d.getElementsByTagName(t).item(n).firstChild.data);
        }
      ]]>
    </script>
I can’t claim credit for the JavaScript – it was something I found in one of the VoiceXML tutorials we have available. Basically it searches the received XML for elements named “t”, goes to the element at index “n” (counting from zero) and returns the text inside it.
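One thing worth knowing about GetData as written: if the tag is missing, the index is out of range, or the element is empty, the script fails instead of returning something speakable. The following is a minimal sketch of a more forgiving variant. The names GetDataSafe and fakeDoc, and the fallback parameter, are illustrative assumptions and not part of the original demo or of any VoiceXML API; fakeDoc only imitates the few DOM calls that GetData relies on so the function can be tried outside a VoiceXML browser.

```javascript
// Hypothetical guarded variant of GetData: returns a fallback string
// instead of failing when the tag, the index, or the text node is missing.
function GetDataSafe(d, t, n, fallback) {
  var node = d.getElementsByTagName(t).item(n); // null when n is out of range
  if (!node || !node.firstChild) {
    return fallback;
  }
  return node.firstChild.data;
}

// Stand-in for the DOM object that <data> provides (an assumption for
// illustration only). It maps a tag name to a list of text values and
// supports just getElementsByTagName, item and firstChild.data.
function fakeDoc(map) {
  return {
    getElementsByTagName: function (tag) {
      var texts = Object.prototype.hasOwnProperty.call(map, tag) ? map[tag] : [];
      return {
        length: texts.length,
        item: function (i) {
          if (i < 0 || i >= texts.length) { return null; }
          // An empty element has no text node, so firstChild is null.
          return { firstChild: texts[i] === "" ? null : { data: texts[i] } };
        }
      };
    }
  };
}

// Usage: the second call asks for an index that does not exist.
var demoDoc = fakeDoc({ title: ["Feed title", "A notice"] });
console.log(GetDataSafe(demoDoc, "title", 1, "nothing found"));
console.log(GetDataSafe(demoDoc, "title", 9, "nothing found"));
```

In the VoiceXML file itself only the GetDataSafe function would go inside the <script> element; the fallback text would then simply be read out by the prompt.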
Now we start with a field in the VXML form. Note that I use an audio file that I had previously recorded:
    <field name="Choice">
      <prompt bargein="false">
        <audio src="../audio/identicachoice.wav"/>
      </prompt>
I could have just as easily used Text-To-Speech (TTS) to do the same thing:
    <field name="Choice">
      <prompt bargein="false">
        Welcome to the listen to identi.ca demo. To hear your
        latest message please say "friends". To hear the latest
        reply please say "replies". To hear the latest public
        message please say "public".
      </prompt>
Now I define the “grammar” which is the list of the words that I will accept and that the speech recognition engine will listen for:
      <grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" root="MYRULE">
        <rule id="MYRULE">
          <one-of>
            <item>friends</item>
            <item>replies</item>
            <item>public</item>
          </one-of>
        </rule>
      </grammar>
To finish off the field, I’m going to catch two error cases where either the caller said nothing or did not say one of the three words in the grammar:
      <noinput>
        I did not hear anything. Please try again.
        <reprompt/>
      </noinput>
      <nomatch>
        I did not recognize that word. Please try again.
        <reprompt/>
      </nomatch>
    </field>
With the <field> defined, I move on to define what happens once acceptable input has been received by using the <filled> element. Note that I am using the Choice name that was defined in the field above.
First I check whether the caller said “friends”; if so, I use the <data> element to make a web call out to the Identi.ca site. The results of the web call are stored in the MyData variable, which is then referenced in the <prompt> element:
    <filled namelist="Choice">
      <if cond="Choice == 'friends'">
        <data name="MyData" src="http://identi.ca/danyork/all/rss?limit=1"/>
        <prompt>
          Your last notice is from <value expr="GetData(MyData,'dc:creator',0)"/>.
          The notice is: <value expr="GetData(MyData,'title',2)"/>.
        </prompt>
Note that I am using the previously defined GetData JavaScript function to walk the XML tree twice: first to get the person sending the Identi.ca notice and second to get the contents of the notice. Now to make this work, I did have to look at the XML sent back by Identi.ca and figure out which were the appropriate tags and position numbers to use.
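Working out the position numbers by reading raw XML can be tedious, so here is a rough sketch of a helper that lists every occurrence of a tag together with its index. ListTagValues is a hypothetical name, and sampleDoc is a made-up stand-in with invented titles, not actual Identi.ca output; it exists only so the helper can be run outside a VoiceXML browser.

```javascript
// Hypothetical helper: collect the text of every <t> element along with
// its index, so the right n for GetData can be read off directly.
function ListTagValues(d, t) {
  var nodes = d.getElementsByTagName(t);
  var out = [];
  for (var i = 0; i < nodes.length; i++) {
    var child = nodes.item(i).firstChild;
    out.push(i + ": " + (child ? child.data : "(empty)"));
  }
  return out;
}

// Made-up sample data (an assumption, not a real feed): a feed whose
// first two <title> elements describe the feed and the third is a notice.
var sampleTitles = ["Feed title", "Feed image title", "First notice text"];
var sampleDoc = {
  getElementsByTagName: function (tag) {
    var texts = tag === "title" ? sampleTitles : [];
    return {
      length: texts.length,
      item: function (i) {
        return i >= 0 && i < texts.length ? { firstChild: { data: texts[i] } } : null;
      }
    };
  }
};

console.log(ListTagValues(sampleDoc, "title").join("\n"));
```

With a feed shaped like this sample, the notice text would sit at index 2, which is the kind of offset the 'title' calls above account for. The real offsets depend on what each feed actually returns.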
I next do the same thing for ‘replies’ and ‘public’:
      <elseif cond="Choice == 'replies'"/>
        <data name="MyData" src="http://identi.ca/danyork/replies/rss?limit=1"/>
        <prompt>
          Your last reply is from <value expr="GetData(MyData,'dc:creator',0)"/>.
          The reply is: <value expr="GetData(MyData,'title',2)"/>.
        </prompt>
      <elseif cond="Choice == 'public'"/>
        <data name="MyData" src="http://identi.ca/rss?limit=1"/>
        <prompt>
          The last public notice is from <value expr="GetData(MyData,'dc:creator',0)"/>.
          The notice is: <value expr="GetData(MyData,'title',1)"/>.
        </prompt>
      </if>
You will note that I explicitly tested for ‘public’ although I really didn’t need to do so. The grammar only allowed three options, so if it was not one of the first two it would naturally be ‘public’. I could have just used an <else/> here.
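Looking at the three branches side by side, they differ only in the feed URL, the wording and the title index. As a sketch of one possible alternative, those settings could live in a JavaScript lookup table, with anything unexpected falling through to 'public' just as an <else/> would. FEEDS and FeedFor are hypothetical names; the URLs, wording and index values are copied from the branches above. Wiring this into the form would presumably rely on the srcexpr attribute of <data>, which is untested against this demo.

```javascript
// Per-choice settings copied from the three branches of the <if> above.
var FEEDS = {
  "friends": {
    src: "http://identi.ca/danyork/all/rss?limit=1",
    intro: "Your last notice is from",
    label: "The notice is:",
    titleIndex: 2
  },
  "replies": {
    src: "http://identi.ca/danyork/replies/rss?limit=1",
    intro: "Your last reply is from",
    label: "The reply is:",
    titleIndex: 2
  },
  "public": {
    src: "http://identi.ca/rss?limit=1",
    intro: "The last public notice is from",
    label: "The notice is:",
    titleIndex: 1
  }
};

// Hypothetical helper: return the settings for a recognized word.
// Anything other than 'friends' or 'replies' is treated as 'public',
// which mirrors replacing the last <elseif/> with <else/>.
function FeedFor(choice) {
  if (Object.prototype.hasOwnProperty.call(FEEDS, choice)) {
    return FEEDS[choice];
  }
  return FEEDS["public"];
}

console.log(FeedFor("replies").src);
```

The trade-off is that the prompts are no longer spelled out in the markup, which makes the VoiceXML shorter but a little harder to read at a glance.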
Finally I just thank the caller with a final prompt and end the various elements to close off the file:
      <prompt>
        That is all. Thank you for calling in.
      </prompt>
    </filled>
  </form>
</vxml>
FULL SOURCE CODE
For those who want to see the entire source code or copy/paste the code, here it is. You’ll note that I have here the TTS version since you won’t have access to the audio file I made. If you’d like to use a recorded prompt, you can use the code I had above.
Obviously wherever you see “danyork”, you can substitute your Identi.ca user name or that of whomever you want to hear the messages from.
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
  <var name="MyData"/>
  <form id="F1">
    <script>
      <![CDATA[
        function GetData(d,t,n) {
          return (d.getElementsByTagName(t).item(n).firstChild.data);
        }
      ]]>
    </script>
    <field name="Choice">
      <prompt bargein="false">
        Welcome to the listen to identi.ca demo. To hear your
        latest message please say "friends". To hear the latest
        reply please say "replies". To hear the latest public
        message please say "public".
      </prompt>
      <grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" root="MYRULE">
        <rule id="MYRULE">
          <one-of>
            <item>friends</item>
            <item>replies</item>
            <item>public</item>
          </one-of>
        </rule>
      </grammar>
      <noinput>
        I did not hear anything. Please try again.
        <reprompt/>
      </noinput>
      <nomatch>
        I did not recognize that word. Please try again.
        <reprompt/>
      </nomatch>
    </field>
    <filled namelist="Choice">
      <if cond="Choice == 'friends'">
        <data name="MyData" src="http://identi.ca/danyork/all/rss?limit=1"/>
        <prompt>
          Your last notice is from <value expr="GetData(MyData,'dc:creator',0)"/>.
          The notice is: <value expr="GetData(MyData,'title',2)"/>.
        </prompt>
      <elseif cond="Choice == 'replies'"/>
        <data name="MyData" src="http://identi.ca/danyork/replies/rss?limit=1"/>
        <prompt>
          Your last reply is from <value expr="GetData(MyData,'dc:creator',0)"/>.
          The reply is: <value expr="GetData(MyData,'title',2)"/>.
        </prompt>
      <elseif cond="Choice == 'public'"/>
        <data name="MyData" src="http://identi.ca/rss?limit=1"/>
        <prompt>
          The last public notice is from <value expr="GetData(MyData,'dc:creator',0)"/>.
          The notice is: <value expr="GetData(MyData,'title',1)"/>.
        </prompt>
      </if>
      <prompt>
        That is all. Thank you for calling in.
      </prompt>
    </filled>
  </form>
</vxml>
I hope you found this tutorial useful and please feel free to leave your comments, suggestions or questions here. (Including if you think of a better way for me to write my VXML code.)
Meanwhile, as I said before, if you would like to try out this code yourself with your own Identi.ca account, all you need to do is create a free developer account on our Evolution developer portal, create the VoiceXML file and assign it a phone number. (Step-by-step instructions are available.)
Also, if you extend this app and do something interesting with it (for instance, allowing the caller to choose between different Identi.ca accounts) and would be open to sharing what you’ve done, please feel free to email me. I’d love to post some follow-up posts that show what else you can do with VoiceXML and services like Identi.ca.
P.S. Because Identi.ca uses the same style of RESTful API as Twitter, this script can be modified to work with Twitter by simply changing the web call in the <data> element to be for the Twitter API. If I recall correctly, I also had to figure out what tag name and item number were necessary for the GetData function as the XML data returned was different between Identi.ca and Twitter.