Archive for the ‘MRCP’ Category

New Internet-Draft of MRCPv2 now available for comments

Tuesday, September 8th, 2009

As I’ve written about previously, the Media Resource Control Protocol (MRCP) is currently undergoing revision within the IETF to arrive at a new “MRCPv2”. Voxeo’s Dan Burnett has been editing the draft specification to incorporate the latest rounds of comments and last month released the 20th revision of the Internet-Draft:

http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2

If you aren’t familiar with MRCP, it’s a protocol that allows products such as our Prophecy platform to easily interoperate with Automatic Speech Recognition (ASR) or Text-To-Speech (TTS) engines. You can think of it like this:

[Image: MRCP-simple.jpg]

With MRCP, your application platform can connect to any “MRCP-compliant” speech engine. It’s an open standard that we certainly like because it unlocks our platform and lets you use any of the great number of speech engines supported by Prophecy. We’ve also had customers approach us in the past about using special speech engines – and the open interface of MRCP provides the way in which this can happen.

In any event, MRCPv2 is moving closer to completion – if you have any comments about the latest draft, now is a really good time to send them in to the editors.


If you found this post interesting or helpful, please consider either subscribing via RSS, becoming a fan on Facebook, or following us on Twitter.


W3C Biometrics Workshop — background

Friday, January 9th, 2009

As I sit here reviewing papers for the upcoming W3C Biometrics Workshop, it occurs to me that I should give some background for this Workshop.

The two standards organizations I’ve had the most involvement with, IETF and W3C, have been adding voice biometrics to their standards in fits and starts for several years now.

MRCPv1, a widely-implemented protocol (API) for interacting with Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) engines, did not contain support for Speaker Identification and Verification (SIV), but one of its extensions did.

Portions of this extension made their way into MRCPv2, the standards-track successor to MRCPv1.

The W3C Voice Browser Working Group briefly considered some SIV markup for VoiceXML 2, but there wasn’t sufficient support at the time.

Now, the Working Group is reviewing SIV features for addition to VoiceXML 3. Although the feature set of a W3C specification is not truly set until the specification reaches the Recommendation Stage, this time there appears to be sufficient interest in adding SIV primitives to the language.

To get more information from the knowledgeable public, W3C is holding a workshop to “identify and prioritize directions for SIV standards work as a means of making SIV more useful in current and emerging markets.” The workshop will be held in early March at SRI in Menlo Park, California.

Although the paper submission deadline has passed, if you were unaware of this workshop and are dying to attend, please email member-siv-submit@w3.org as described in the Call for Participation.




New revision of MRCPv2 submitted – allows interop with different ASR/TTS engines

Thursday, November 6th, 2008

With IETF 73 coming up shortly in Minneapolis, those of us here at Voxeo were very busy last week getting our Internet-Drafts updated in time for Monday’s submission deadline. One of the major pieces of work was done by Dan Burnett with his new revision of the Media Resource Control Protocol Version 2 (MRCPv2) draft.

MRCP is actually a fascinating protocol to me (okay, admittedly, I’m a standards geek) in that it provides an open standard that allows a system to very easily interoperate with different “media processing resources” such as Automatic Speech Recognition (ASR) or Text-To-Speech (TTS) engines. This is how, for instance, our Prophecy product is able to easily use different ASR or TTS engines. In a very simplified view, it looks something like this:

[Image: MRCP-simple.jpg]

where the “MRCP Client” is, in our case, Prophecy. Now the cool part about this is that if you need a specific ASR engine for a task and can find an “MRCP-compliant” engine, it should be able to easily interoperate with Prophecy. That helps if, for instance, you need speech rec for a language we don’t support, a special TTS engine or something like that.
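To make the client side of that picture a little more concrete, here is a rough Python sketch of how an MRCP client might frame a single request to a TTS resource. It assumes the request-line layout I understand the MRCPv2 draft to use (version, total message length, method name, request ID), followed by headers, a blank line and an optional body. The function name `build_mrcp_request` and the channel identifier and request ID values are made up for illustration, and none of this is how Prophecy itself is implemented; check the draft before relying on any detail.

```python
def build_mrcp_request(method, request_id, headers, body=""):
    """Frame one MRCPv2-style request as bytes (illustrative sketch only)."""
    body_bytes = body.encode("utf-8")
    all_headers = dict(headers)
    if body_bytes:
        # Content-Length describes the body only, not the whole message.
        all_headers["Content-Length"] = str(len(body_bytes))
    header_block = "".join(
        "{}: {}\r\n".format(name, value) for name, value in all_headers.items()
    ).encode("utf-8")
    rest = header_block + b"\r\n" + body_bytes

    # Assumption: the message-length field counts every byte of the message,
    # including the start line that contains it. Its own digits therefore
    # affect the total, so repeat until the value is self-consistent.
    length = 0
    while True:
        start_line = "MRCP/2.0 {} {} {}\r\n".format(
            length, method, request_id
        ).encode("ascii")
        total = len(start_line) + len(rest)
        if total == length:
            return start_line + rest
        length = total


# Hypothetical values: the channel identifier and request ID are placeholders.
message = build_mrcp_request(
    "SPEAK",
    543257,
    {
        "Channel-Identifier": "32AECB23433802@speechsynth",
        "Content-Type": "text/plain",
    },
    "Hello from the MRCP client.",
)
print(message.decode("utf-8"))
```

The point of the sketch is only that the client speaks one text-based format regardless of which vendor's engine is on the other end. A real client also has to set up the control channel and media session first, which this leaves out entirely.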

Anyway, the new draft of MRCPv2 is out there and goes into this in an extraordinary amount of detail. If you do have any comments, by the way, Dan Burnett is open to hearing them (his email address is at the end of the draft).


P.S. If you’ve used our Realtime Debugger or our Prophecy Log Search feature inside of Evolution, you’ve no doubt seen a bunch of messages related to MRCP – this is all part of the communication between our main execution environment within Prophecy and the various ASR and TTS resources being used to execute your application.



Video: Interview with Dan Burnett on being named 2008 Speech Luminary as “Man of Standards”

Monday, September 8th, 2008

At SpeechTEK in New York City a few weeks ago, our own Dan Burnett was recognized by Speech Technology Magazine as one of the “2008 Speech Luminaries” for all his years of work on industry standards relating to speech. We were delighted for Dan to receive the (well-deserved!) recognition and I had a chance to record a brief video interview with Dan at SpeechTEK:

As Dan mentions, he is Director of Speech Technologies in our Office of the CTO (OCTO), reporting to our CTO, RJ Auburn, and is responsible for looking at how to constantly improve our speech recognition technology and for ensuring it complies with standards.

Congratulations, Dan, on the recognition by Speech Technology Magazine!


P.S. And yes, for those following along at home, Dan Burnett and I were both hired into the OCTO at about the same time… we thought about instituting a rule where all new OCTO employees had to be named “Dan”, but thankfully that rule was ignored with the recent excellent addition of Wei Chen!



Greetings from Dan Burnett

Wednesday, December 12th, 2007

Hi, I’m Dan Burnett. I’ll be posting here occasionally about the speech-related standards in W3C and IETF.

I’m an editor of VoiceXML 2.0/2.1, SSML 1.0/1.1, and MRCPv2, an author of EMMA 1.0, PLS 1.0, SCXML 1.0, and the forthcoming VoiceXML 3, and a contributor to almost every other specification from the Voice Browser and Multi-modal Working Groups.

