Archive for January, 2009

W3C Biometrics Workshop — background

Friday, January 9th, 2009

As I sit here reviewing papers for the upcoming W3C Biometrics Workshop, it occurs to me that I should give some background for this Workshop.

The two standards organizations I’ve had the most involvement with, IETF and W3C, have been adding voice biometrics to their standards in fits and starts for several years now.

MRCPv1, a widely-implemented protocol (API) for interacting with Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) engines, did not contain support for Speaker Identification and Verification (SIV), but one of its extensions did.

Portions of this extension made their way into MRCPv2, the standards-track successor to MRCPv1.

The W3C Voice Browser Working Group briefly considered some SIV markup for VoiceXML 2, but there wasn’t sufficient support at the time.

Now, the Working Group is reviewing SIV features for addition to VoiceXML 3. Although the feature set of a W3C specification is not truly set until the specification reaches the Recommendation Stage, this time there appears to be sufficient interest in adding SIV primitives to the language.

To get more information from the knowledgeable public, W3C is holding a workshop to “identify and prioritize directions for SIV standards work as a means of making SIV more useful in current and emerging markets.” The workshop will be held in early March at SRI in Menlo Park, California.

Although the paper submission deadline has passed, if you were unaware of this workshop and are dying to attend, please email member-siv-submit@w3.org as described in the Call for Participation.




Why VoiceXML 3 is not just VoiceXML 2.2

Friday, January 9th, 2009

The first Working Draft of VoiceXML 3 has finally been published, which presents a wonderful opportunity to explain the work the group has been doing for the past few years.

Why VoiceXML 3?

As with many programming languages, future versions are expected simultaneously to provide new features and to be simpler to use.

VoiceXML 3
- is precisely designed
- is more extensible
- contains new features

I often describe the plan for VoiceXML 3 by analogy to the change from Perl 4 to Perl 5. Perl 4 was feature rich but bloated. The designers of Perl 5 analyzed Perl to determine its “core” and modularized the rest in such a way that the combination of the core and several modules reproduced almost all of the functionality of Perl 4. Even more amazing, the syntax of Perl 5 for the most common use cases was virtually unchanged; only syntactic edge cases needed to change. Rebuilt in this manner, Perl became vastly more extensible while largely retaining its existing functionality and syntax.

The goals for VoiceXML 3 are similar.

VoiceXML 3 began with the functionality of VoiceXML 2. This functionality was split up into logical modules of related functionality. Each module is now being defined in detail, in two pieces: syntax and semantics. The syntax of the module is similar to the syntax for corresponding capabilities in VoiceXML 2, with the functionality and event behavior of the syntax defined in the semantics portion. These modular pieces are collected into profiles that essentially are complete languages.

So VoiceXML 3 now consists of:
- a framework for developing profiles from modules
- an XML-based eventing system
- an eventing system for the semantic descriptions associated with the syntax of each module
- several modules, including new audio control capabilities
- two profile definitions, one emulating VoiceXML 2.1 and one combining the range of functionality available in VoiceXML 3.0
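To make the module-and-profile idea a little more concrete, here is a rough Python sketch of the structure described above. It is only an illustration, not anything taken from the Working Draft: the `Module` and `Profile` classes, the module names, the grouping of elements into modules, and the `example-extension` element are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Module:
    """A unit of related functionality: syntax plus semantics (hypothetical model)."""
    name: str
    elements: frozenset  # syntax: the XML element names this module contributes
    semantics: str       # stand-in for the module's behavioral description


@dataclass
class Profile:
    """A complete language assembled from modules (hypothetical model)."""
    name: str
    modules: list = field(default_factory=list)

    def elements(self):
        # Union of the element names contributed by every module so far.
        result = set()
        for module in self.modules:
            result |= module.elements
        return frozenset(result)

    def add(self, module):
        # Refuse a module whose element names clash with ones already present.
        clash = self.elements() & module.elements
        if clash:
            raise ValueError(f"element clash in {self.name}: {sorted(clash)}")
        self.modules.append(module)
        return self

    def supports(self, element):
        return element in self.elements()


# Illustrative groupings only; the real draft defines its own modules.
prompt = Module("prompt", frozenset({"prompt", "audio"}), "queue and play output")
form = Module("form", frozenset({"form", "field"}), "collect user input")
extension = Module("extension", frozenset({"example-extension"}), "hypothetical new feature")

# One profile covering older functionality, one adding the new module.
legacy = Profile("legacy").add(prompt).add(form)
full = Profile("full").add(prompt).add(form).add(extension)

print(legacy.supports("example-extension"))  # False
print(full.supports("example-extension"))    # True
```

The point of the sketch is the design choice itself: because a profile is just a collection of modules built through one shared mechanism, a new module can be added to a new profile without touching the existing ones.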

One nice thing about the new structure is that new modules and profiles can now be defined, and they should work well with the existing ones as long as they are defined using the framework in the document.

This is of course a first Working Draft, and thus many changes are still possible. If you are interested in participating in the continued development of VoiceXML, please contact any of us in the Working Group and we’ll help you join.

For more info, check out:
- The requirements document
- The specification

