Archive for the ‘SSML’ Category

Speech Synthesis Markup Language (SSML) 1.1 Approved by W3C as a Recommendation

Tuesday, September 7th, 2010

Today we were very pleased to see the W3C announce that Speech Synthesis Markup Language (SSML) version 1.1 has been approved as a Recommendation. The specification is available at:

http://www.w3.org/TR/speech-synthesis11/

SSML is used to control how text is rendered as human-like speech. It includes elements for describing the voice, pitch, speed, and other characteristics of human speech needed to ensure proper output prosody and pronunciation.

Requirements for enhancing SSML 1.0 were collected during workshops held in China, Greece, and India. The new SSML 1.1 W3C recommendation enhances SSML 1.0 to provide better support for a broader set of natural (human) languages.

In particular, SSML 1.1 supports:

  • a new registry for pronunciation alphabets, which developers use to describe precisely the pronunciations of words and phrases. An example is pinyin, a common way of writing pronunciations for Mandarin Chinese.
  • the Pronunciation Lexicon Specification (PLS), to allow for standardized, independent collections of pronunciation information that could also be used by speech recognition engines. (more info about PLS)
  • finer author control over voice selection and behavior upon encountering unexpected language content.
  • better token delimiting for languages that (1) do not use white space as a token boundary identifier, such as Chinese, Thai, and Japanese, (2) use white space for syllable segmentation, such as Vietnamese, or (3) use white space for other purposes, such as Urdu.
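
As a rough illustration of the features listed above, an SSML 1.1 document might look like the following sketch. The lexicon URI, the xml:id value, and the sample sentences are assumptions made up for this example. The element and attribute names are taken from the SSML 1.1 specification, but check the specification itself before relying on any of them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US"
       onlangfailure="processorchoice">

  <!-- External PLS lexicon, given an id so it can be referenced
       later (the URI is hypothetical) -->
  <lexicon uri="http://www.example.com/lexicon.pls" xml:id="words"
           type="application/pls+xml"/>

  <!-- Apply the lexicon only to the enclosed text -->
  <lookup ref="words">
    My fiancé lives in Paris.
  </lookup>

  <!-- Inline pronunciation written in a named pronunciation alphabet -->
  The word is <phoneme alphabet="ipa" ph="ˈdʒʌdʒ.mənt">judgment</phoneme>.

  <!-- Voice selection: ask for a female voice, and keep the current
       voice if none matches -->
  <voice gender="female" required="gender" onvoicefailure="keepvoice">
    This sentence asks for a female voice.
  </voice>

  <!-- Foreign-language content: skip the text if the language
       cannot be rendered -->
  <lang xml:lang="zh-CN" onlangfailure="ignoretext">
    <!-- Explicit token boundaries for text written without spaces -->
    <token>南京市</token><token>长江大桥</token>
  </lang>
</speak>
```

The <token> elements matter here because the unsegmented string could be split in more than one way. Marking the boundaries explicitly removes that ambiguity for the synthesizer.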

Much of the work on this specification took place in China, a country with languages that are quite different from European and American languages. This is the first W3C recommendation in which Asians played a major role.

The final version can be found at http://www.w3.org/TR/speech-synthesis11/.

Voxeo’s own Dan Burnett was a co-editor of the SSML 1.1 specification and contributed this to the W3C’s Testimonials page:

SSML is an important part of the overall ecosystem of W3C standards enabling speech across a variety of applications. SSML in particular provides a key way to render richer, more natural sounding speech. We are particularly pleased that SSML 1.1 provides advancements in several key areas, including support for Asian and Eastern European languages as well as improved audio controls for authors. The headway in the Recommendation is the result of the work of the dedicated individuals and companies around the world who value the importance of standards work and support the W3C Voice Browser Working Group. Voxeo is very proud to have been involved in this significant global accomplishment.

We’re pleased to see SSML 1.1 reaching this milestone and congratulate all involved.


Want to learn how Voxeo can help unlock your communications and deliver a better customer experience? Please contact us!

If you found this post interesting or helpful, please consider either subscribing via RSS, becoming a fan on Facebook, or following us on Twitter.


Pronunciation Lexicon Specification reaches Recommendation Status

Thursday, October 16th, 2008

On Tuesday W3C released the Recommendation for the Pronunciation Lexicon Specification (PLS). “Recommendation” is the final step in the W3C standards process.

This specification defines a new markup language for representing pronunciation dictionaries, in which each written word has one or more pronunciations defined for it. An SSML document can reference a PLS document to indicate how certain words should be pronounced, and an SRGS (W3C grammar format) document can reference one to indicate which pronunciations to listen for when matching certain words.

Here is an example:

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>fiancé</grapheme>
    <grapheme>fiance</grapheme>
    <phoneme>fiˈɒns.eɪ</phoneme>
    <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
  </lexeme>
</lexicon>

In this example there are two spellings each for "judgment" and "fiancé". Each <lexeme> lists its pronunciations (in <phoneme>) written in the International Phonetic Alphabet: one for judgment and two for fiancé.
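
To illustrate the speech recognition side, an SRGS grammar could pull in a lexicon like the one above through its own <lexicon> element. This is only a minimal sketch: the URI and the rule name "word" are hypothetical, and a real grammar would usually contain more than a two-word list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0"
         xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" root="word" mode="voice">

  <!-- Reference to the PLS document (hypothetical URI) -->
  <lexicon uri="http://www.example.com/lexicon.pls"
           type="application/pls+xml"/>

  <!-- The recognizer can use the lexicon's pronunciations
       when listening for these words -->
  <rule id="word" scope="public">
    <one-of>
      <item>judgment</item>
      <item>fiancé</item>
    </one-of>
  </rule>
</grammar>
```

With this in place, either pronunciation listed for "fiancé" in the lexicon could match the second item.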

For more info on the specification, see the press release.




Greetings from Dan Burnett

Wednesday, December 12th, 2007

Hi, I’m Dan Burnett. I’ll be posting here occasionally about the speech-related standards in W3C and IETF.

I’m an editor of VoiceXML 2.0/2.1, SSML 1.0/1.1, and MRCPv2, an author of EMMA 1.0, PLS 1.0, SCXML 1.0, and the forthcoming VoiceXML 3, and a contributor to almost every other specification from the Voice Browser and Multi-modal Working Groups.

