Archive for the ‘Java’ Category

Java Media Control: Using MediaMixer

Monday, May 9th, 2011

Besides using MediaGroup to provide multimedia functions, another common use of a media server is multi-party conference calls. MediaMixer is the object in Java Media Control that provides sound mixing capability.

Overview

Similar to MediaGroup, a MediaMixer is a Joinable MediaObject. To connect an endpoint to a MediaMixer, simply join the endpoint’s NetworkConnection to the MediaMixer object, as illustrated in the following diagram from the JSR 309 specification.

NC1.join(DUPLEX, theMediaMixer);
NC2.join(DUPLEX, theMediaMixer);
NC3.join(DUPLEX, theMediaMixer);

Sample Application

Here is a simple conference application. When a call comes in, the application asks the caller to input a conference number (max 9 digits). Based on the conference number, the application either creates or locates a MediaMixer. It then joins the caller’s NetworkConnection to the MediaMixer.

The following is a simplified sequence diagram that shows the interaction among different objects.

Participant Object

The Participant is an abstraction that captures all the signaling and media states and operations on a per-caller (conference participant) basis. This is a design pattern I often use in SIP Servlet and Java Media Control based applications.

When an initial SIP INVITE message arrives, the MainServlet creates a new Participant object. During construction, Participant initializes a MediaSession and creates a NetworkConnection and a MediaGroup from the MediaSession. Then Participant uses the SdpPortManager of the NetworkConnection to negotiate the SDP with the client. See SDP Negotiation for more information about the SdpPortManager.

Collect Conference ID

Once the call is set up, Participant asks the caller to input the conference ID. Instead of using Player to ask the question and SignalDetector to collect the input, I use the PROMPT parameter of the SignalDetector to ask and collect in a single receiveSignals call, as shown below.

 SignalDetector detector = _mg.getSignalDetector();
 Parameters options = _factory.createParameters();
 options.put(SignalDetector.INTER_SIG_TIMEOUT, 5000);
 options.put(SignalDetector.PROMPT, URI.create("data:" + ANNOUCEMENT));
 detector.receiveSignals(9, SignalDetector.NO_PATTERN, RTC.NO_RTC, options);

I also set INTER_SIG_TIMEOUT to 5 seconds. This allows SignalDetector to return a conference ID that is fewer than 9 digits long once the user stops inputting for 5 seconds. Otherwise, SignalDetector will keep waiting until the user inputs 9 digits.
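
The stopping rule can be illustrated without a media server. The following is only a toy model, not how Prism or any JSR 309 implementation works internally: DigitCollectorSketch and collect are hypothetical names, and the input is a list of key presses with arrival times rather than real DTMF events. It assumes collection ends after the maximum number of digits or at the first gap longer than the inter-signal timeout, whichever comes first.

```java
/** Toy model of the stopping rule; hypothetical, not part of JSR 309. */
final class DigitCollectorSketch {

    /**
     * Returns the digits that would be collected, assuming collection ends
     * after maxDigits keys or at the first gap longer than timeoutMillis.
     *
     * @param digits        the keys pressed, in order
     * @param arrivalMillis arrival time of each key, same length as digits
     */
    static String collect(String digits, long[] arrivalMillis, int maxDigits, long timeoutMillis) {
        StringBuilder collected = new StringBuilder();
        long previous = -1; // no key seen yet
        for (int i = 0; i < digits.length(); i++) {
            if (collected.length() == maxDigits) {
                break; // buffer is full
            }
            if (previous >= 0 && arrivalMillis[i] - previous > timeoutMillis) {
                break; // the caller paused too long; return what we have
            }
            collected.append(digits.charAt(i));
            previous = arrivalMillis[i];
        }
        return collected.toString();
    }
}
```

With a 5000 ms timeout, a caller who keys three digits and then pauses for seven seconds ends up in the conference named by those three digits.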

Join the Conference

Once the conference ID is collected, Participant calls MixerManager to create a MediaMixer based on the ID. The MixerManager maintains a mapping between ID and MediaMixer and only creates a MediaMixer when needed.

  public synchronized MediaMixer createMixer(Configuration config, Parameters options, String key) throws MsControlException {
    MediaMixer mixer = _mixers.get(key);
    if (mixer == null) {
      MediaSession session = _factory.createMediaSession();
      mixer = session.createMediaMixer(config, options);
      _mixers.put(key, mixer);
    }
    return mixer;
  }

Once Participant obtains the MediaMixer, it will join its NetworkConnection to the MediaMixer.

  public void join(String id) throws MsControlException {
    _mixer = _mgr.createMixer(MediaMixer.AUDIO, Parameters.NO_PARAMETER, id);
    _id = id;
    _nc.join(Direction.DUPLEX, _mixer);
  }

Leave the Conference

When the caller hangs up, a SIP BYE is received by the MainServlet. The Participant, retrieved from the SipSession, unjoins its NetworkConnection from the MediaMixer.

  public void unjoin() throws MsControlException {
    if (_mixer != null) {
      _mixer.unjoin(_nc);
      _ms.release();
      _mgr.removeMixer(_id);
    }
  }

The MixerManager only removes the MediaMixer when no Participant is joined.

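  public synchronized MediaMixer removeMixer(String key) throws MsControlException {
    MediaMixer mixer = getMixer(key);
    // Nothing to do if the mixer is unknown or still has participants joined.
    if (mixer == null || mixer.getJoinees().length > 0) {
      return null;
    }
    _mixers.remove(key);
    // Fetch the owning session before releasing the mixer itself.
    MediaSession session = mixer.getMediaSession();
    mixer.release();
    session.release();
    return mixer;
  }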
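
The create-on-demand and remove-when-empty lifecycle can also be tried out in isolation. The sketch below is a stand-in, not the MixerManager from the sample: RefCountedRegistrySketch, acquire, and release are hypothetical names, and it keeps an explicit user count per key instead of asking the mixer for its joinees.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical reference-counted registry; a stand-in for the sample's MixerManager. */
final class RefCountedRegistrySketch<T> {

    private static final class Entry<V> {
        final V value;
        int users;
        Entry(V value) { this.value = value; }
    }

    private final Map<String, Entry<T>> entries = new HashMap<>();

    /** Returns the value for key, creating it on first use, and counts one more user. */
    synchronized T acquire(String key, Supplier<T> creator) {
        Entry<T> entry = entries.get(key);
        if (entry == null) {
            entry = new Entry<>(creator.get());
            entries.put(key, entry);
        }
        entry.users++;
        return entry.value;
    }

    /** Counts one user fewer; returns true if the last user left and the entry was removed. */
    synchronized boolean release(String key) {
        Entry<T> entry = entries.get(key);
        if (entry == null || --entry.users > 0) {
            return false;
        }
        entries.remove(key);
        return true;
    }

    synchronized int size() { return entries.size(); }
}
```

Two callers who enter the same conference ID share one value, and the entry disappears only when the second of them releases it.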

SipListener

I implemented a SipListener as both SipSessionListener and SipErrorListener to handle the following events.

  • SipErrorListener.noAckReceived. Because the MediaSession and related resources are created when the INVITE is received, we need to clean them up when call setup fails because no ACK arrives.
  • SipSessionListener.sessionDestroyed. Similarly, we need to clean up the MediaSession when the SipSession is destroyed.

Test Sample

As with the other samples, simply download the WAR and deploy it under the <prism>/apps directory. You can use two softphones dialing into the server to form a two-party conference call. Please note that the Voxeo Prism free trial only comes with 2 media ports. Please contact Voxeo Support to request additional licenses.


Want to learn how Voxeo can help unlock your communications and deliver a better customer experience? Please contact us!

If you found this post interesting or helpful, please consider either subscribing via RSS, becoming a fan on Facebook, or following us on Twitter.


Java Media Control API: Using a MediaGroup

Friday, April 29th, 2011

Upon the completion of SDP negotiation, the media server essentially gets a media pipe to the client, represented as the NetworkConnection. Now, to have some fun, we need to connect that pipe to a multimedia studio that can play, record, and recognize different sounds.

MediaGroup

This multimedia studio is called MediaGroup in Java Media Control API. A MediaGroup contains four Resources: Player, Recorder, SignalDetector, and SignalGenerator. Once a MediaGroup is connected to a NetworkConnection, you can use these Resources to, e.g., play music to the client.

MediaGroup can be created from a MediaSession based on a Configuration and, optionally, Parameters.

MediaSession.createMediaGroup(Configuration);

The Configuration tells the media server which Resources the application might use (or not use), so the media server can allocate appropriate resources for this MediaGroup. Typically you will use one of the pre-defined Configurations.

To use the MediaGroup, simply join the NetworkConnection with the MediaGroup. (See Media Object Composition about the join concept.) Now you can provide multimedia functions for the client by using the different Resources in the MediaGroup.

NetworkConnection nc = ...
MediaGroup mg = session.createMediaGroup(MediaGroup.PLAYER_RECORDER_SIGNALDETECTOR);
nc.join(Joinable.Direction.DUPLEX, mg);
mg.getPlayer().play(URI.create("http://myserver.com/mysong.wav"), RTC.NO_RTC, Parameters.NO_PARAMETER);

Let’s take a look at what you can do with each Resource in the MediaGroup.

Player

Player can be used to play media streams from a URI (or a set of URIs). For example, the following code snippet shows how to play a song from a Web server.

mg.getPlayer().play(URI.create("http://myserver.com/mysong.wav"), RTC.NO_RTC, Parameters.NO_PARAMETER);

If the media server supports text-to-speech, you can use SSML to render text into synthesized speech. For example,

final static String HELLO_WORLD_SSML = URLEncoder.encode("application/ssml+xml, <?xml version=\"1.0\"?><speak><voice>Hello World!</voice></speak>", "UTF-8");
mg.getPlayer().play(URI.create("data:" + HELLO_WORLD_SSML), RTC.NO_RTC, Parameters.NO_PARAMETER);
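
Because URLEncoder.encode(String, String) declares a checked UnsupportedEncodingException, one way to build such a URI is inside a small helper method. This is a minimal sketch: SsmlUriSketch and ssmlDataUri are hypothetical names, and it simply mirrors the encoding convention of the snippet above (media type, comma, space, then the SSML, all URL-encoded); whether a given media server accepts this form is an assumption to verify.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

/** Hypothetical helper that wraps SSML text in a data: URI. */
final class SsmlUriSketch {

    static URI ssmlDataUri(String ssml) {
        try {
            // Same convention as the snippet above: encode the media type and the SSML together.
            String encoded = URLEncoder.encode("application/ssml+xml, " + ssml, "UTF-8");
            return URI.create("data:" + encoded);
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting URI can then be passed to play in the same way as HELLO_WORLD_SSML above.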

You can ask the Player to play multiple items in one call by giving an array of URIs.

To stop playing, call the stop operation, which stops either the item currently being played or all the items in the queue.

You can also pause and resume playback, as well as adjust speed and volume.

Recorder

Recorder can be used to record media to a URI. For example, the following code snippet shows how to record your voice into a file on a Web server.

mg.getRecorder().record(URI.create("http://myserver.com/mysong.wav"), RTC.NO_RTC, Parameters.NO_PARAMETER);

To stop recording, simply call the stop operation.

You can also pause and resume the recording.

SignalDetector

SignalDetector can be used to collect user input based on a set of patterns. For example, the following code snippet tries to receive one DTMF input from the user.

mg.getSignalDetector().receiveSignals(1, SignalDetector.NO_PATTERN, RTC.NO_RTC, Parameters.NO_PARAMETER);

The use of Parameters and pattern labels in Java Media Control API is a bit awkward. I will explain more in a future blog.

Similarly, you can call stop() to stop the receiveSignals operation.

SignalGenerator

SignalGenerator can be used to generate signals into the media server. This is typically used when the DTMF input is received via SIP INFO.

Sample Application

Here is a sample application that will announce “Hello World” after you dial into it. Then it will ask you to input “1” for replay. Any other key input will terminate the call.

To run this sample, assuming you have Voxeo Prism downloaded and installed, simply drop the Tutorial.3.war into the <prism>/apps directory. Once you have Voxeo Prism (both Application Server and Media Server) started, simply dial “sip:user@localhost” from a SIP phone such as SJPhone or XLite.

Source code is included in the sample.




Java Media Control API: Media Object Composition

Wednesday, April 20th, 2011

To continue my Java Media Control blogs, before we can talk about media operations, we first need to talk about the media object composition concept. Media objects are composed together for media processing and functions. E.g. a NetworkConnection, which represents the client, connects to a MediaMixer to join a conference.

Join

A composable media object is called Joinable in Java Media Control API. When a joiner joins with a joinee, the media streams are connected between them. You can even specify the directions of the media streams when joining.

  • DUPLEX means the media streams can flow both ways.
  • RECV means the media streams can flow from joinee to joiner only.
  • SEND means the media streams can flow from joiner to joinee only.
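
The three rules above can be written down as a tiny model. This is a sketch with hypothetical names (DirectionSketch is not the Joinable.Direction enum from the API); it only encodes which way media flows for each direction and what the same join looks like from the joinee's side.

```java
/** Toy model of join directions; hypothetical, it only mirrors the three rules above. */
enum DirectionSketch {
    DUPLEX(true, true),
    RECV(false, true),
    SEND(true, false);

    private final boolean joinerToJoinee;
    private final boolean joineeToJoiner;

    DirectionSketch(boolean joinerToJoinee, boolean joineeToJoiner) {
        this.joinerToJoinee = joinerToJoinee;
        this.joineeToJoiner = joineeToJoiner;
    }

    /** True if media can flow from the joiner to the joinee. */
    boolean joinerSends() { return joinerToJoinee; }

    /** True if media can flow from the joinee to the joiner. */
    boolean joinerReceives() { return joineeToJoiner; }

    /** The same connection as seen from the joinee's side. */
    DirectionSketch fromJoineeSide() {
        switch (this) {
            case RECV: return SEND;
            case SEND: return RECV;
            default:   return DUPLEX;
        }
    }
}
```

For example, when a NetworkConnection joins a MediaGroup with RECV, the MediaGroup is effectively sending to that NetworkConnection.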

Here is a code snippet that shows how to join a NetworkConnection and a MediaGroup.

      NetworkConnection nc = ....
      MediaGroup mg = ....
      nc.join(Joinable.Direction.DUPLEX, mg);

To decompose, you simply unjoin the joinee from the joiner.

Asynchronous Join

join won’t return until the composition is completed. If you don’t want to wait for the completion, you can use the asynchronous joinInitiate method, which simply initiates the composition. A Joinable object that supports asynchronous join must be a JoinEventNotifier so it can send the composition result back as a JoinEvent. This is very similar to the media event model, as illustrated in the following diagram.

Similarly, you can also use asynchronous unjoinInitiate method to start the decomposition and get notified by the event for the decomposition result.
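
The difference between the blocking and the initiate style can be sketched with plain Java concurrency. Everything here is a stand-in: AsyncJoinSketch, JoinResult, and connect are hypothetical names, a CompletableFuture replaces the media server, and a Consumer plays the role of the listener that receives the JoinEvent.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Toy comparison of blocking and initiate-style joins; not the JSR 309 API. */
final class AsyncJoinSketch {

    /** Plays the role of a JoinEvent: it only reports whether the join succeeded. */
    static final class JoinResult {
        final boolean ok;
        JoinResult(boolean ok) { this.ok = ok; }
    }

    /** Stand-in for the work of connecting two media objects. */
    static JoinResult connect() {
        return new JoinResult(true);
    }

    /** Blocking style: returns only when the composition is complete. */
    static JoinResult join() {
        return connect();
    }

    /** Initiate style: returns at once; the result is delivered to the listener later. */
    static CompletableFuture<Void> joinInitiate(Consumer<JoinResult> listener) {
        return CompletableFuture.supplyAsync(AsyncJoinSketch::connect).thenAccept(listener);
    }
}
```

The returned future exists only so a caller or test can wait for the callback; in the real API the notification arrives through the registered listener.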

Re-Join

A pair of joined Joinables can be re-joined, potentially changing the existing media stream directions.

E.g. the following code snippet shows how to change a fully connected NetworkConnection to listen-only mode.

     nc.join(Joinable.Direction.DUPLEX, mg);
     ...
     nc.join(Joinable.Direction.RECV, mg);

Multiple Joins

A media object can be joined to more than one media object to form a graph of composition. However, one important rule here is that any media object, except MediaMixer, can talk to many other media objects but listen to only one.

A simple use case of this is illustrated by the following diagram from Java Media Control API spec.

In this diagram, Fred’s NetworkConnection NC2 joins to both Mark’s NetworkConnection NC1 and a MediaGroup. Please note that, in this case, you cannot use the Player of the MediaGroup, since that would violate the talk-to-many/listen-to-one rule: Fred can only listen to Mark right now.

Join Degradation

Please note in the above example, NC2 is explicitly joined with SEND direction with the MediaGroup. What happens if NC2 joins the MediaGroup with DUPLEX? I.e.

      NC1.join(DUPLEX, NC2);
      NC2.join(DUPLEX, MGASR);

This will cause so-called Join Degradation: the join between NC1 and NC2 will be degraded from DUPLEX to RECV only. I.e. Fred can speak to Mark but cannot hear him. Instead, Fred will hear whatever plays in the MediaGroup. Join Degradation basically means the last joined DUPLEX pair wins, and all the previously joined pairs change according to the talk-to-many/listen-to-one rule.
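
The rule can be replayed with a small bookkeeping model. This is a toy under stated assumptions, not how any media server implements it: JoinDegradationSketch is a hypothetical class, objects are identified by plain strings, directions are passed as the strings "DUPLEX", "SEND", and "RECV", and the MediaMixer exception to the listen-to-one rule is not modeled.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the listen-to-one rule; hypothetical, for illustration only. */
final class JoinDegradationSketch {

    /** For each object, the single object it currently listens to. */
    private final Map<String, String> listensTo = new HashMap<>();

    /** Joins joiner to joinee; direction is given from the joiner's point of view. */
    void join(String joiner, String direction, String joinee) {
        // A re-join replaces whatever flows already exist between this pair.
        listensTo.remove(joiner, joinee);
        listensTo.remove(joinee, joiner);
        if (!direction.equals("SEND")) {
            listensTo.put(joiner, joinee); // joiner now hears joinee; an earlier source is dropped
        }
        if (!direction.equals("RECV")) {
            listensTo.put(joinee, joiner); // joinee now hears joiner
        }
    }

    boolean hears(String listener, String talker) {
        return talker.equals(listensTo.get(listener));
    }
}
```

Replaying the two joins above in this model leaves NC1 hearing NC2 and NC2 hearing only the MediaGroup, which is the degraded state described in the text.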




Java Media Control API: SDP Negotiation

Friday, January 21st, 2011

Before the media server can talk to the client, both sides have to negotiate the media session description. In Java Media Control API, the media session negotiation is based on the SDP offer/answer model.

NetworkConnection in Java Media Control API represents the client connecting to the media server. A NetworkConnection is created when a call is being set up. NetworkConnection contains a Resource called SdpPortManager, which facilitates the SDP negotiation on behalf of the media server.

SdpPortManager provides asynchronous APIs that can generate or process SDP on behalf of the media server. The application must register a MediaEventListener to listen for SdpPortManagerEvent, as discussed in the Event Model. The following table shows the methods and the corresponding returned event types.

Method Name | Description | Event Type
generateSdpOffer | Ask media server to generate a new media session description offer for the client | OFFER_GENERATED
processSdpOffer | Ask media server to process a media session description offer from the client | ANSWER_GENERATED
processSdpAnswer | Ask media server to process a media session description answer from the client | ANSWER_PROCESSED

Typically MsControlFactory is created when the application is initialized, as shown in Driver and Factory. A MediaSession is then created from the MsControlFactory for each call. In simple IVR applications, the MediaSession serves as the factory for NetworkConnection and other MediaObjects during the life cycle of the call. Make sure you release the MediaSession when the call is terminated.

Here is a sequence diagram that shows how the application uses SdpPortManager to do the SDP negotiation when receiving a SIP based call.

If the INVITE comes with session descriptions (i.e. an SDP offer), sequence A1 will be carried out. Otherwise, sequence A2 will be carried out by sending an SDP offer in the 200 OK response.

When the ACK is received with additional session descriptions (i.e. an SDP answer), sequence B is carried out first. Otherwise, the call is established and the application can start playing the media to the client.

Here is the code snippet of how the application handles INVITE and ACK, based on SIP Servlet.

  @Override
  protected void doInvite(SipServletRequest req) throws ServletException, IOException {
    if (req.isInitial()) {
      try {
        MediaSession ms = _msFactory.createMediaSession();
        NetworkConnection nc = ms.createNetworkConnection(NetworkConnection.BASIC);
        link(req, ms, nc); // associate SipSession, MediaSession, and NetworkConnection together
        SdpPortManager mgr = nc.getSdpPortManager();
        mgr.addListener(new SdpListener(req));
        final byte[] sdpOffer = req.getRawContent();
        if (sdpOffer == null) {
          mgr.generateSdpOffer();
        }
        else {
          mgr.processSdpOffer(sdpOffer);
        }
      }
      catch (MsControlException e) {
        throw new ServletException(e);
      }
    }
  }

  @Override
  protected void doAck(final SipServletRequest req) throws ServletException, IOException {
    final SipSession ss = req.getSession();
    final NetworkConnection nc = getNetworkConnection(ss); // retrieve NetworkConnection from the session
    final byte[] remoteSdp = req.getRawContent();
    try {
      if (remoteSdp != null) {
        nc.getSdpPortManager().processSdpAnswer(remoteSdp);
      }
      else {
        play(ss);
      }
    }
    catch(MsControlException e) {
      throw new ServletException(e);
    }
  }

Here is the code snippet of how the application handles SdpPortManagerEvent.

  class SdpListener implements MediaEventListener<SdpPortManagerEvent> {
    SipServletRequest _invite;
    SipSession _session;

    SdpListener(SipServletRequest invite) {
      _invite = invite;
      _session = invite.getSession();
    }

    @Override
    public void onEvent(SdpPortManagerEvent event) {
      try {
        EventType type = event.getEventType();
        if (event.isSuccessful()) {
          if (SdpPortManagerEvent.ANSWER_GENERATED.equals(type) ||
              SdpPortManagerEvent.OFFER_GENERATED.equals(type)) {
            final SipServletResponse resp = _invite.createResponse(SipServletResponse.SC_OK);
            resp.setContent(event.getMediaServerSdp(), "application/sdp");
            resp.send();
          }
          else if (SdpPortManagerEvent.ANSWER_PROCESSED.equals(type)){
            play(_session);
          }
        }
        else {
          if (SdpPortManagerEvent.ANSWER_GENERATED.equals(type) ||
              SdpPortManagerEvent.OFFER_GENERATED.equals(type)) {
            _invite.createResponse(SipServletResponse.SC_SERVER_INTERNAL_ERROR).send();
          }
          else if (SdpPortManagerEvent.ANSWER_PROCESSED.equals(type)){
            _session.createRequest("BYE").send();
          }
        }
      }
      catch (final IOException e) {
        _session.invalidate();
      }
    }
  }

In the next Java Media Control API blog, I will show you how to play some music for the client once the SDP is successfully negotiated.




Java Media Control API: Event Model

Sunday, January 16th, 2011

Most of the Java Media Control APIs are asynchronous, and Java Media Control applications are typically event driven. So it is important to understand Java Media Control API’s event model.

There are three entities in the event model, as illustrated in the diagram. MediaEvents are sent by MediaEventNotifier to registered MediaEventListeners. A MediaEventNotifier can have multiple MediaEventListeners.

The following table lists the types of MediaEventNotifier in Java Media Control API and the type of MediaEvent each one generates. The last column indicates whether the MediaEventNotifier is a MediaObject or a Resource.

Type of MediaEventNotifier | Type of MediaEvent | Kind
MediaMixer | MixerEvent | MediaObject
Player | PlayerEvent | Resource
Recorder | RecorderEvent | Resource
SdpPortManager | SdpPortManagerEvent | Resource
SignalDetector | SignalDetectorEvent | Resource
SignalGenerator | SignalGeneratorEvent | Resource
VideoRenderer | VideoRendererEvent | Resource
VxmlDialog | VxmlDialogEvent | MediaObject

In Java Media Control API, the event model is strongly typed. MediaEventListener is a generic Java interface. Typically the implementation narrows it down to a particular event type.
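
To show what narrowing a generic listener to one event type looks like, here is a minimal sketch built from stand-in types. Event, Listener, Notifier, and PlayEvent are hypothetical and only mimic the shape of MediaEvent, MediaEventListener, MediaEventNotifier, and PlayerEvent; they are not the JSR 309 interfaces.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in types that mimic the shape of the event model; not the JSR 309 interfaces. */
final class EventModelSketch {

    interface Event {
        boolean isSuccessful();
    }

    /** A listener is typed by the one kind of event it handles. */
    interface Listener<E extends Event> {
        void onEvent(E event);
    }

    /** A notifier is typed by the one kind of event it sends and may have many listeners. */
    static class Notifier<E extends Event> {
        private final List<Listener<E>> listeners = new ArrayList<>();

        void addListener(Listener<E> listener) {
            listeners.add(listener);
        }

        void fire(E event) {
            for (Listener<E> listener : listeners) {
                listener.onEvent(event);
            }
        }
    }

    /** A concrete event type, playing the role of PlayerEvent. */
    static final class PlayEvent implements Event {
        final boolean ok;

        PlayEvent(boolean ok) { this.ok = ok; }

        @Override
        public boolean isSuccessful() { return ok; }
    }
}
```

A Notifier<PlayEvent> accepts only Listener<PlayEvent> instances, so a listener written for another event type is rejected at compile time rather than at run time.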

A successful event indicates that the operation that triggered the event succeeded. Otherwise, a MediaErr can be retrieved from the event. Each event has an EventType to indicate the nature of the event.

The events from a Resource are ResourceEvents. Each ResourceEvent has a Qualifier to provide additional context about the event. ResourceEvent also has a Trigger to indicate how this event is triggered.

In future blogs, I will talk more about each type of MediaEventNotifier, its operations, and corresponding EventType, Qualifier, and Trigger.




Java Media Control API: Driver and Factory

Friday, December 10th, 2010

In the last blog, I gave an overview of the core objects in Java Media Control API, a.k.a. JSR 309. But how do you create these media objects in the first place?

Here is the object ownership in Java Media Control API.

The root object – MsControlFactory – can be obtained from a Java Media Control Driver.

Java Media Control API uses a driver model similar to the JDBC driver model. Different implementations (for different media servers) register themselves with the DriverManager as Drivers. Alternatively, the implementations can be packaged as a Service Provider jar and loaded by the DriverManager automatically.
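
The register-then-look-up-by-name idea behind such a driver model can be sketched in a few lines. This is a stand-in, not the javax.media.mscontrol DriverManager: DriverRegistrySketch and its nested Driver interface are hypothetical, the factory is a plain Object, and automatic Service Provider loading is left out.

```java
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

/** Toy driver registry in the spirit of JDBC; hypothetical, not the JSR 309 DriverManager. */
final class DriverRegistrySketch {

    /** What an implementation contributes: a name and a way to build its factory. */
    interface Driver {
        String name();
        Object createFactory(Properties props);
    }

    private static final Map<String, Driver> DRIVERS = new ConcurrentHashMap<>();

    /** Called by each implementation to make itself known. */
    static void register(Driver driver) {
        DRIVERS.put(driver.name(), driver);
    }

    /** Looks up a driver by name and asks it for a factory configured with props. */
    static Object getFactory(String driverName, Properties props) {
        Driver driver = DRIVERS.get(driverName);
        if (driver == null) {
            throw new IllegalArgumentException("No driver registered as " + driverName);
        }
        return driver.createFactory(props);
    }
}
```

The application depends only on the registry and a driver name, so a different media server can be used by changing that name and its properties.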

DriverManager provides several static functions to get Drivers or MsControlFactory directly.

E.g. to get a MsControlFactory from the Voxeo Prism driver, the following code can be used.

Properties props = new Properties();
props.put(MsControlFactory.MEDIA_SERVER_URI, "mrcp://127.0.0.1");
MsControlFactory factory = DriverManager.getFactory("com.voxeo.Driver_1.0", props);

Or you can simply use dependency injection to let the container create the MsControlFactory automatically.

@Resource MsControlFactory factory;

Once you have a MsControlFactory instance, you can create one or more MediaSessions, each of which is the container and factory for all its MediaObjects, such as NetworkConnection, MediaGroup, and MediaMixer.

Typically the first thing an application will do is to set up a NetworkConnection for each client that is connecting to the media server. I will go over that in detail in the next blog.




Java Media Control API: Overview

Saturday, December 4th, 2010

Java Media Control API, a.k.a. JSR 309, allows Java applications to control the media operations on a media server, e.g. playing a song or setting up a multiparty conference.

source: JSR 309 specification overview document

Java Media Control API is designed to be independent of the signaling protocol, as shown in the above diagram. You can use SIP or other protocols (e.g. Jingle) to manage the communication session. But Java Media Control API assumes the signaling protocol uses the SDP offer/answer model to set up the media streams.

Typically an application sits on a platform with both signaling and media control capability, like Voxeo Prism, as shown on the left.

Java Media Control API is based on an event-driven model. Many operations are asynchronous and require the application to set up a MediaEventListener to listen for the events that report the results of the operations. An application using Java Media Control API typically uses a Finite State Machine (FSM) model to respond to the events.

Operations and events typically happen in different threads. Java Media Control API does not impose any specific threading model, which means threading is implementation dependent. Voxeo Prism provides a thread-safe implementation.
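
As an illustration of the FSM approach, here is a minimal sketch for a call that negotiates SDP, plays a prompt, and then collects one key. The states, the events, and the class name CallFsmSketch are all hypothetical; a real application would translate the events it receives from its listeners into calls to on(...).

```java
/** Hypothetical call states and events for a play-then-collect IVR; names are illustrative only. */
final class CallFsmSketch {

    enum State { NEGOTIATING, PLAYING, COLLECTING, DONE }

    enum Event { SDP_OK, PLAY_COMPLETED, DIGIT_REPLAY, DIGIT_OTHER, FAILED }

    private State state = State.NEGOTIATING;

    State state() {
        return state;
    }

    /** Advances the machine; events that make no sense in the current state are ignored. */
    void on(Event event) {
        if (event == Event.FAILED) {
            state = State.DONE; // any failure ends the call
            return;
        }
        switch (state) {
            case NEGOTIATING:
                if (event == Event.SDP_OK) state = State.PLAYING;
                break;
            case PLAYING:
                if (event == Event.PLAY_COMPLETED) state = State.COLLECTING;
                break;
            case COLLECTING:
                if (event == Event.DIGIT_REPLAY) state = State.PLAYING;
                else if (event == Event.DIGIT_OTHER) state = State.DONE;
                break;
            default:
                break; // DONE is terminal
        }
    }
}
```

Keeping the transitions in one place makes it easier to reason about events that arrive late or out of order, because each one is checked against the current state.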

In Java Media Control API, MediaGroup, MediaMixer, and NetworkConnection are three core MediaObjects you want to get familiar with. These three objects are also Joinables.

Joinable is the key interface for connecting media objects together to process media streams. A Joinable can join with another Joinable in a specific Direction: send only, receive only, or both, as shown in the following diagram.

Before doing any media operations, you need to set up a NetworkConnection for each client that connects to the media server. Setting up the NetworkConnection requires using its SdpPortManager resource to negotiate the SDP between the client and the media server. Once set up, NetworkConnection can be joined with other Joinables to perform media operations for the client.

A simple use case for NetworkConnection is two NetworkConnections joining together to bridge two clients. A NetworkConnection can also join with a MediaGroup, which includes Player, Recorder, and SignalDetector resources. These resources provide typical multimedia functions, such as playing media or text, recording media, and detecting DTMF or speech input.

MediaMixer can mix multiple audio streams together. It is a key component for providing multiparty conferencing. A simple way to set up a conference is to join multiple NetworkConnections to a MediaMixer. In practice, a conference typically plays an announcement and asks for input such as a meeting ID. This requires a MediaGroup to join to the MediaMixer as well.

Another MediaObject is VxmlDialog, which allows the application to start and terminate a VoiceXML based IVR session.

In this blog, we reviewed the core media objects in Java Media Control API. In the next series of blogs, I will talk more about each object and some of the advanced join concepts.




Apple joins Oracle in the OpenJDK project to ensure open source Java for Mac OSX

Friday, November 12th, 2010

Given that we are a Mac-based company and that we release multiple Java-based products on Windows, Linux and MacOS X, I was pleased to see this news release from Apple and Oracle that Apple will be joining the OpenJDK project to ensure that an open source Java implementation will be available for MacOS X. Henrik Ståhl’s Java blog has a bit more information (emphasize “a bit”) and TechCrunch also covered the story this morning.

As noted in Henrik Ståhl’s Java blog, it’s not clear when an OpenJDK implementation for Mac OSX will be out… but it is at least now in the queue. Good news given Apple’s recent deprecation of their own version of the JDK.




Speaking of JSRs

Wednesday, October 8th, 2008

The Micromethod acquisition brings the Voxeo developer community a new set of programming interfaces: Java-based APIs for call control and potentially media control.

These APIs are based on specifications defined by the Java Community Process (JCP). JCP is a process for the Java community to develop specifications for different Java technologies, such as APIs, languages, virtual machines, etc.

A specification typically starts with a Java Specification Request (JSR) by one or more JCP members. Once accepted by the JCP Executive Committee, the JSR is assigned a number, such as JSR 289. The JSR will be developed within the community in the following phases.

JSR timeline

Here is a list of all JSRs that have been developed so far.

SIPMethod Application Server is a JSR-116 (SIP Servlet 1.0) and JSR-289 (SIP Servlet 1.1) based SIP application server. I will talk more about how to develop SIP Servlet based applications in future blogs.

