Posts Tagged ‘ssml’

Fallback to text to speech if your audio fails

Tuesday, June 28th, 2011

Here’s a quick tip that’s an answer to a common question: What happens if my audio file cannot be played?

If you provide the URL to an audio file in Tropo’s say function, Tropo will fetch it and play it. But what if you have a mistake in your URL? Or your server is down? You can use the W3C standard SSML to provide a text to speech (TTS) fallback and prevent your callers from getting dead air.

Say you have an application that starts out like this:

<?php
say('http://example.com/welcome.wav');
?>

Tropo will play welcome.wav if it can reach the file.

Now I’ll use a little SSML to provide a TTS string to play if your file isn’t available:

<?php
say('<speak><audio src="http://example.com/welcome.wav">This text will be spoken if the audio file can not be played.</audio></speak>');
?>
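
If the URL or the fallback text comes from a variable rather than a literal, characters like & or < could break the markup. Here is a minimal sketch of a helper that builds the same SSML string with both values escaped. The name audioWithFallback is made up for this example and is not part of Tropo; only the final say() call would be.

```php
<?php
// Hypothetical helper (not part of Tropo): builds the SSML string that is
// passed to say(). Both values are XML-escaped so characters such as & or <
// cannot break the markup.
function audioWithFallback(string $url, string $fallbackText): string
{
    $src  = htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
    $text = htmlspecialchars($fallbackText, ENT_QUOTES | ENT_XML1, 'UTF-8');
    return '<speak><audio src="' . $src . '">' . $text . '</audio></speak>';
}

// Example values only; the query string and company name are invented.
$ssml = audioWithFallback(
    'http://example.com/welcome.wav?lang=en&v=2',
    'Welcome to Example & Co.'
);
echo $ssml, "\n";
// In a Tropo script you would then call: say($ssml);
```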

Just another example of how Tropo makes the easy stuff easy while still making it possible to do most anything.

Building a full-featured conference call application with Tropo and PHP

Friday, June 24th, 2011

Tropo’s conference function makes it easy for you to join multiple callers onto a single call together. One line of code and you have a conference call…

<?php conference('someid'); ?>

That’s it. A basic conference line. Use that as your application and everyone calling into your number will be joined into a single conference call. How many users? By default, Tropo allows up to 100 users into a single conference. We can do more if you need it, but you might want to think through your use case a bit first. 100 people all talking on one line gets crazy fast.

Now let’s take this basic concept and turn it into a full-fledged application. Our requirements for the conference call start with allowing the entry of a conference ID so that this number can be used for more than one call.

<?php
$response = ask('Enter your conference ID', array(
    'choices' => '[4 DIGITS]'
));
conference($response->value);
?>

We’d also like to allow conference IDs of multiple lengths (instead of just 4 digits), and to let the caller tell us when they’re done with entry by pressing #. How about conference IDs of 3-10 digits?

<?php
$response = ask('Enter your conference ID. Press the pound key when finished.', array(
    'choices' => '[3-10 DIGITS]',
    'terminator' => '#'
));
conference($response->value);
?>
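
If you want to reason about which entries that grammar will take, a rough local equivalent is a regular expression. This is only an approximation for experimenting outside Tropo, assuming '[3-10 DIGITS]' means three to ten digits as described above. The function name looksLikeConferenceId is hypothetical.

```php
<?php
// Local stand-in for the '[3-10 DIGITS]' grammar: true only for strings made
// of 3 to 10 digits. This is an illustration, not Tropo's own matcher.
function looksLikeConferenceId(string $input): bool
{
    // The D modifier stops $ from matching before a trailing newline.
    return preg_match('/^[0-9]{3,10}$/D', $input) === 1;
}

foreach (['12', '1234', '12345678901', '12a4'] as $candidate) {
    echo $candidate, ' => ', looksLikeConferenceId($candidate) ? 'accepted' : 'rejected', "\n";
}
```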

In noisy rooms or offices, background noise sometimes gets misinterpreted as input. Since we’re only accepting numeric conference IDs, let’s turn off speech recognition and restrict input to the keypad.

<?php
$response = ask('Enter your conference ID. Press the pound key when finished.', array(
    'choices' => '[3-10 DIGITS]',
    'terminator' => '#',
    'mode' => 'keypad'
));
conference($response->value);
?>

Let’s tell them if their conference ID was accepted and notify them they’re entering the room.

<?php
$response = ask('Enter your conference ID. Press the pound key when finished.', array(
    'choices' => '[3-10 DIGITS]',
    'terminator' => '#',
    'mode' => 'keypad'
));
say('Conference ID ' . $response->value . ' accepted. You will now be placed into the conference. Please announce yourself.');
conference($response->value);
?>

Hmm. That conference ID is being read as a number (one thousand two hundred thirty four) instead of as digits (one two three four). A little SSML magic will fix that.

<?php
$response = ask('Enter your conference ID. Press the pound key when finished.', array(
    'choices' => '[3-10 DIGITS]',
    'terminator' => '#',
    'mode' => 'keypad'
));
say('<speak>Conference ID <say-as interpret-as="vxml:digits">' . $response->value . '</say-as> accepted. You will now be placed into the conference. Please announce yourself.</speak>');
conference($response->value);
?>
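
Since the same say-as markup gets repeated in every version of the script from here on, it can be pulled into a small function. This is just a sketch: digitsSsml is a hypothetical name, and stripping non-digits is a precaution I added, not something Tropo requires.

```php
<?php
// Hypothetical helper: wraps a keypad entry in the say-as markup used above
// so it is read digit by digit. Anything that is not a digit is dropped
// before the value is placed into the SSML.
function digitsSsml(string $value): string
{
    $digits = preg_replace('/[^0-9]/', '', $value);
    return '<say-as interpret-as="vxml:digits">' . $digits . '</say-as>';
}

$prompt = '<speak>Conference ID ' . digitsSsml('1234') . ' accepted.</speak>';
echo $prompt, "\n";
// In a Tropo script you would then call: say($prompt);
```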

Often I hit the wrong conference ID and need to hang up and try again. For our conference line, how about we let someone exit the room and pick another one?

We’ll add a loop that runs as long as the call is active. Once someone exits a conference, the loop starts over and they’ll be prompted for their conference ID again. We’ll also add a terminator to the conference so someone can exit, and let them know that option is available.

<?php
answer();
while ($currentCall->isActive()) {
	$response = ask('Enter your conference ID. Press the pound key when finished.', array(
	    'choices' => '[3-10 DIGITS]',
	    'terminator' => '#',
	    'mode' => 'keypad'
	));
	say('<speak>Conference ID <say-as interpret-as="vxml:digits">' . $response->value . '</say-as> accepted. You will now be placed into the conference. Press pound to exit without disconnecting. Please announce yourself.</speak>');
	conference($response->value, array('terminator' => '#'));
}
?>

What if someone doesn’t enter a conference ID? Or enters too few or too many digits? Let’s check the result of the ask function to see whether it succeeded, failed, or timed out. On a timeout or failure, we play a message and the loop starts again, asking for the conference ID again.

<?php
answer();
while ($currentCall->isActive()) {
	$response = ask('Enter your conference ID. Press the pound key when finished.', array(
	    'choices' => '[3-10 DIGITS]',
	    'terminator' => '#',
	    'mode' => 'keypad'
	));
	switch ($response->name) {
		case 'choice':
			say('<speak>Conference ID <say-as interpret-as="vxml:digits">' . $response->value . '</say-as> accepted. You will now be placed into the conference. Press pound to exit without disconnecting. Please announce yourself.</speak>');
			conference($response->value, array('terminator' => '#'));
			// Pause a moment before asking for another conference.
			sleep(1);
			break;
		case 'badChoice':
			say('Sorry, that is not a valid conference ID.');
			break;
		case 'silenceTimeout':
		case 'timeout':
			say('Sorry, I didn\'t hear anything.');
			break;
	}
}
?>

Now that we are handling a timeout, the default timeout seems way too long for this type of application. If the user doesn’t enter a conference ID in 8 seconds, prompt again.

<?php
answer();
while ($currentCall->isActive()) {
	$response = ask('Enter your conference ID. Press the pound key when finished.', array(
	    'choices' => '[3-10 DIGITS]',
	    'terminator' => '#',
	    'mode' => 'keypad',
	    'timeout' => 8
	));
	switch ($response->name) {
		case 'choice':
			say('<speak>Conference ID <say-as interpret-as="vxml:digits">' . $response->value . '</say-as> accepted. You will now be placed into the conference. Press pound to exit without disconnecting. Please announce yourself.</speak>');
			conference($response->value, array('terminator' => '#'));
			// Pause a moment before asking for another conference.
			sleep(1);
			break;
		case 'badChoice':
			say('Sorry, that is not a valid conference ID.');
			break;
		case 'silenceTimeout':
		case 'timeout':
			say('Sorry, I didn\'t hear anything.');
			break;
	}
}
?>

What if we want to restrict the conference IDs that someone can use? Maybe create a conference application that uses reservations and creates rooms for people. We could fetch a list of currently-valid conference IDs from a database. In this sample, I’m just going to hard-code those in an array.

<?php
$pins = array();
$pins['1337'] = '';
$pins['1234'] = '';
$pins['2600'] = '';
answer();
while ($currentCall->isActive()) {
	$response = ask('Enter your conference ID. Press the pound key when finished.', array(
	    'choices' => '[3-10 DIGITS]',
	    'terminator' => '#',
	    'mode' => 'keypad',
	    'timeout' => 8
	));
	switch ($response->name) {
		case 'choice':
			// Reject IDs that are well-formed but not in our list.
			if (!array_key_exists($response->value, $pins)) {
				say('Sorry, that is not a valid conference ID.');
				break;
			}
			say('<speak>Conference ID <say-as interpret-as="vxml:digits">' . $response->value . '</say-as> accepted. You will now be placed into the conference. Press pound to exit without disconnecting. Please announce yourself.</speak>');
			conference($response->value, array('terminator' => '#'));
			// Pause a moment before asking for another conference.
			sleep(1);
			break;
		case 'badChoice':
			say('Sorry, that is not a valid conference ID.');
			break;
		case 'silenceTimeout':
		case 'timeout':
			say('Sorry, I didn\'t hear anything.');
			break;
	}
}
?>

Now for a fun feature. Let’s let the owner of a conference ID (the room administrator) get a text message any time someone joins their conference room. Never miss a conference call again. And for kicks, send a text message when they leave, too.

<?php
// An array of conference IDs and phone numbers to alert.
// If a conference ID is used that has a phone number attached,
// when someone joins or leaves that conference, the attached phone
// number will get an SMS alerting them.
$pins = array();
$pins['1337'] = '14075551212';
$pins['1234'] = '19255556789';
$pins['2600'] = '';
answer();

while ($currentCall->isActive()) {
  $response = ask('Enter your conference ID. Press the pound key when finished.', array(
      'choices' => '[3-10 DIGITS]',
      'terminator' => '#',
      'mode' => 'keypad',
      'timeout' => 8
  ));
  switch($response->name) {
    case 'choice':
      if (!array_key_exists($response->value, $pins)) {
        say('Sorry, that is not a valid conference ID.');
        break;
      }
      if (array_key_exists($response->value, $pins) && !empty($pins[$response->value])) {
        // Send an alert that someone has entered the conference
        message($currentCall->callerID . ' has entered conference ' . $response->value, array('to' => $pins[$response->value], 'network' => 'SMS'));
      }
      say('<speak>Conference ID <say-as interpret-as="vxml:digits">' . $response->value . '</say-as> accepted. You will now be placed into the conference. Press pound to exit without disconnecting. Please announce yourself.</speak>');
      conference($response->value, array('terminator' => '#'));
      if (array_key_exists($response->value, $pins) && !empty($pins[$response->value])) {
        // Send an alert that someone has left the conference
        message($currentCall->callerID . ' has left conference ' . $response->value, array('to' => $pins[$response->value], 'network' => 'SMS'));
      }
      // Pause a moment before asking for another conference.
      sleep(1);
      break;
    case 'badChoice':
      say('Sorry, that is not a valid conference ID.');
      break;
    case 'silenceTimeout':
    case 'timeout':
      say('Sorry, I didn\'t hear anything.');
      break;
  }
}
?>
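
The $pins array is doing double duty in that script: a missing key means the room doesn’t exist, an empty string means the room is valid but nobody gets alerted, and anything else is the number to text. Here is a sketch of that lookup pulled out into one function, which also removes the repeated array_key_exists check. The name alertNumberFor is hypothetical, and the phone number is the sample one from above.

```php
<?php
// Same shape as the $pins array above: conference ID => admin phone number,
// with an empty string meaning "valid room, nobody to alert".
$pins = array(
    '1337' => '14075551212',
    '2600' => '',
);

// Hypothetical helper: returns the number to alert, or null when the room
// is unknown or has no number attached.
function alertNumberFor(array $pins, string $conferenceId): ?string
{
    if (!array_key_exists($conferenceId, $pins) || $pins[$conferenceId] === '') {
        return null;
    }
    return $pins[$conferenceId];
}

foreach (array('1337', '2600', '9999') as $id) {
    $number = alertNumberFor($pins, $id);
    echo $id, ' => ', $number === null ? 'no alert' : 'alert ' . $number, "\n";
}
```

Note that this only decides whether to send a message. Whether the room is valid at all is still the array_key_exists check in the script.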

And there’s a full-featured conference calling application in about 50 lines of code. We started with a basic one-liner and slowly added features to build a conference calling service that rivals the professional conference applications. The full code is available from the Tropo Samples repository on GitHub. The download on GitHub has some additional configuration, like allowing you to set a voice for all the prompts and to turn on and off the feature that restricts this to known conference IDs.

How robust is the code? This exact application is what we use every day as our conference calling line. Calls with customers, sales prospects, and each other all happen over a conference line backed by the code in this tutorial.

Teaching Your Application to Really Talk

Friday, March 26th, 2010

Speech synthesis, otherwise known as Text to Speech (TTS), is a technology that quickly synthesizes a human voice using text as input. Speech synthesis is the default behavior for voice calls on the Tropo platform. The Tropo ‘say’ verb provides the TTS capability by taking a string of text and speaking it back. This verb can also take a URL to a ‘wav’ or ‘mp3’ file to play pre-recorded audio.

When it comes to teaching your application to speak we follow the Perl ethos of making “the simple things easy and difficult things possible”. So your application may speak very well with the simplicity of our APIs, or it may be as sophisticated and emotional as you like through Tropo exposing powerful capabilities for giving your voices character.

For our first example we will simply say:

say 'I like squirrels!'

Which then renders this audio.

Next, we may choose a voice that speaks any of the languages supported by Tropo (US/UK English, Castilian/Mexican Spanish, French, German, Italian & Dutch). Let’s give French a try for our next example:

say "J'aime les écureuils!", :voice => 'florence'

Which then renders this audio.

Those were the simple examples that anyone may use to add a little speech to their applications. But remember, we also make the difficult possible for those who really want to make their characters speak, because sometimes simply customizing the voice is not enough. There are cases when you’d also like control over pitch, volume and intonation. Tropo natively supports a standard called the Speech Synthesis Markup Language (SSML).

The Speech Synthesis Markup Language (SSML) is a W3C standard for controlling the pace, tone, pitch and all-around sound of computer-generated voices. Here’s a Ruby script that says the same sentence four times, slowing down with each repetition:

answer 
say "<speak> I like squirrels!. 
I <prosody rate='-10%'>like squirrels!</prosody> 
I <prosody rate='-30%'>like squirrels!</prosody>  
I <prosody rate='-50%'>like squirrels!</prosody> 
</speak>"  
hangup

Which renders this audio. The previous example used the rate attribute of the SSML prosody element to control the playback speed. There are many other elements and attributes you may use, including emphasis and phoneme. To learn more about SSML and related technologies, check out the W3C site at http://www.w3.org/TR/speech-synthesis/.
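
If you want more steps, or a different phrase, the markup can be generated instead of typed out. Here is a minimal sketch in PHP, the language used in the other posts on this page. The function name slowingSsml and its parameters are invented for this example, and the output is a plain string you would hand to say.

```php
<?php
// Hypothetical helper: says the lead word and phrase once at normal speed,
// then once more for each rate change, wrapping the phrase in a prosody
// element with that rate.
function slowingSsml(string $lead, string $phrase, array $rates): string
{
    $parts = array($lead . ' ' . $phrase);
    foreach ($rates as $rate) {
        $parts[] = $lead . " <prosody rate='" . $rate . "'>" . $phrase . '</prosody>';
    }
    return '<speak>' . implode(' ', $parts) . '</speak>';
}

echo slowingSsml('I', 'like squirrels!', array('-10%', '-30%', '-50%')), "\n";
```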

If you would like to call in and listen to these examples live, you may do so by dialing +990009369991429940 on Skype (free) or calling +1.408.940.5920 from any phone. What are you waiting for? Get started by signing up for an always free developer account at Tropo.com.