Talking to the Cloud, and the Cloud Talking Back

August 28th, 2009 by Jason Goecke

One of the great things about Tropo is that it has speech recognition and text-to-speech engines built right in. This allows a user to speak commands to your voice application, and lets your application respond with dynamically generated content. We make every effort to keep these features robust yet simple for developers to use.

In the first example, we ask the user for their zip code and then play it back to them:

  answer
  options = { 'choices'     => '[5 DIGITS]',
              'repeat'      => 3,
              'onBadChoice' => lambda { say 'That is not a zip code, please try again.'} }
  choice = ask 'Please enter your zip code.', options

  # Add spaces to speak back individual digits, rather than one number
  zipcode = choice.value.split(//).join(' ')
  say "Your zip code is #{zipcode}. Goodbye."

The key here is the ‘choices’ option, which is where we pass a simple grammar describing the input we expect. In this case, [5 DIGITS] tells the speech recognition engine to accept up to five digits, which the user may either enter on their phone’s touch-tone keypad or speak. The response comes back as a string, so we add spaces between the digits so that it is spoken back the way you would read a zip code, digit by digit, rather than as a single number.
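The spacing step can be tried outside of Tropo. Here is a minimal sketch in plain Ruby; `spell_out_digits` is a hypothetical helper name used only for illustration and is not part of the Tropo API.

```ruby
# Hypothetical helper (not part of Tropo): put a space between characters
# so that a text-to-speech engine reads the digits one at a time.
def spell_out_digits(value)
  value.to_s.split(//).join(' ')
end

puts "Your zip code is #{spell_out_digits('80202')}. Goodbye."
```

Inside a Tropo script you would pass `choice.value` to the helper instead of a literal string.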

That covers digits, and of course, you may say, a caller can always use the telephone keypad to enter digits. Now let’s look at asking our customer a question that takes a spoken answer:

  
  answer
  options = { 'choices'     => 'cheese, pepperoni, vegetarian',
              'repeat'      => 3,
              'onChoice'    => lambda { |choice| say "We will send you a #{choice.value} pizza. Goodbye." },
              'onBadChoice' => lambda { say 'We do not have that kind of pizza, please try again.' } }
  choice = ask 'Which pizza would you like to order?', options

In this case we are passing the ‘choices’ option a string listing the phrases the user may speak to give a valid response. When one of them is recognized, the value is populated in ‘choice.value’ and we play it back to the user.
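To see how the callbacks fit together without placing a call, here is a toy simulation in plain Ruby. Everything in it is an assumption made for illustration: `simulate_ask` and `Choice` are hypothetical names, the list of utterances stands in for what the caller says, and treating 'repeat' as a cap on the total number of attempts is a guess rather than documented Tropo behavior.

```ruby
# Toy stand-in for a recognition result (hypothetical, not a Tropo class).
Choice = Struct.new(:value)

# Toy stand-in for Tropo's ask (hypothetical, for illustration only).
# `utterances` plays the role of what the caller says on each attempt.
def simulate_ask(utterances, options)
  valid = options['choices'].split(',').map { |c| c.strip }
  attempts = options['repeat'] || 1 # assumption: caps the total attempts
  utterances.first(attempts).each do |heard|
    if valid.include?(heard)
      choice = Choice.new(heard)
      options['onChoice'].call(choice) if options['onChoice']
      return choice
    end
    options['onBadChoice'].call if options['onBadChoice']
  end
  nil # no valid response within the allowed attempts
end

spoken = [] # collects what the application would have said
options = { 'choices'     => 'cheese, pepperoni, vegetarian',
            'repeat'      => 3,
            'onChoice'    => lambda { |choice| spoken << "We will send you a #{choice.value} pizza. Goodbye." },
            'onBadChoice' => lambda { spoken << 'We do not have that kind of pizza, please try again.' } }

result = simulate_ask(['hawaiian', 'pepperoni'], options)
puts spoken
```

The first utterance triggers the 'onBadChoice' callback and the second triggers 'onChoice', which mirrors the flow of the pizza example above.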

That was simple multiple choice. What if more than one phrase should qualify for a single response?

  
  answer
  options = { 'choices'     => 'denver broncos(broncos, denver, denver broncos), ' +
                               'dallas cowboys(cowboys, dallas, dallas cowboys)',
              'repeat'      => 3,
              'onChoice'    => lambda { |choice| say "Ah, so you like the #{choice.value}, do you? Goodbye." },
              'onBadChoice' => lambda { say 'We do not have that team, please try again.' } }
  choice = ask 'Who is your favorite football team?', options

First off, I am not making any statements about NFL teams here, just shortening the list for brevity. In this case we are passing the ‘choices’ option a string that contains the responses we expect (e.g., denver broncos), each followed by a series of spoken phrases inside parentheses that qualify for it. When one of those phrases is recognized, the corresponding value is populated in ‘choice.value’.
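The mapping from spoken phrase to returned value can be pictured with a small lookup table. The sketch below is a toy parser in plain Ruby for the "value(phrase, phrase)" format shown above. It is not how Tropo implements its grammar engine, `build_phrase_lookup` is a hypothetical name, and it assumes that only the phrases listed inside the parentheses qualify when parentheses are present.

```ruby
# Illustrative only: a toy parser for the "value(phrase, phrase)" choices
# format. This is NOT Tropo's implementation.
def build_phrase_lookup(choices)
  lookup = {}
  # Each entry is either "value" or "value(phrase, phrase, ...)".
  choices.scan(/([^,()]+)(?:\(([^)]*)\))?/) do |value, phrases|
    value = value.strip
    next if value.empty?
    # Without parentheses, the value itself is the only accepted phrase.
    spoken = phrases ? phrases.split(',').map { |p| p.strip } : [value]
    spoken.each { |phrase| lookup[phrase.downcase] = value }
  end
  lookup
end

teams = build_phrase_lookup('denver broncos(broncos, denver, denver broncos), ' +
                            'dallas cowboys(cowboys, dallas, dallas cowboys)')
puts teams['broncos'] # prints "denver broncos"
puts teams['dallas']  # prints "dallas cowboys"
```

The same helper also handles the plain comma-separated list from the pizza example, where each phrase simply maps to itself.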

So what are you waiting for? Start talking with your users. Many more examples, in multiple languages, may be found here.
