Archive for the ‘Uncategorized’ Category

Ruby on Rails Example for Tropo Web API without port forwarding!

Friday, February 12th, 2010

Zhao Lu (aka @zlu) has shared a tutorial he wrote using Ruby on Rails and our recently released Tropo Web API. The tutorial shows how to add Tropo features to your Rails application in 15 minutes using our REST/JSON API, with everything deployed to Heroku for easy hosting.

Zhao also shows how to use the Tropo Web API behind a firewall where you cannot open or forward ports. For this he uses Tunnlr to establish a remote SSH tunnel. With this tunnel you obtain a port on Tunnlr’s public IP address that forwards via your SSH tunnel to your application inside your firewall. This is great when you are on a company or university network where opening and forwarding a port through a firewall is not an option.

For more details, have a look at the README. The full source code for the example is available on GitHub at http://github.com/zlu/tropo-tutorial. A big thanks to Zhao for showing how easy it is to use the Tropo Web API!

Fetch the Initial Text of Any SMS or Instant Message on Tropo

Friday, January 29th, 2010

Another of the multi-modal features we added to Tropo is the ability to capture the first text a user sends via SMS or Instant Message. Previously, when a user sent a message to an SMS/IM application you had written, you had no way to capture that first message. Instead you had to send a prompt/ask back to the user to start receiving input, and the first message was lost.

With the new release, this is no longer the case. With the hosted API, the initial text message is stored in the call object available at the start of your script. You may obtain the initial text string as follows (depending on the language):

$currentCall.initialText

# or

currentCall.initialText

If you are using the new Tropo Web API, then you may find the ‘initialText’ value in the session JSON string you first receive with each new session, as follows:

{ "session": { "initialText": "Hello, I would like to ask you about your app." } }
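On the Web API side, pulling the value out of the session JSON could look something like the following. This is only a minimal sketch in plain Ruby: the `initial_text` helper is a hypothetical name, and the only field it relies on is the `initialText` key shown above.

```ruby
require 'json'

# Hypothetical helper: returns the first message of a text session,
# or nil when the session JSON carries no initialText value.
def initial_text(raw_body)
  payload = JSON.parse(raw_body)
  session = payload['session'] || {}
  session['initialText']
end

# The sample body is the session JSON shown above.
body = '{ "session": { "initialText": "Hello, I would like to ask you about your app." } }'
puts initial_text(body)
```

Returning nil rather than raising when the key is missing lets the same handler serve sessions that do not start with a text message.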

Enjoy.

Free-form Text Capture via SMS, Capturing an Address

Thursday, January 28th, 2010

A feature we recently added to Tropo is the ability to capture free-form text from a user when using one of the messaging channels, such as SMS or Instant Messaging. Previously it was required that you prompt the user for input based on a grammar:

answer
ask 'Please enter your ten digit account number.', { :choices => '[10 DIGITS]', :repeat => 3 }
hangup

This made it hard to capture details that are better served by free-form text, such as an address, without building the kind of complex grammars needed for voice recognition. While you may still use the power of grammars for voice and text, you may now obtain free-form text by passing '[ANY]' to the choices option as follows:

answer
result = ask 'Please enter your street address.', { :choices => '[ANY]', :repeat => 3 }
log result.value
hangup

Sending an SMS or Instant Message to the above script will result in a line like this in your log, based on the user’s input (in this case '1234 Brewer St'):

00122   	00-1   	11:40:04 PM   	Call[14157044517->13314659919] : 1234 Brewer St

This scenario will work with both the hosted API and the Tropo Web API. Enjoy!

New Tropo Web API, Conferencing and more

Wednesday, January 20th, 2010

Tropo exists to make it easy for developers to quickly build voice, SMS, and IM applications using the languages and tools they already know. And we’re now making it even easier.

Starting today, in addition to the hosted Tropo we already offer, you can host your applications on any web server and communicate with the new Tropo Web API over web services. Build your app in any programming language you want, with any tools and libraries you want, connecting to your existing databases and business logic.

Here’s how it works.

We give you a phone number, and you give us a URL somewhere on the web where your app is located. When that phone number is called, our servers answer the call and make an HTTP POST to your URL, sending you a wealth of JSON metadata about the session. Caller ID, a session ID, that sort of thing. You return some JSON telling us what to do with the call. You’ve got complete control of the call. Say something with text to speech, play an audio file, start recording, ask a question and get the answer with voice recognition, even put the caller into a conference with other callers.

Conference? Yeah, that’s new, too. Build a conference call app in just a few lines of code. Create as many rooms as you want, put users into rooms, take them out of rooms. Mute them. Again, you’re in full control.

How easy is it to create a conference call? Here’s a conference call app:

{ "tropo":
 [{
 "conference": {"id": "myconferencecall" }
 }]
}

Save this code as conference.json on your server and point a Tropo app at it. Have all your friends call it and they can chat together. You can have more control if you want, of course. Ask for PINs. Put callers in rooms based on their caller ID. Mute callers so they can only listen, and hold a seminar. Whatever you can dream up.
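As a rough illustration of choosing a room by caller ID, here is a small Ruby sketch that builds the same conference JSON shown above. The `ROOMS` table, the phone numbers in it, and the `conference_response` helper are all hypothetical, and the sketch assumes the caller ID arrives in the session JSON under `session` → `from` → `id`; check Tropo Docs for the actual field layout.

```ruby
require 'json'

# Hypothetical mapping from caller ID to conference room name.
ROOMS = {
  '14155551212' => 'sales',
  '14075550100' => 'support'
}.freeze

# Builds the JSON reply that places the caller into a conference room.
# Assumption: the caller ID is found at session -> from -> id.
def conference_response(raw_session, default_room = 'myconferencecall')
  session = JSON.parse(raw_session).fetch('session', {})
  caller_id = session.dig('from', 'id')
  room = ROOMS.fetch(caller_id, default_room)
  JSON.generate('tropo' => [{ 'conference' => { 'id' => room } }])
end

puts conference_response('{ "session": { "from": { "id": "14155551212" } } }')
```

Unknown callers fall through to the default room, so the handler always returns a usable response.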

Don’t like JSON? We’ve got a Tropo Ruby library (available on Gemcutter) that will create it all for you. Write your app using Ruby that looks and feels like a Ruby app should. Stick it on your server. We’ll do all the hard stuff. Libraries for other languages are coming. And if you write one for your favorite language, let us know and we’ll link it up.

It’s not just about voice.

With Tropo, the exact same code that makes and receives voice calls can send and receive instant messages and SMS. We’ve improved our IM and text message support. The conversation flow is even more natural now. You’ll get the first message just like any other, without having to jump through hoops to start a conversation. Want to know who you’re talking to? We provide you with the IM name or SMS number that contacted you and you can easily distinguish between voice sessions, text sessions and even know if the user contacted you via AIM, GTalk, Yahoo or other IM networks.

With today’s release, we’re continuing Tropo’s mission of making it easy for any developer to add real-time communications to any application. You can get the full documentation on building apps from Tropo Docs. Go put an app up and try it out. We’d love to hear from you as you create your apps and find out what great stuff you’re creating. Leave a comment here, email support@tropo.com or catch us on Twitter as @tropo.

AwayFind profiled

Wednesday, January 6th, 2010

Tropo customer AwayFind has a great writeup on GigaOm’s Web Worker Daily. The WWD post was also picked up by the New York Times, Salon, and others.

AwayFind lets you keep getting critical email while freeing you from checking your inbox a billion times each day. It automatically categorizes and filters your mail, then uses other communications channels to notify you of what’s important.

Depending on whether an email falls into the specific categories you choose, you can have a message sent to you by text message, IM or Twitter DM. You can even automatically direct messages to someone else if need be — if, for instance, a client needs help, you can have the message forwarded to technical support without needing to even see the email at all. And if you’re not near your computer, you can even arrange for AwayFind Orchant to call you with a particularly urgent message and read it off to you.

Nice work by Jared and the rest of AwayFind.

Advanced grammar topics for Tropo

Thursday, December 17th, 2009

This is a guest post from Dominique Boucher, Product Manager at Nu Echo.


Tropo recently added support for SRGS and JSGF grammars. Mike Thompson and Jason Goecke wrote about that a few weeks ago. In this post, I will go one step further and show some more advanced tools in the voice recognition ecosystem.

Authoring grammars

Once you’ve decided to add SRGS grammars to your Tropo application, one question arises: which format will you use to author them? XML or ABNF?

XML has the advantage of being supported by most major recognition engines on the market. It is often the native format for the engine. On the other hand, ABNF is much more compact and readable than the XML equivalent. For example, here is the same yes/no grammar expressed in both ABNF:

yesno-abnf

and XML:

yesno-xml

I don’t know about you, but I much prefer the former.

At SpeechTEK’09, in August, Voxeo announced a partnership with Nu Echo to bundle NuGram IDE Basic Edition with the VoiceObjects Developer Edition. NuGram IDE is a complete environment for developing, debugging and testing voice recognition grammars in the ABNF syntax. (The basic edition, which is free of charge, can also be installed separately, directly from Eclipse.) With NuGram IDE, you write and test ABNF grammars on your desktop, without requiring a voice recognition engine. Once you are satisfied with your grammars, you integrate them in your application. And if you prefer to use their XML counterpart, just let NuGram IDE do the grunt conversion work for you.

Dynamic grammars

Most grammars used by voice applications are static grammars. By this, I mean that they do not change over time, nor do they depend on contextual, call-specific data.

Sometimes, however, the application needs to generate grammars on-the-fly. We call them dynamic grammars. Consider a voice-dialing application. The application identifies the caller, looks in the caller’s contact list, and asks for the name of one of his contacts. The grammar to use in the last step depends on the content of the caller’s profile.

But how are dynamic grammars served to the application? Well, usually a dedicated web application will be responsible for that. This can be a JSP or ASP page, a Ruby on Rails app, etc. In all these cases, a web application must be developed and made accessible to the Tropo runtime.

Another solution is NuGram Hosted Server (or NHS). It’s a free hosted platform specifically designed to serve dynamic grammars to cloud-based communication applications. So it nicely complements Tropo. All you have to do is create your dynamic grammar templates and push them to your NHS account once you have registered, all from within NuGram IDE (publishing grammars is done using a single keystroke — Alt-Ctrl-Shift P — from the IDE).

The dynamic grammar templates are expressed using a few extensions to the ABNF syntax. For example, the template for the voice-dialing grammar would look like this:

voicedialer-abnf

A tutorial describing the various templating directives is available on the NuGram website.

The client API

Generating a dynamic grammar from a template (instantiating a grammar) involves sending some data (the instantiation context) to NHS using an HTTP-based API. Fortunately, higher-level client APIs are available on Github in a variety of programming languages. (All you have to do is include the code of the API at the top of your Tropo application.)

To illustrate, here is a prototype voice dialing application that instantiates the grammar template above and uses the URL of the generated grammar in XML form:

voicedialer-rb

Lines 10-11 simply create a new session with NuGram Hosted Server. Line 19 retrieves the contacts for the current caller. Finally, line 26 instantiates the grammar with the contacts template and retrieves the URL of the generated grammar in XML form.

The get_contacts function simply returns a list of hashes of the form {'firstname' => "first name", 'lastname' => "last name", 'extension' => "phone number"}, one for each of the caller’s contacts. In our demo, the function makes a request to a web application that formats the contacts as a JSON string, and converts the data to a plain Ruby data structure:

getcontacts-rb

Of course, in a real application, the data could be fetched from a Web service, a database, etc.
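For readers who want a feel for the conversion step, here is a minimal sketch of turning a JSON string into that list of hashes. It is not the code from the demo: `parse_contacts` is a hypothetical name, the sample data is made up, and the sketch assumes the JSON is an array of objects using the same three keys described above.

```ruby
require 'json'

# Hypothetical helper: converts a JSON array of contacts into the list of
# hashes described above. Assumes each JSON object uses the keys
# 'firstname', 'lastname' and 'extension'.
def parse_contacts(json_string)
  JSON.parse(json_string).map do |contact|
    {
      'firstname' => contact['firstname'].to_s,
      'lastname'  => contact['lastname'].to_s,
      'extension' => contact['extension'].to_s
    }
  end
end

# Made-up sample data for illustration only.
sample = '[{"firstname": "Jane", "lastname": "Doe", "extension": "1234"}]'
contacts = parse_contacts(sample)
puts contacts.first['extension']
```

Calling `to_s` on each value keeps the result uniform even if the web application sends extensions as numbers.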

That’s it! The code of the whole application is also available on Github. The Ruby application called by the get_contacts function is a web application based on Sinatra and is readily deployable on Heroku.

If you have any questions or comments, please leave a comment.

Powerful Speech-Driven Tropo Applications

Monday, September 28th, 2009

In previous weeks, Jason Goecke posted about how to use Tropo’s Simple Grammar Engine to do some simple voice recognition in your applications. In today’s post, I will show you how to take that a step further and implement some industry-standard grammars and interpretation mechanisms. These grammar types allow Tropo to use the same advanced level of speech recognition you might use or expect in VoiceXML applications today.

Before we get started with the examples, here is a list of the types of grammars (and return styles) which will be available to you:

SRGS (Also referred to as grXML)
SISR (Semantic Interpretation for Speech Recognition)
GSL
ABNF

GSL syntax is not considered to be a W3C-compliant syntax for grammars, and Nuance has discontinued support for GSL grammars in their most recent product offerings. Tropo will continue to support GSL-specific markup for some time to come, but it is strongly suggested that new applications and their associated grammars leverage the SRGS + SISR grammar syntaxes instead of being reliant upon the deprecated GSL grammar format.

That said, the example in this post uses an SRGS grammar with SISR returns. This is 100% W3C compliant, and is the industry standard for grammar development. Let’s start with our grammar:

grammar

Those of you familiar with grammars will likely notice this structure. If not, a great place to get started is here. The above grammar accepts the following utterances:

Red Sox, Boston Red Sox, Yankees, or New York Yankees

Based on the team you choose, you will get some information back about the team. Specifically, the value you would like returned for the team, the league they play in, their division, and standing. The grammar is quite simple, and I made it this way to illustrate the concept of using external grammars with your Tropo applications. Feel free to go as crazy as you want with these grammars.

How does one tie this grammar into a Tropo application? It’s easy! Let’s take a look at a basic Ruby app:

ruby-app

Notice that when we declare our choices within “options”, we simply reference the remote location of the SRGS/GRXML grammar with SISR returns. As soon as the prompt starts, the caller should be able to say any of the above utterances. When the result comes back, you can get the slot values (team, division, standing, etc.) by accessing them directly:

result.choice.tag.get("team")
result.choice.tag.get("division")
result.choice.tag.get("standing")
result.choice.tag.get("league")

That’s it! At this point, you should have the information needed to start developing your own Tropo applications with powerful voice recognition capability. If you have any questions at all, feel free to contact our free 24×7 Support team! We are more than happy to help you with any issues you may encounter!

Latest Tropo Upgrade Completed

Tuesday, September 8th, 2009

We are continuing to evolve Tropo by releasing a new upgrade to the Tropo cloud. This upgrade includes the following:

  • Support for fetching Java Speech Grammar Format (JSGF) and Speech Recognition Grammar Specification (SRGS) files from an external HTTP or FTP server, in addition to the built-in support for Simple Grammar. We are working on a couple of follow-up how-to posts on using these enhanced speech grammar capabilities.
  • When placing an outbound call, you must now include the ‘+’ and country code. A US number would then need to be dialed as ‘+14155551212’ for every outbound call.
  • The ‘#’ symbol on the telephone keypad may now be used to terminate a recording in a Tropo application.
  • Addition of MP3 as an audio file playback format.
  • You may now play touch-tones (DTMF) after a call has connected. To do so, issue a call with these additional parameters: “+14155551212;postd=1234;pause=22000ms”, where ‘postd’ is the digits to be dialed and ‘pause’ is the amount of time to wait after connecting the call before issuing the digits.
  • New accounts will now need to request outbound dialing access from support@voxeo.com. All existing accounts have outbound dialing enabled and will keep it.
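The outbound dial string described in the list above can be assembled with a small helper. This is only a sketch: `dial_string` and its keyword arguments are hypothetical names, and the output format simply mirrors the one example given in the list.

```ruby
# Hypothetical helper: builds an outbound dial string in the
# "+<number>;postd=<digits>;pause=<ms>ms" form shown in the list above.
def dial_string(number, postd: nil, pause_ms: nil)
  # Outbound numbers must include the '+' and country code.
  unless number =~ /\A\+\d+\z/
    raise ArgumentError, "number must start with '+' followed by digits"
  end

  parts = [number]
  parts << "postd=#{postd}" if postd
  parts << "pause=#{pause_ms}ms" if pause_ms
  parts.join(';')
end

puts dial_string('+14155551212', postd: '1234', pause_ms: 22000)
```

The resulting string would then be passed to whatever call function your Tropo script uses; rejecting numbers without the ‘+’ prefix catches the most likely mistake before the call is placed.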

We continue to work on many new features and will roll them out as they become available. Enjoy the new features!

Update to the Tropo FAQ Reveals More of our Java Underpinnings

Monday, August 3rd, 2009

We know it is important to give developers as much information as possible so they can get the most out of Tropo. To this end we have updated the Tropo FAQ, revealing more about the Java underpinnings of Tropo. If you have any questions, do not hesitate to come by and chat with us on IRC (#tropo), Public Skype Chat or the Tropo Forums.