
New Tropo Web API, Conferencing and more

Posted on January 20, 2010 by Adam Kalsey

Tropo exists to help developers quickly build voice, SMS, and IM applications using the languages and tools they already know. And we’re now making it even easier.

Starting today, in addition to the hosted Tropo we already offer, you can host your applications on any web server and communicate with the new Tropo Web API over web services. Build your app in any programming language, with any tools and libraries you want, and connect it to your existing databases and business logic.

Here’s how it works.

We give you a phone number, and you give us a URL somewhere on the web where your app is located. When that phone number is called, our servers answer the call and make an HTTP POST to your URL, sending you a wealth of JSON metadata about the session. Caller ID, a session ID, that sort of thing. You return some JSON telling us what to do with the call. You’ve got complete control of the call. Say something with text to speech, play an audio file, start recording, ask a question and get the answer with voice recognition, even put the caller into a conference with other callers.
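The request and response cycle described above could be handled along these lines in Ruby. This is only a sketch: `handle_session` is a hypothetical helper, and the session field names (`session`, `from`, `id`) and the shape of the `say` action are assumptions made for illustration, so check Tropo Docs for the exact payload.

```ruby
require 'json'

# Hypothetical handler: takes the JSON body POSTed to your URL and returns
# the JSON body to send back. Field names are assumed, not confirmed here.
def handle_session(request_body)
  payload = JSON.parse(request_body)
  session = payload.fetch('session', {})
  caller_id = session.dig('from', 'id') || 'unknown caller'

  # Assumed action shape: a "say" entry inside the top-level "tropo" array.
  response = {
    'tropo' => [
      { 'say' => [{ 'value' => "Hello, #{caller_id}. Thanks for calling." }] }
    ]
  }
  JSON.generate(response)
end

# Example with a made-up session payload
sample = JSON.generate('session' => { 'id' => 'abc123', 'from' => { 'id' => '4155551212' } })
puts handle_session(sample)
```

Keeping the handler a plain function from request body to response body means it can sit behind any web framework you like.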

Conference? Yeah, that’s new, too. Build a conference call app in just a few lines of code. Create as many rooms as you want, put users into rooms, take them out of rooms. Mute them. Again, you’re in full control.

How easy is it to create a conference call? Here’s a conference call app:

{ "tropo":
  [{
    "conference": { "id": "myconferencecall" }
  }]
}

Save this code as conference.json on your server and point a Tropo app at it. Have all your friends call it and they can chat together. You can have more control if you want, of course. Ask for PINs. Put callers in rooms based on their caller ID. Hold a seminar by muting callers so they can only listen. Whatever you can dream up.

Don’t like JSON? We’ve got a Tropo Ruby library (available on Gemcutter) that will create it all for you. Write your app in Ruby that looks and feels the way Ruby code should. Stick it on your server. We’ll do all the hard stuff. Libraries for other languages are coming. And if you write one for your favorite language, let us know and we’ll link to it.
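As a rough sketch of choosing a room by caller ID, the Ruby below generates the same document as conference.json with a different room per caller. The `ROOMS` table, the phone numbers, the "lobby" fallback, and the `conference_response` helper are all hypothetical; only the `tropo` and `conference` structure comes from the example above.

```ruby
require 'json'

# Hypothetical mapping from caller ID to conference room; numbers are made up.
ROOMS = {
  '4155551212' => 'sales',
  '4155553434' => 'engineering'
}.freeze

# Builds the same document as conference.json, with the room chosen per caller.
# Unknown callers fall back to a shared "lobby" room.
def conference_response(caller_id)
  room = ROOMS.fetch(caller_id, 'lobby')
  JSON.generate('tropo' => [{ 'conference' => { 'id' => room } }])
end

puts conference_response('4155551212')
# prints {"tropo":[{"conference":{"id":"sales"}}]}
puts conference_response('0000000000')
```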

It’s not just about voice.

With Tropo, the exact same code that makes and receives voice calls can send and receive instant messages and SMS. We’ve improved our IM and text message support. The conversation flow is even more natural now. You’ll get the first message just like any other, without having to jump through hoops to start a conversation. Want to know who you’re talking to? We provide you with the IM name or SMS number that contacted you. You can easily distinguish between voice sessions and text sessions, and even tell whether the user contacted you via AIM, GTalk, Yahoo or another IM network.

With today’s release, we’re continuing Tropo’s mission of making it easy for any developer to add real-time communications to any application. You can get the full documentation on building apps from Tropo Docs. Go put an app up and try it out. We’d love to hear about the great stuff you’re creating. Leave a comment here, email support@tropo.com or catch us on Twitter as @tropo.
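To make the voice versus text distinction concrete, here is a small Ruby sketch that summarizes who contacted the app and how. The `channel` and `network` keys, their values, and the `describe_contact` helper are assumptions for illustration and are not confirmed by this post; the real field names are in Tropo Docs.

```ruby
require 'json'

# Hypothetical helper: summarizes the sender of a session.
# The "channel" and "network" keys are assumed names, not confirmed here.
def describe_contact(request_body)
  from = JSON.parse(request_body).dig('session', 'from') || {}
  channel = (from['channel'] || 'UNKNOWN').to_s.upcase
  network = (from['network'] || 'UNKNOWN').to_s.upcase

  kind = case channel
         when 'VOICE' then 'voice session'
         when 'TEXT'  then 'text session'
         else 'unknown session'
         end

  "#{kind} from #{from['id']} via #{network}"
end

# Example with a made-up payload
sample = JSON.generate(
  'session' => { 'from' => { 'id' => 'somebody', 'channel' => 'TEXT', 'network' => 'AIM' } }
)
puts describe_contact(sample)
```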

AwayFind profiled

Posted on January 6, 2010 by Adam Kalsey

Tropo customer AwayFind has a great writeup on GigaOm’s Web Worker Daily. The WWD post was also picked up by the New York Times, Salon, and others.

AwayFind frees you from checking your email a billion times each day while making sure you still get critical messages. It automatically categorizes and filters your mail, then uses other communication channels to notify you of what’s important.

Depending on whether an email falls into the specific categories you choose, you can have a message sent to you by text message, IM or Twitter DM. You can even automatically direct messages to someone else if need be — if, for instance, a client needs help, you can have the message forwarded to technical support without needing to even see the email at all. And if you’re not near your computer, you can even arrange for AwayFind Orchant to call you with a particularly urgent message and read it off to you.

Nice work by Jared and the rest of AwayFind.

Advanced grammar topics for Tropo

Posted on December 17, 2009 by Adam Kalsey

This is a guest post from Dominique Boucher, Product Manager at Nu Echo.


Tropo recently added support for SRGS and JSGF grammars. Mike Thompson and Jason Goecke wrote about that a few weeks ago. In this post, I will go one step further and show some more advanced tools in the voice recognition ecosystem.

Authoring grammars

Once you’ve decided to add SRGS grammars to your Tropo application, one question arises: which format will you use to author them? XML or ABNF?

XML has the advantage of being supported by most major recognition engines on the market. It is often the native format for the engine. On the other hand, ABNF is much more compact and readable than the XML equivalent. For example, here is the same yes/no grammar expressed in both ABNF:

[Image: yesno-abnf]

and XML:

[Image: yesno-xml]

I don’t know about you, but I much prefer the former.

At SpeechTEK’09, in August, Voxeo announced a partnership with Nu Echo to bundle NuGram IDE Basic Edition with the VoiceObjects Developer Edition. NuGram IDE is a complete environment for developing, debugging and testing voice recognition grammars in the ABNF syntax. (The basic edition, which is free of charge, can also be installed separately, directly from Eclipse.) With NuGram IDE, you write and test ABNF grammars on your desktop, without requiring a voice recognition engine. Once you are satisfied with your grammars, you integrate them into your application. And if you prefer to use their XML counterparts, just let NuGram IDE do the grunt conversion work for you.
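For readers without the images, here is an illustrative yes/no grammar in both SRGS forms, held in Ruby strings so the two can be compared side by side. These are not the original listings; the rule name `yesno` and the `en-US` language are assumptions chosen for the example.

```ruby
# Illustrative yes/no grammars in both SRGS forms; not the original listings.
ABNF_GRAMMAR = <<~'ABNF'
  #ABNF 1.0 UTF-8;
  language en-US;
  mode voice;
  root $yesno;

  public $yesno = yes | no;
ABNF

XML_GRAMMAR = <<~'XML'
  <?xml version="1.0" encoding="UTF-8"?>
  <grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
           xml:lang="en-US" mode="voice" root="yesno">
    <rule id="yesno" scope="public">
      <one-of>
        <item>yes</item>
        <item>no</item>
      </one-of>
    </rule>
  </grammar>
XML

puts "ABNF: #{ABNF_GRAMMAR.length} characters"
puts "XML:  #{XML_GRAMMAR.length} characters"
```

Both strings describe the same two-word language; the difference is purely in how much markup surrounds it.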

Dynamic grammars

Most grammars used by voice applications are static grammars. By this, I mean that they do not change over time, nor do they depend on contextual, call-specific data.

Sometimes, however, the application needs to generate grammars on the fly. We call these dynamic grammars. Consider a voice-dialing application. The application identifies the caller, looks up the caller’s contact list, and asks for the name of one of the contacts. The grammar to use in that last step depends on the contents of the caller’s profile.

But how are dynamic grammars served to the application? Well, usually a dedicated web application will be responsible for that. This can be a JSP or ASP page, a Ruby on Rails app, etc. In all these cases, a web application must be developed and made accessible to the Tropo runtime.
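As a minimal sketch of what such a web application has to produce, the Ruby below builds an SRGS XML grammar from a contact list. The helper names (`xml_escape`, `contacts_grammar`), the rule name `contact`, and the choice to return the extension through a semantic tag are assumptions for illustration. Contacts are assumed to be hashes with `firstname`, `lastname` and `extension` keys.

```ruby
# Escapes the characters that are special in XML text.
def xml_escape(text)
  text.to_s.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;').gsub('"', '&quot;')
end

# Builds an SRGS XML grammar with one alternative per contact. Each item
# returns the contact's extension as its semantic result (SISR tag syntax).
def contacts_grammar(contacts)
  items = contacts.map do |c|
    name = xml_escape("#{c['firstname']} #{c['lastname']}")
    ext = xml_escape(c['extension'])
    %(      <item>#{name}<tag>out = "#{ext}";</tag></item>)
  end

  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
             xml:lang="en-US" mode="voice" root="contact"
             tag-format="semantics/1.0">
      <rule id="contact" scope="public">
        <one-of>
    #{items.join("\n")}
        </one-of>
      </rule>
    </grammar>
  XML
end

# Example with made-up contacts
puts contacts_grammar([
  { 'firstname' => 'Ann', 'lastname' => 'Lee', 'extension' => '101' },
  { 'firstname' => 'Raj', 'lastname' => 'Patel', 'extension' => '102' }
])
```

A real deployment would serve this string over HTTP with an appropriate content type and regenerate it per caller.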

Another solution is NuGram Hosted Server (or NHS). It’s a free hosted platform specifically designed to serve dynamic grammars to cloud-based communication applications, so it nicely complements Tropo. Once you have registered, all you have to do is create your dynamic grammar templates and push them to your NHS account from within NuGram IDE (publishing grammars takes a single keystroke, Alt-Ctrl-Shift P).

The dynamic grammar templates are expressed using a few extensions to the ABNF syntax. For example, the template for the voice-dialing grammar would look like this:

[Image: voicedialer-abnf]

A tutorial describing the various templating directives is available on the NuGram website.

The client API

Generating a dynamic grammar from a template (instantiating a grammar) involves sending some data (the instantiation context) to NHS using an HTTP-based API. Fortunately, higher-level client APIs are available on Github in a variety of programming languages. (All you have to do is include the code of the API at the top of your Tropo application.)

To illustrate, here is a prototype voice-dialing application that instantiates the grammar template above and uses the URL of the generated grammar in XML form:

[Image: voicedialer-rb]

Lines 10-11 simply create a new session with NuGram Hosted Server. Line 19 retrieves the contacts for the current caller. Finally, line 26 instantiates the grammar with the contacts template and retrieves the URL of the generated grammar in XML form.

The get_contacts function simply returns a list of hashes of the form {'firstname' => "first name", 'lastname' => "last name", 'extension' => "phone number"}, one for each of the caller’s contacts. In our demo, the function makes a request to a web application that formats the contacts as a JSON string, and converts the data to a plain Ruby data structure:

[Image: getcontacts-rb]

Of course, in a real application, the data could be fetched from a Web service, a database, etc.
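The conversion step, turning a JSON string into the list of hashes described above, might look roughly like this. It is a sketch rather than the demo’s actual code: `parse_contacts` is a hypothetical name, and the decision to drop entries without an extension is an assumption.

```ruby
require 'json'

# Converts a JSON array of contacts into plain Ruby hashes with the three
# keys described above, dropping entries that have no extension.
def parse_contacts(json_string)
  contacts = JSON.parse(json_string).map do |c|
    {
      'firstname' => c['firstname'].to_s,
      'lastname'  => c['lastname'].to_s,
      'extension' => c['extension'].to_s
    }
  end
  contacts.reject { |c| c['extension'].empty? }
end

# Example with made-up data; the second entry is dropped.
sample = '[{"firstname":"Ann","lastname":"Lee","extension":"101"},' \
         '{"firstname":"No","lastname":"Phone"}]'
p parse_contacts(sample)
```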

That’s it! The code of the whole application is also available on Github. The Ruby application called by the get_contacts function is a web application based on Sinatra and is readily deployable on Heroku.

If you have any questions or feedback, please leave a comment.