Advanced grammar topics for Tropo
December 17th, 2009 by Dominique Boucher
This is a guest post from Dominique Boucher, Product Manager at Nu Echo.
Recently, Tropo has added support for SRGS grammars and JSGF grammars. Mike Thompson and Jason Goecke wrote about that a few weeks ago. In this post, I will go one step further and show some more advanced tools in the voice recognition ecosystem.
Authoring grammars
Once you’ve decided to add SRGS grammars to your Tropo application, one question arises: which format will you use to author them? XML or ABNF?
XML has the advantage of being supported by most major recognition engines on the market. It is often the native format for the engine. On the other hand, ABNF is much more compact and readable than the XML equivalent. For example, the same yes/no grammar can be expressed in both ABNF and XML.
I don’t know about you, but I much prefer the former.
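To make the comparison concrete, here is a minimal sketch of what such a yes/no grammar might look like in each format. Both grammars are my own illustration written against the SRGS specification, and the constant names are arbitrary. They are held in Ruby strings only so that their sizes can be compared side by side.

```ruby
# A yes/no grammar in the ABNF form of SRGS (illustrative).
YES_NO_ABNF = <<~'ABNF'
  #ABNF 1.0 UTF-8;
  language en-US;
  mode voice;
  root $yesno;

  public $yesno = yes | no;
ABNF

# The same grammar in the XML form of SRGS (illustrative).
YES_NO_XML = <<~'XML'
  <?xml version="1.0" encoding="UTF-8"?>
  <grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
           xml:lang="en-US" mode="voice" root="yesno">
    <rule id="yesno" scope="public">
      <one-of>
        <item>yes</item>
        <item>no</item>
      </one-of>
    </rule>
  </grammar>
XML

# Print the size of each version to show how much more compact ABNF is.
puts "ABNF: #{YES_NO_ABNF.lines.count} lines, #{YES_NO_ABNF.length} characters"
puts "XML:  #{YES_NO_XML.lines.count} lines, #{YES_NO_XML.length} characters"
```

Even for a two-word grammar, the XML version is noticeably longer, and the gap tends to widen as rules get more complex.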
At SpeechTEK’09, in August, Voxeo announced a partnership with Nu Echo to bundle NuGram IDE Basic Edition with the VoiceObjects Developer Edition. NuGram IDE is a complete environment for developing, debugging and testing voice recognition grammars in the ABNF syntax. (The basic edition, which is free of charge, can also be installed separately, directly from Eclipse.) With NuGram IDE, you write and test ABNF grammars on your desktop, without requiring a voice recognition engine. Once you are satisfied with your grammars, you integrate them in your application. And if you prefer to use their XML counterpart, just let NuGram IDE do the grunt conversion work for you.
Dynamic grammars
Most grammars used by voice applications are static grammars. By this, I mean that they do not change over time, nor do they depend on contextual, call-specific data.
Sometimes, however, the application needs to generate grammars on the fly. We call them dynamic grammars. Consider a voice-dialing application. The application identifies the caller, looks in the caller’s contact list, and asks for the name of one of their contacts. The grammar to use in the last step depends on the content of the caller’s profile.
But how are dynamic grammars served to the application? Well, usually a dedicated web application will be responsible for that. This can be a JSP or ASP page, a Ruby on Rails app, etc. In all these cases, a web application must be developed and made accessible to the Tropo runtime.
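Whatever framework serves the grammar, the core of such a web application is a function that turns call-specific data into grammar text. Here is a minimal sketch for the voice-dialing case, under several assumptions of mine. The function names `abnf_words` and `contacts_grammar` are hypothetical, and the name sanitization is deliberately crude: it only keeps ASCII letters and digits. The semantic tags assume an engine that supports the `semantics/1.0` tag format.

```ruby
# Reduce free text to lowercase ASCII words that are safe as ABNF tokens.
# This is a crude simplification: accented letters and apostrophes are dropped.
def abnf_words(text)
  text.to_s.downcase.gsub(/[^a-z0-9 ]/, ' ').split.join(' ')
end

# Build an ABNF grammar with one alternative per contact.
# Each contact is a hash with 'firstname', 'lastname' and 'extension' keys.
def contacts_grammar(contacts)
  alternatives = contacts.map do |contact|
    name = abnf_words("#{contact['firstname']} #{contact['lastname']}")
    extension = contact['extension'].to_s.gsub(/\D/, '')
    "  (#{name}) {out = \"#{extension}\";}"
  end
  # With no contacts, fall back to the special rule $VOID, which never matches.
  alternatives = ['  $VOID'] if alternatives.empty?

  <<~ABNF
    #ABNF 1.0 UTF-8;
    language en-US;
    mode voice;
    tag-format <semantics/1.0>;
    root $contact;

    public $contact =
    #{alternatives.join(" |\n")};
  ABNF
end

# Usage example with made-up contacts.
puts contacts_grammar([
  { 'firstname' => 'John', 'lastname' => 'Smith', 'extension' => '101' },
  { 'firstname' => 'Mary', 'lastname' => 'Jones', 'extension' => '102' }
])
```

A real service would wrap this function in an HTTP endpoint and return the text with the content type the recognition engine expects.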
Another solution is NuGram Hosted Server (or NHS). It’s a free hosted platform specifically designed to serve dynamic grammars to cloud-based communication applications, so it nicely complements Tropo. Once you have registered, all you have to do is create your dynamic grammar templates and push them to your NHS account, all from within NuGram IDE (publishing grammars takes a single keystroke, Alt-Ctrl-Shift P).
The dynamic grammar templates are expressed using a few extensions to the ABNF syntax, and the voice-dialing grammar is written as one such template.
A tutorial describing the various templating directives is available on the NuGram website.
The client API
Generating a dynamic grammar from a template (instantiating a grammar) involves sending some data (the instantiation context) to NHS using an HTTP-based API. Fortunately, higher-level client APIs are available on Github in a variety of programming languages. (All you have to do is include the code of the API at the top of your Tropo application.)
To illustrate, the prototype voice-dialing application instantiates the grammar template above and uses the URL of the generated grammar in XML form.
In that application’s listing, lines 10-11 simply create a new session with NuGram Hosted Server. Line 19 retrieves the contacts for the current caller. Finally, line 26 instantiates the grammar with the contacts template and retrieves the URL of the generated grammar in XML form.
The get_contacts function simply returns a list of hashes of the form {'firstname' => "first name", 'lastname' => "last name", 'extension' => "phone number"}, one for each of the caller’s contacts. In our demo, the function makes a request to a web application that formats the contacts as a JSON string, and converts the data to a plain Ruby data structure.
Of course, in a real application, the data could be fetched from a Web service, a database, etc.
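A function like get_contacts could be sketched as follows. This is not the demo’s actual code: the URL is a placeholder, the `parse_contacts` helper is hypothetical, and I am assuming the web application returns a JSON array of objects with `firstname`, `lastname` and `extension` fields. Only Ruby’s standard `json`, `net/http` and `uri` libraries are used.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Placeholder endpoint (an assumption); %s is replaced by the caller ID.
CONTACTS_URL = 'http://contacts.example.com/contacts/%s.json'

# Convert the JSON payload into a list of plain hashes with string values.
def parse_contacts(json)
  JSON.parse(json).map do |entry|
    {
      'firstname' => entry['firstname'].to_s,
      'lastname'  => entry['lastname'].to_s,
      'extension' => entry['extension'].to_s
    }
  end
end

# Fetch the contacts of the given caller; return an empty list on HTTP errors.
def get_contacts(caller_id)
  uri = URI(format(CONTACTS_URL, URI.encode_www_form_component(caller_id.to_s)))
  response = Net::HTTP.get_response(uri)
  return [] unless response.is_a?(Net::HTTPSuccess)

  parse_contacts(response.body)
end
```

Keeping the parsing separate from the HTTP request makes it easy to swap the data source for a database or another Web service without touching the rest of the application.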
That’s it! The code of the whole application is also available on Github. The Ruby application called by the get_contacts function is a web application based on Sinatra and is readily deployable on Heroku.
If you have any questions or comments, please leave a comment.