
Speech-Driven Phone Applications in the Cloud

Posted on January 5, 2011 by Adam Kalsey

Last week, I wrote a post about using Tropo with SRGS grammars to create cloud-based speech recognition applications.

Speech in the Cloud

In my post, I detailed one approach to capturing a caller’s address using street information contained in a MySQL database. Over on the Nu Echo blog, Dominique Boucher demonstrates a variation on this approach that uses Tropo, the NuGram platform and CouchDB.

Dominique’s approach is more fully cloud-based than my own because of some components he chose to use:

  • NuGram Hosted Server – this lets an application developer serve grammars dynamically to a voice recognition platform (like Tropo) directly from externally hosted servers.
  • CouchDB – instead of using a relational database (as I did in my example), he chose CouchDB. Although his example uses a local instance of CouchDB for demonstration purposes, the approach could easily be adapted to use a hosted version of Couch from CouchOne, Cloudant or any other cloud-based CouchDB provider.

As a result, Dominique’s approach makes it possible to build a completely cloud-based speech recognition app that captures a caller’s address.
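To make the data flow a little more concrete, here is a rough sketch in Python of how street names pulled from a CouchDB view might be turned into a simple SRGS XML grammar. This is not Dominique's code and it does not use the NuGram API. The function names, the rule id `street`, and the assumption that each view row carries the street name in its `key` field are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SRGS 1.0 specification.
SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def street_names_from_view(view_result):
    """Return unique, sorted street names from a CouchDB-style view result.

    Assumes (hypothetically) that each row stores the street name in "key".
    """
    rows = view_result.get("rows", [])
    names = {row["key"].strip() for row in rows if row.get("key")}
    return sorted(names)


def build_srgs_grammar(street_names, rule_id="street", lang="en-US"):
    """Build a minimal SRGS XML grammar with one <item> per street name."""
    # Serialize SRGS elements without a namespace prefix.
    ET.register_namespace("", SRGS_NS)
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {
            "version": "1.0",
            "root": rule_id,
            "mode": "voice",
            f"{{{XML_NS}}}lang": lang,
        },
    )
    rule = ET.SubElement(
        grammar, f"{{{SRGS_NS}}}rule", {"id": rule_id, "scope": "public"}
    )
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for name in street_names:
        item = ET.SubElement(one_of, f"{{{SRGS_NS}}}item")
        item.text = name  # ElementTree escapes characters such as "&"
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    # Sample data shaped like the JSON a CouchDB view query returns.
    sample_view = {
        "rows": [
            {"id": "a1", "key": "Main Street", "value": None},
            {"id": "a2", "key": "Elm Avenue", "value": None},
            {"id": "a3", "key": "Main Street", "value": None},
        ]
    }
    print(build_srgs_grammar(street_names_from_view(sample_view)))
```

In a real deployment the grammar would be served over HTTP so the recognition platform can fetch it, and a large street list would likely call for something smarter than a flat list of alternatives.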

And while it is possible to host a MySQL instance in the cloud, choosing CouchDB as the data store for an address capture application might have some more practical benefits as well.

Lots of municipal governments store geographic information in ESRI Shapefile format. So if you were looking for a data set on which to base your streets grammar, it’s entirely possible that you would run into a file (or files) of this type.

Converting shapefiles to CouchDB is pretty straightforward when you use tools like shp2geocouch – something I demonstrated in a previous post.

Dominique’s post further underscores the power of the Tropo platform, and the ease with which it can be integrated with powerful components like NuGram Hosted Server and CouchDB to build speech-driven applications.
