In recent weeks, Jason Goecke posted about using Tropo’s Simple Grammar Engine to add basic voice recognition to your applications. In today’s post, I will show you how to take that a step further and implement industry-standard grammars and interpretation mechanisms. These grammar types let Tropo use the same advanced level of speech recognition you might expect in VoiceXML applications today.
Before we get started with the examples, here is a list of the types of grammars (and return styles) which will be available to you:
GSL is not a W3C-compliant grammar syntax, and Nuance has discontinued support for GSL grammars in its most recent product offerings. Tropo will continue to support GSL-specific markup for some time to come, but we strongly suggest that new applications and their grammars use the SRGS + SISR syntaxes rather than rely on the deprecated GSL format.
With that said, the example in this post has Tropo using an SRGS grammar with SISR returns. This is 100% W3C compliant and is the industry standard for grammar development. Let’s start with our grammar:
Those of you familiar with grammars will likely recognize this structure. If not, a great place to get started is here. The above grammar accepts the following utterances:
Red Sox, Boston Red Sox, Yankees, or New York Yankees
Based on the team you choose, you will get some information back about the team. Specifically, the value you would like returned for the team, the league they play in, their division, and standing. The grammar is quite simple, and I made it this way to illustrate the concept of using external grammars with your Tropo applications. Feel free to go as crazy as you want with these grammars.
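To make the shape of those return values concrete, here is a minimal plain-Ruby model of what the grammar does. It is only a sketch: it is not the SRGS/SISR file itself and it uses no Tropo API. The names `SLOTS_BY_UTTERANCE` and `interpret` are hypothetical, the slot names follow the description above, and the standing value is a placeholder rather than real data.

```ruby
# Plain-Ruby model of the grammar's behavior (NOT the SRGS/SISR file and
# not a Tropo API): each accepted utterance maps to a set of named slots.
# The standing value is a placeholder, not real data.
RED_SOX = { team: "Red Sox", league: "American", division: "East",
            standing: "(placeholder)" }.freeze
YANKEES = { team: "Yankees", league: "American", division: "East",
            standing: "(placeholder)" }.freeze

# Short and long forms of a team name resolve to the same slots.
SLOTS_BY_UTTERANCE = {
  "red sox"          => RED_SOX,
  "boston red sox"   => RED_SOX,
  "yankees"          => YANKEES,
  "new york yankees" => YANKEES
}.freeze

# Return the slots for a recognized utterance, or nil for a no-match.
def interpret(utterance)
  SLOTS_BY_UTTERANCE[utterance.to_s.strip.downcase]
end

slots = interpret("Boston Red Sox")
puts "#{slots[:team]}: #{slots[:league]} League, #{slots[:division]} division"
puts interpret("Mets").inspect # an utterance outside the grammar
```

A real grammar does this mapping inside the recognizer; the point here is only that several utterances collapse to one set of named values.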
How does one tie this grammar into a Tropo application? It’s easy! Let’s take a look at a basic Ruby app:
Notice that when we declare our choices within “options”, we simply reference the remote location of the SRGS/GRXML grammar with SISR returns. As soon as the prompt starts, the caller should be able to say any of the above utterances. When the result comes back, you can get the slot values (team, division, standing, etc.) by accessing them directly:
That’s it! At this point, you should have the information needed to start developing your own Tropo applications with powerful voice recognition capability. If you have any questions at all, feel free to contact our free 24×7 Support team! We are more than happy to help you with any issues you may encounter!
We recently upgraded the Tropo platform, and outbound calls from the Tropo cloud now require a leading ‘+’. This means that when dialing a number, even in the US, you must now format it as ‘+14155551212’ when using the call or transfer method.
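If your application stores numbers in mixed formats, a small helper can normalize them before dialing. The sketch below is an illustration only: `to_plus_format` is a hypothetical name, not part of the Tropo API, and it assumes that a bare 10-digit number is a US number missing its country code.

```ruby
# Hypothetical helper (not part of the Tropo API): return the number with
# a leading '+' and only digits after it.
def to_plus_format(number, default_country_code: "1")
  raw = number.to_s.strip
  digits = raw.gsub(/\D/, "") # drop spaces, dashes, parentheses, '+'
  raise ArgumentError, "no digits in #{number.inspect}" if digits.empty?

  # Already in '+' form: keep the digits exactly as given.
  return "+#{digits}" if raw.start_with?("+")

  # Assumption: 10 bare digits means a US number without its country code.
  digits = default_country_code + digits if digits.length == 10
  "+#{digits}"
end

puts to_plus_format("415-555-1212")
puts to_plus_format("14155551212")
puts to_plus_format("+14155551212")
```

All three inputs come out in the same ‘+’ form, which can then be passed to the call or transfer method.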
While Tropo supports RESTful Web Services for moving data to and from the communication cloud, they may not always be fast enough for every application. Some apps require the lowest possible latency, for example when mobile devices become input devices. Because Tropo lets developers host scripts in our cloud, you can write applications that take direct advantage of persistent sockets. This means you can open the socket once and then stream data to your remote application in real time, without establishing a new HTTP connection each time.
I recently created an example of this: a Ruby server built on EventMachine serves a socket, and a script on Tropo opens a socket and sends touch-tones (DTMF) down it immediately as they come in. Here it is in action:
The code examples may be found here.
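For a feel of the open-once, stream-many pattern, here is a self-contained sketch that runs on a single machine. It is not the linked example: it swaps EventMachine for Ruby’s standard `socket` library, the digit list is made up, and both ends run in one process purely for illustration.

```ruby
require "socket"

# Stand-in for the remote application: accept one connection and collect
# each newline-delimited digit as it arrives.
server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
port = server.addr[1]
received = []

listener = Thread.new do
  client = server.accept
  while (line = client.gets)
    received << line.chomp
  end
  client.close
end

# Stand-in for the hosted script: open the socket ONCE, then write each
# digit as soon as it is available, with no per-digit connection setup.
socket = TCPSocket.new("127.0.0.1", port)
socket.sync = true # flush every write immediately
["4", "1", "5", "#"].each do |digit| # hypothetical DTMF input
  socket.puts(digit)
end
socket.close # closing ends the listener's read loop

listener.join
server.close
puts received.inspect
```

The connection cost is paid once, up front; every digit after that is a single small write on the already-open socket.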