The telephone, which interrupts the most serious conversations and cuts short the most weighty observations, has a romance of its own. – Virginia Woolf
I have been writing communications software since the 1980s. I began working with proprietary protocols such as Northern Telecom’s Meridian Link and Applications Module Link before moving to “standards” such as Microsoft’s TAPI (Telephony Application Programming Interface) and Novell’s TSAPI (Telephony Server Application Programming Interface). Each toolkit had its own unique set of advantages, but they all shared the same problem. My software required direct access to a physical telephone system. Depending on what I was trying to accomplish, I had to either have a physical device I plugged into the back of a telephone (e.g. a Nortel MPDA card) or a wired Ethernet connection into a port on the PBX. Since this was all pre-Internet, I had to leave home and drive to work in order to get anything done.
These days, I work from home and just about everything I need in order to do my job is accessible from a myriad of cloud services. In fact, I am having a hard time remembering the last time I drove to the office to do more than grab a free lunch on employee appreciation day.
The cloud services I regularly lean on include artificial intelligence and bot building tools, Internet of Things platforms, SMS Text gateways, cognitive speech utilities, NoSQL databases, and IT Service Management platforms. With the rise of cloud communications, I have also been able to untether myself from the burden of physical ownership when it comes to my telephony needs. Gone is the need to configure and maintain gateways, trunks, stations, and voice processing servers. Now, everything I need to write communications software is on the other side of an IP address. And not only are these communications workings easy to access, I no longer have to worry about scaling up or down. If I want a new telephone number or voice service, I simply click a couple of boxes and voila, they magically appear. When I am done with them, another click sends them back into the virtual pool they sprang from.
I have been playing around with Avaya OneCloud CPaaS (formerly Zang) since it was launched a few years ago, but for the most part I stuck with the “easy” stuff. In other words, I built applications using Zang Workflow. Zang Workflow is very similar to Engagement Designer, Avaya’s on-premises Breeze development tool. Both allow developers to drag communications widgets onto a browser-based programming canvas, configure the widgets, connect them together, and deploy a running application. This makes writing cool applications pretty darn easy, but it comes at a cost. While these drag-and-drop tools can create some pretty sophisticated applications, developers are constrained by the Breeze/Zang development sandbox, and there are times when you want access to capabilities that the sandbox does not expose.
To immerse myself in the platform, I decided to give Avaya OneCloud CPaaS (which I will simply call CPaaS from now on) another look and dig into the other ways to develop cloud-based communications programs. Specifically, I wanted to learn the ins and outs of what Avaya calls inboundXML.
Avaya OneCloud CPaaS inboundXML
It all begins with buying a telephone number. CPaaS makes acquiring a telephone number as easy as One-Click on Amazon. For as little as $1 a month ($2 for toll-free), you can purchase a telephone number that fully supports inbound and outbound voice, SMS text, and MMS text.
After buying a number, you configure how you want it to deal with voice and text. This is done by associating different URLs with each request type. For example, if you are creating an inbound voice application, you would configure a URL that points to a web application written to deal with incoming telephone calls. If your application needs to handle incoming SMS texts, you would configure a similar URL for text processing.
Application Development
The principle behind inboundXML application building is pretty straightforward. An application is initially invoked via the configured URL. Invocation is by an HTTP POST or GET. In either case, the request parameters contain information regarding the call or text. For a voice call these parameters include:
- CallerName
- To
- From
- CallStatus
- Direction
- ForwardedFrom
An invocation to an SMS application URL would contain similar information along with the SMS message body.
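As a rough sketch of what the receiving side might do with those parameters, the Python below pulls the six voice fields out of either a GET query string or a POST body. The field names come from the list above. Everything else is an assumption made for illustration: the function name `extract_call_info`, the idea that a POST carries the same fields form-encoded in its body, and the sample values.

```python
from urllib.parse import parse_qs

# Parameter names listed above for an inbound voice call.
VOICE_FIELDS = ("CallerName", "To", "From", "CallStatus", "Direction", "ForwardedFrom")


def extract_call_info(method, query_string="", body=""):
    """Return the voice-call fields from a GET query string or a POST form body."""
    # Assumption: a POST carries the same fields, form-encoded, in the request body.
    raw = body if method.upper() == "POST" else query_string
    # keep_blank_values=True keeps fields that were sent with an empty value.
    parsed = parse_qs(raw, keep_blank_values=True)
    # parse_qs returns a list per key; take the first value, or None when absent.
    return {name: parsed.get(name, [None])[0] for name in VOICE_FIELDS}


# Hypothetical sample request; the numbers and status values are made up.
sample = (
    "To=%2B19285550100&From=%2B16125550123"
    "&CallStatus=ringing&Direction=inbound&CallerName=Jane+Doe"
)
info = extract_call_info("GET", query_string=sample)
print(info["From"], info["CallerName"])
```

With the caller's number in hand, the application can go on to the database lookup or availability check it needs before deciding how to answer.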
The application is free to do whatever it wants. This might include a database lookup of the caller or a status check on the availability of the destination.
Communication from the application back to CPaaS is accomplished with an XML-encoded response. Each tag in the XML directs CPaaS to do something. For example, if the application wants CPaaS to play a message to the caller, it would return a <Say> tag:
<Say voice="woman">Welcome to Andrew’s groovy call processing application.</Say>
If the application wants to collect six DTMF digits after the message is played, it would return a <Gather> tag:
<Gather method="POST" numDigits="6" action="Gather completed URL"/>
The action parameter is a URL that is invoked after the digits have been collected. Think of it as a callback. Within the callback (and subsequent callbacks), you can return XML to move the call to the next stage.
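Here is a minimal sketch of what such a callback might look like in Python. It assumes the collected digits arrive in a request parameter named `Digits`; that name, the function name `handle_gather`, the spoken wording, and the example URL are illustrative assumptions, so check the CPaaS documentation for the real parameter names. The reply is wrapped in a `<Response>` root element.

```python
import re
import xml.etree.ElementTree as ET


def handle_gather(params, retry_url):
    """Build the XML reply for the callback invoked after digit collection."""
    # Assumption: the collected digits arrive in a parameter named "Digits".
    digits = params.get("Digits", "")
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", voice="woman")
    if re.fullmatch(r"[0-9]{6}", digits):
        # Six digits received: move the call to its next stage.
        say.text = "Thank you. Looking up account " + " ".join(digits) + "."
    else:
        # Wrong input: explain, then ask for the digits again.
        say.text = "Sorry, I need six digits. Please try again."
        ET.SubElement(response, "Gather", method="POST", numDigits="6", action=retry_url)
    return ET.tostring(response, encoding="unicode")


# Hypothetical callback URL used only for this example.
print(handle_gather({"Digits": "12"}, "https://example.com/gather"))
```

The retry branch simply points the new Gather back at the same callback, so the caller keeps getting prompted until six digits come through.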
Multiple tags can be returned in a single XML message. This might look like:
<Response>
<Say voice="woman">Welcome to Andrew’s groovy call processing application</Say>
<Gather method="POST" numDigits="6" action="Gather completed URL"/>
</Response>
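Rather than pasting the XML together by hand, an application can build that same document with an XML library. The sketch below uses Python's standard `xml.etree.ElementTree`. The tag and attribute names are the ones shown above, while the function name `build_welcome` and the example callback URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_welcome(gather_url):
    """Return a Response document that greets the caller, then collects six digits."""
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", voice="woman")
    say.text = "Welcome to Andrew's groovy call processing application"
    # The action URL is invoked once the digits have been collected.
    ET.SubElement(response, "Gather", method="POST", numDigits="6", action=gather_url)
    return ET.tostring(response, encoding="unicode")


# Hypothetical callback URL; note the ampersand in its query string.
print(build_welcome("https://example.com/gather?step=1&lang=en"))
```

One reason to prefer a library over string formatting is escaping: a callback URL with more than one query parameter contains an ampersand, which has to appear as `&amp;` inside an XML attribute, and ElementTree takes care of that during serialization.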
There are quite a few tags to choose from and each one can be a factor in how an application processes an incoming call. There are tags for:
- Call Recording
- Call Transcription
- Making and Redirecting Calls
- Releasing a Call
- Playing Tones and Recorded Files
- Sending SMS and MMS Messages
Many of the tags contain the ability to add additional callback URLs. For instance, you can designate a URL that is invoked when Call Recording has completed. The same can be done for Call Transcription.
Making Me Happy
Because I learn best by doing rather than by simply reading documents, I spent some time putting together a CPaaS inboundXML application. Since I am more than just a dial-tone guy, I added vehicle tracking IoT functionality from Cloudhawk and artificial intelligence/natural language processing from IBM Watson. While voice processing is still exciting to me, relevance comes from integration with digital transformation resources.
You can play with my application by doing the following. Note: I will do my best to keep it up and running, but it may be down for short periods of time. I am constantly tinkering with my software. If you receive a failure message, try again at a different time.
- From a cell phone, call 928-496-0124.
- You will be asked for a six-digit account number. Eventually, this number will associate the caller with a specific pool of IoT sensors. For now, you can enter any six digits to access the tracker and telemetry sensors in my home office.
- You are then asked, “How can I help you?” The bot handles the following spoken commands (and variants of each one since I am using IBM Watson natural language processing to understand the requests). There may be a slight processing delay after asking your question. Please be patient.
  - Where is my truck?
  - How fast is my truck moving?
  - Where is my truck headed?
  - What is the battery level of my tracker?
  - What is the temperature?
  - What is the humidity?
  - Help
  - Tell me a joke.
  - Goodbye
- Upon hearing “Goodbye,” the telephony bot (i.e., the inboundXML application) releases the call and sends an SMS text to your cell phone. The text contains a link to the recorded audio of the call.
I don’t know about you, but I think that’s a pretty cool demonstration of the power of flexible voice integrated with high-powered cloud services. This is neither Meridian Link nor your grandfather’s PBX.
Mischief Managed
I have only scratched the surface of what you can do with Avaya OneCloud CPaaS, but an IoT-powered voice bot is a pretty good start. In future iterations, I plan on adding the ability to dynamically open trouble tickets and launch ServiceNow workflows.
If you are a voice guy like me, it’s never too late to spread your wings and think beyond on-premises dial-tone. Avaya OneCloud CPaaS allows you to take communications applications well past what your old Call Manager, Communication Manager, or CS1000 ever could.
I like the demo. It did take a second to process, but the fact that it works at all is amazing to me. I didn’t get the SMS message on my iPhone or Android, though. I don’t know how long it should take, but I gave it about 10 minutes on each.
Thanks for posting the demo.
That’s odd. I will take a look at what might be happening with the SMS message. I haven’t had any problems with that in the past.
There appears to be a call recording issue in CPaaS. I have raised the issue with them and I am hopeful that a resolution will come quickly.
Got the SMS on both devices this afternoon.
Perfect. I started getting text messages, too. Whatever was stuck in CPaaS became unstuck.
I am new to Avaya Cloud CPaaS. Is there a repo on GitHub where we can gain access to the demo you speak about?
This is a great place to learn. I put together several videos with programming examples. https://www.youtube.com/playlist?list=PLVAvmhXSk-dpZ3DrMNTigx1dDgGuXIZF_