What is a Chatbot?
A chatbot is a conversational interface that can be used to interact with a product or service. While the term is well known in tech circles, to the outside world, chatbots are still a bit of a novelty. Recently, they’ve become increasingly popular thanks to advancements in natural language processing technology that make them much better at what they’re built to do.
These developments have led to the creation of new tools such as the Google Cloud Natural Language API and Stanford CoreNLP. For example, advancements in voice recognition services such as the Google Cloud Speech API are what have made familiar voice interfaces like Amazon Alexa, Google Home, and Apple Siri work so well.
In this post, I’ll be sharing some guidelines for designing an effective chatbot application. We’ll talk about the building blocks of dialog and I’ll share some best practices for designing the chatbot experience. Finally, we’ll look at three chatbot platforms—API.AI, LUIS, and Amazon Lex—to compare them based on features, performance, and pricing.
What Is a Chatbot and Why Is It Important?
The great potential of chatbots lies in their ability to provide personalized and contextualized one-to-many communication using the most natural of interfaces: natural language.
One-to-many communication means that a single entity (a person or a business) can communicate easily with many people. Today, there are plenty of tools that make one-to-many communication possible, from email marketing to social media. But while it is easy to reach large audiences or even target particular groups, the hard part is communicating effectively at the individual level. Chatbots allow you to reach a large audience while keeping each individual conversation personalized and contextualized: every user receives a response tailored to what they asked.
Messaging is one of the most natural ways to communicate. Messaging apps like Facebook Messenger and WhatsApp are growing incredibly fast. Facebook Messenger, for example, is used by over 1 billion people every month and it is growing faster than Facebook itself.
With chatbots, you don’t have to worry about attracting users to a new messaging ecosystem. Instead, you can reach your users where they already are.
Let’s start by learning the basic terminology used in natural language processing (NLP) technology for describing the typical dialog structure employed by chatbots.
When we receive a message, the first thing we can do is predict the user’s intent. This means mapping the message into an action we understand that represents what the user wants to know, do, or achieve. The message, “Book a flight from London to Paris,” can be mapped to the intent “book flight.”
Every intent comes with a set of required parameters. For example, the intent “book flight” requires the parameters “origin,” “destination,” and “departure date.” Parameters are extracted from the message itself, and any that are missing are explicitly requested from the user.
Context is a way to give the system a short-term memory. If we are booking a flight, every message in which we define the booking details falls into the single bucket “book flight,” so a follow-up message that contains only a date is still interpreted as part of that booking.
The session is one conversation from beginning to end.
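To make these terms concrete, here is a minimal Python sketch of how a session might track an intent, its context, and its parameters. The `Session` class, the `REQUIRED_PARAMETERS` table, and the helper names are hypothetical and exist only for illustration; none of them come from the platforms discussed in this post.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical intent definition: the parameters each intent requires.
REQUIRED_PARAMETERS = {
    "book_flight": ["origin", "destination", "departure_date"],
}


@dataclass
class Session:
    """One conversation from beginning to end."""
    context: Optional[str] = None  # short-term memory: the intent in progress
    parameters: dict = field(default_factory=dict)


def missing_parameters(session):
    """Return the required parameters that have not been collected yet."""
    required = REQUIRED_PARAMETERS.get(session.context, [])
    return [name for name in required if name not in session.parameters]


def next_prompt(session):
    """Ask for the first missing parameter, or confirm when all are present."""
    missing = missing_parameters(session)
    if missing:
        return "Please provide: " + missing[0].replace("_", " ")
    return "Thanks, I have everything I need."


# "Book a flight from London to Paris" maps to the intent "book_flight"
# with two of its three parameters already extracted from the message.
session = Session(
    context="book_flight",
    parameters={"origin": "London", "destination": "Paris"},
)
print(next_prompt(session))
```

Because the departure date is missing from the message, the sketch asks for it explicitly, which is the behavior described above.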
Best Practices for Chatbot Design
In designing your chatbot, thinking ahead to user experience is essential. You need to simultaneously provide users with exactly what they want and a user experience that isn’t cold or robotic. You’ll also need to design the chatbot to handle some special use cases.
Tone of Voice
The tone of voice in your dialog defines the personality of your bot and therefore should reflect your brand. This can span from funny and very informal to extremely formal. If your brand already has a well-defined communication style, the choice of the tone of voice is easy. Otherwise, you can start from the text you already have such as taglines, slogans, and marketing copy. Just be careful to make sure your chatbot doesn’t sound like an advertisement! For inspiration, check out what members of your target audience have to say and how they communicate with one another in online communities in your product or service niche. Once you’ve chosen your tone of voice, it’s important to keep it consistent.
Tip. If you choose an informal style, emojis can be a very powerful communication tool.
Make Responses Easier
Buttons are a powerful way to express a choice, and because most messaging platforms support buttons, the majority of users will be familiar with them. Buttons are a great choice if you are asking a multiple-choice question or presenting several options to your users. You will also achieve the goal of keeping typing to a minimum, which makes it easier for the customer to respond and is less error prone.
Keep Conversations on Track
Conversations are truly limitless. Your chatbot will have a specific goal and will not be able to handle an infinity of possibilities. To keep the conversation on track, your chatbot should guide it into the flows that it can handle. A few tips for doing this:
- When starting a conversation, list the capabilities that the user can expect from the chatbot with sample messages.
- When asking questions, make sure that the question subject is clear and specific.
- Use buttons to answer questions when possible.
- Guide the conversations. Make your questions as specific as possible to guide the user.
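As a rough illustration of the first and third tips, the sketch below builds a welcome message that lists the bot's capabilities with sample messages and exposes each one as a button. The message format is a platform-neutral dictionary invented for this example; a real messaging platform would need it translated into its own button format.

```python
def build_welcome_message(capabilities):
    """Build a welcome message from (title, sample message) pairs.

    The returned dict is a hypothetical, platform-neutral structure,
    not the payload format of any specific messaging platform.
    """
    lines = ["Hi! I'm a bot. Here is what I can help you with:"]
    buttons = []
    for title, sample in capabilities:
        lines.append(f'- {title} (try: "{sample}")')
        # Tapping a button sends the sample message, so no typing is needed.
        buttons.append({"title": title, "payload": sample})
    return {"text": "\n".join(lines), "buttons": buttons}


welcome = build_welcome_message([
    ("Book a flight", "Book a flight from London to Paris"),
    ("Find a hotel", "I want to book a hotel in London"),
])
print(welcome["text"])
```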
No matter how much time you spend working on your chatbot or how much data you use to train it, mistakes will happen. This is normal, and sporadic errors are not a big problem if they are handled properly. My most important tip here is to always provide an error message and an explanation; never leave users with a blank space.
A few tips for writing error messages:
- Show that you are sorry about what happened. Funny fail/sorry GIFs, images, or emoticons can be effective here.
- Offer the user options for how to handle the situation, and reroute users to safe areas within your system. For example, you could offer to start a new conversation from scratch.
- If possible, transfer the user over to live support assistance.
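These error-handling tips can be sketched as a small fallback wrapper. The function names, the message dictionary, and the button labels are assumptions made for this example rather than part of any platform's API.

```python
def build_fallback(live_support_available):
    """Apologize, explain, and offer safe ways to continue."""
    options = ["Start over"]
    if live_support_available:
        options.append("Talk to a person")
    return {
        "text": "Sorry, I didn't understand that. What would you like to do?",
        "buttons": options,
    }


def respond(reply_text, live_support_available=False):
    """Return the bot's reply, falling back to an error message if it is empty."""
    # Never leave the user with a blank space.
    if not reply_text or not reply_text.strip():
        return build_fallback(live_support_available)
    return {"text": reply_text, "buttons": []}


print(respond("", live_support_available=True))
```

Routing every outgoing reply through a single function like `respond` is one way to guarantee that an empty or failed answer always turns into an apology with clear next steps.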
Stick to the Truth
This is probably the most important tip of all. Here are a few tips for keeping it real:
- Start by making it clear that your users are chatting with a machine.
- Be clear about the capabilities and limitations of your chatbot. People will know what to ask your bot and you will avoid disappointing users who try to use features that don’t exist.
- Be transparent in dealing with errors (see the previous section).
Comparison of Major Platforms
In this section, I will analyze platforms from three major vendors: API.AI (acquired by Google), LUIS (Microsoft), and Amazon Lex.
API.AI
API.AI (formerly Speaktoit) was founded in 2010 and focused on human-computer interaction through natural language conversations. Their first product was Assistant (by Speaktoit), a conversational assistant for mobile phones. In September 2014, Speaktoit made the service that powered Assistant public under the name API.AI. In September 2016, the company was acquired by Google.
LUIS (Language Understanding Intelligent Service)
LUIS is part of Microsoft’s Cognitive Services, a collection of tools whose goal is to “Enable natural and contextual interaction with tools that augment users’ experiences using the power of machine-based intelligence.” LUIS is a language understanding service, so it focuses on NLP tasks such as intent recognition and slot filling, but it lacks the features needed to deploy a chatbot. It should be used together with other Cognitive Services, such as the Bing Speech API for voice interactions, and with the Microsoft Bot Framework for chatbot features.
Amazon Lex
Lex is Amazon’s service for building conversational interfaces into any application using voice and text. It is powered by the same algorithms used by Amazon Alexa and offers the integrations needed to quickly build a chatbot. It was released in April 2017, so it is still quite new, which we should keep in mind during our comparison. An important feature of Lex is its integration with AWS services like AWS Lambda.
|Exporting/importing training data||✓||✓|
|Slot matching with ML||✓||✓|
All providers offer a visual interface for all the tasks required, which makes them easy to use, even for those who aren’t tech savvy. I personally found the API.AI and LUIS interfaces easier to use while Amazon Lex’s interface is a bit less intuitive. Sometimes Lex’s interface was unresponsive, but I think this is probably due to the fact that the service is still quite new.
API.AI and Lex offer some pre-built chatbots that can be used as a starting point for developing your own functionality. API.AI has several pre-built options and also includes a nice small-talk module that makes it easy to make your bot look smarter. Lex has some useful built-in intents for handling frequent events in a task-driven conversation (cancel, start over, help, etc.). LUIS has 21 pre-built intent/slot pairs that can add extra functionality to your chatbot.
With API.AI and LUIS, you can import and export training data in JSON format. I found this feature really useful because it can be used to programmatically generate training data. Here, API.AI goes one step further, allowing you to not only import an app but also merge the current app into the imported one.
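As a rough sketch of what programmatically generating training data can look like, the Python snippet below expands a few sentence templates into labeled examples and serializes them to JSON. The schema (`text`, `intent`, `slots`) is made up for illustration and is not the actual import format of API.AI or LUIS, so the output would have to be mapped to the platform's real schema before importing.

```python
import json
from itertools import product


def generate_examples(templates, slot_name, slot_values, intent):
    """Expand every template with every slot value into a labeled example."""
    examples = []
    for template, value in product(templates, slot_values):
        text = template.format(**{slot_name: value})
        start = text.index(value)  # character offset of the slot value
        examples.append({
            "text": text,
            "intent": intent,
            "slots": [{
                "name": slot_name,
                "value": value,
                "start": start,
                "end": start + len(value),
            }],
        })
    return examples


examples = generate_examples(
    templates=["I want a pizza {pizza_type}", "I want to order a pizza {pizza_type}"],
    slot_name="pizza_type",
    slot_values=["margherita", "napoletana"],
    intent="order_pizza",
)
# Hypothetical schema: adapt the keys to the target platform before importing.
training_json = json.dumps(examples, indent=2)
print(training_json)
```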
Integration with Messaging Platforms
|Actions on Google||✓|
API.AI has the most impressive set of direct integrations. Since LUIS is basically a language understanding API, it completely lacks this kind of feature. LUIS users should use the Microsoft Bot Framework (which is an SDK, not a cloud service) to easily create these integrations.
Programming Language Support
|iOS / Watch OS / Mac OS X||✓|
All three services offer SDKs for different languages. It should be noted that all the APIs are easy to use, so the presence of “official support” is not critical.
I put together a small dataset to compare the quality of the natural language processing on each platform. I used three different intents: “find a restaurant,” “find a hotel,” and “order a pizza.” The restaurant and hotel intents need to extract the name of the city from the message. This slot matching task is natively supported by all three platforms and is quite easy to set up. Pizza ordering is more challenging because the engine needs to match the pizza type in each message. Only a couple of pizza types are provided as possible slot values.
Here are the phrases provided as training data for the three intents:
- Restaurant Milan
- Find me a restaurant in Paris
- I want to eat in London
- I would like to book a hotel in London
- I want to stay in Rome
- I want to book a hotel in London
- Hotel in Monaco
- I want a pizza siciliana
- I want a pizza napoletana
- I want to order a pizza margherita
These are the possible slot values provided during training:
- Pizza type
- quattro formaggi
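A comparison like this can be scored with a few lines of Python. The sketch below computes intent accuracy and slot accuracy for one platform; the `score` function, the record layout, and the sample predictions are placeholders invented for illustration and are not measurements from this comparison.

```python
def score(predictions, expected):
    """Return (intent_accuracy, slot_accuracy) for paired results."""
    if not expected or len(predictions) != len(expected):
        raise ValueError("predictions and expected must be non-empty and equal in length")
    pairs = list(zip(predictions, expected))
    intent_hits = sum(p["intent"] == e["intent"] for p, e in pairs)
    slot_hits = sum(p.get("slot") == e.get("slot") for p, e in pairs)
    return intent_hits / len(pairs), slot_hits / len(pairs)


# Expected labels for two test sentences.
expected = [
    {"text": "I want to stay in Como", "intent": "find_hotel", "slot": "Como"},
    {"text": "I want to order a pizza boscaiola", "intent": "order_pizza", "slot": "boscaiola"},
]
# Placeholder predictions for a hypothetical platform (not real results).
predictions = [
    {"intent": "find_hotel", "slot": None},
    {"intent": "order_pizza", "slot": "boscaiola"},
]

intent_accuracy, slot_accuracy = score(predictions, expected)
print(f"intent accuracy: {intent_accuracy:.0%}, slot accuracy: {slot_accuracy:.0%}")
```

Keeping intent and slot accuracy separate matters here, because a platform can match the intent correctly and still miss the slot value.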
Hotel Booking Results
All of the platforms performed well on this task. The only mistake was made by LUIS on the phrase “I want to stay in Como,” where it failed to recognize Como as a city in Italy (the name is also shared by the famed Lake Como).
Restaurant Booking Results
API.AI got a perfect score in this task, LUIS did pretty well, and Lex seemed to struggle. The phrase that caused the most mistakes was “I’m in Rome and I want to eat.” Both LUIS and Lex failed to match the intent. It is true that they didn’t have a similar phrase in the training set, but the “I want to eat” part was in it and should have guided the matching.
Pizza Ordering Results
This task was the most challenging in slot matching because the slot “pizza type” contains custom values that probably aren’t present in the vocabulary of these platforms.
API.AI struggled in this more difficult task, while LUIS and Lex provided acceptable results. One test sentence worth highlighting is “I want to order a pizza boscaiola,” which looks a lot like the training sentence “I want to order a pizza margherita” except for the pizza type. The pizza type “boscaiola” wasn’t in the dataset, so recognizing it is a big challenge. API.AI recognizes the intent of this phrase but fails to recognize the pizza type, which suggests that API.AI only matches slot values it has seen in the training dataset and does not generalize. Lex fails completely on this sentence.
LUIS, by contrast, recognizes both the intent and the pizza type, which, in my opinion, is an incredible result.
| Pricing | API.AI + Google Cloud Speech | LUIS + Bing Speech-to-Text API | Lex |
|---|---|---|---|
| Free text requests per month | ∞ | 10,000 | 10,000 |
| Price per 1,000 text requests | $0 | $0.75 | $0.75 |
| Free voice requests per month | 60 minutes | 5,000 | 5,000 |
| Price for voice requests | $0.006 per 15 s | $0.004 per request (max 15 s) | $0.004 per request |
Let’s make these numbers more concrete by plugging them into a plausible use case. Suppose we get 100,000 text requests and 30,000 speech requests each month, and that each speech request lasts 10 seconds on average. That adds up to 300,000 seconds, or 5,000 minutes, of speech per month.
| Cost scenario | API.AI + Google Cloud Speech | LUIS + Bing Speech-to-Text API | Lex |
|---|---|---|---|
| Total cost of text requests | $0 | (100,000 - 10,000) * $0.00075 = $67.50 | (100,000 - 10,000) * $0.00075 = $67.50 |
| Total cost of voice requests | (5,000 - 60) * $0.024 = $118.56 | (30,000 - 5,000) * $0.004 = $100 | (30,000 - 5,000) * $0.004 = $100 |
Of course, one can decide to use API.AI with Bing Speech-to-Text API, further lowering costs.
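For readers who want to plug in their own traffic numbers, here is a small Python sketch of the same arithmetic. The prices and free tiers are the ones quoted in the pricing table above and will have changed since; the function names are mine, and the sketch bills voice minutes exactly as the table does, without modeling any per-request rounding a provider might apply.

```python
TEXT_REQUESTS = 100_000
VOICE_REQUESTS = 30_000
AVG_VOICE_SECONDS = 10


def text_cost(requests, free_requests, price_per_1000):
    """Cost of text requests beyond the free tier."""
    billable = max(0, requests - free_requests)
    return billable * price_per_1000 / 1000


def voice_cost_per_request(requests, free_requests, price_per_request):
    """Voice cost when the provider bills per request."""
    return max(0, requests - free_requests) * price_per_request


def voice_cost_per_minute(requests, avg_seconds, free_minutes, price_per_15s):
    """Voice cost when the provider bills by duration in 15-second units."""
    minutes = requests * avg_seconds / 60
    return max(0, minutes - free_minutes) * price_per_15s * 4


def monthly_costs():
    """Total monthly cost per option, rounded to cents."""
    per_request_total = (
        text_cost(TEXT_REQUESTS, 10_000, 0.75)
        + voice_cost_per_request(VOICE_REQUESTS, 5_000, 0.004)
    )
    return {
        "API.AI + Google Cloud Speech": round(
            text_cost(TEXT_REQUESTS, float("inf"), 0.0)
            + voice_cost_per_minute(VOICE_REQUESTS, AVG_VOICE_SECONDS, 60, 0.006),
            2,
        ),
        "LUIS + Bing Speech-to-Text API": round(per_request_total, 2),
        "Lex": round(per_request_total, 2),
    }


for name, cost in monthly_costs().items():
    print(f"{name}: ${cost:.2f}")
```

With the scenario above, the totals come to $118.56 for API.AI with Google Cloud Speech and $167.50 each for LUIS with Bing Speech-to-Text and for Lex.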
In my opinion, API.AI is the best service if you want to start quickly (it offers a lot of built-in functionalities) or if your chatbot doesn’t require a powerful slot matching algorithm.
LUIS has the most powerful NLP engine but requires more effort to build a fully functioning chatbot app. With LUIS, you will have to host the bot logic yourself and use different products to communicate with messaging platforms and to enable speech recognition.
Amazon Lex lies somewhere in the middle. With Lex, it is easy to get started and it offers support for the major messaging platforms and speech recognition out of the box. If you are already an AWS user and if you are used to AWS Lambda, Lex is probably the best choice for you.