Enabling Speech

Overview
Difficulty: Intermediate
Duration: 24m
Students: 343
Rating: 4.8/5

Description

Overview

The ‘Building a Chatbot on Azure’ course will allow team members to learn how to automate basic support tasks by using chatbots to answer typical questions about their products and/or services.

In this course, you will learn to create a chatbot that answers support questions about specific products and services. You will also learn how to combine the Azure Bot Service with Azure QnA Maker, and how to add speech input and output capabilities to help customers on mobile devices and those with impaired sight.

This course is made up of 5 lectures and requires some previous knowledge of Azure and coding.

Learning Objectives

  • Create and configure an Azure QnA Maker knowledge base
  • Create an Azure Bot Service chatbot that answers questions
  • Enable speech recognition and synthesis on an Azure chatbot

Intended Audience

  • Those interested in artificial intelligence services on Azure, especially chatbots

Prerequisites

  • Previous experience using Azure
  • Previous experience with writing code

Resources

The GitHub repository for this course is at https://github.com/cloudacademy/azure-recommendation-engine.

Transcript

The last step to accomplish our goal is to get our chatbot to support speech input and output. To add speech to our web chat, we have to add another channel, so go back to the Channels page for your bot.

 

To connect a custom app to your bot, you need to use the Direct Line channel. This is the icon for it. Once again, it has secret keys so your code can authenticate with the bot.

 

The code for the speech-enabled version of the web chat is much longer than the previous version, so I put it in the GitHub repository for this course. By the way, I didn’t write this code. It’s a sample from Microsoft. Download index.html from the repository. Then go back into the App Service Editor, right-click in the files area, and select “Upload Files”. Select the index.html file.

 

Let’s have a look at the code. If you scroll down to about the middle of the file, you’ll see five options for speech. All of them should be commented out except for one. At the moment, option 2 is uncommented. This is the easiest option (other than option 1, which is “no speech”), but it only works on the Chrome browser.

 

If you’re not already in Chrome, then open it and paste the app’s URL in the address bar. Then go back to the Direct Line channel configuration and copy the first secret key. Now go back to the address bar in Chrome, add “?s=” and then paste the secret key.
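
The URL you end up with can be sketched as follows. This is a minimal Python illustration, not part of the Microsoft sample: the app hostname and the secret are placeholders, and the helper name build_webchat_url is hypothetical. Only the "s" query parameter comes from the steps above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_webchat_url(app_url: str, direct_line_secret: str) -> str:
    """Append the Direct Line secret to the app URL as the `s` query parameter."""
    return f"{app_url}?{urlencode({'s': direct_line_secret})}"


# Placeholder values for illustration only; use your own app URL and secret key.
url = build_webchat_url("https://example-bot.azurewebsites.net/", "PLACEHOLDER_SECRET")
print(url)

# Reading the parameter back shows what a page can recover from its own URL.
secret = parse_qs(urlsplit(url).query)["s"][0]
```

Because the secret travels in the URL, anyone who sees the address bar can read it, which is fine for a quick test but not for a public page.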

 

First, let’s make sure it’s connected to the bot. Type “languages” again. Good, it’s working. Now let’s try speech input. Let’s ask it something more interesting, like how to remove a bot. When you click the microphone icon, your browser might ask you if you want to allow it to use your microphone. If it does, then click “Allow”. Great, it read the answer too.

 

Now let’s go back to the code and check out the next option. First, comment out Option 2 and uncomment Option 3. Option 3 works on a variety of browsers, so it’d be a better choice for a customer-facing chatbot. It uses Cognitive Services speech recognition, so we need to create an instance of that service before we continue.

 

Go to the portal and search for “speech”. The one we want is “Bing Speech”. Select it, then click “Create”.

 

Now go to the resource so we can get a key. Click “Keys” and copy the first key. Then go back to the editor and paste the key into these two places. The editor saved the file, so go back to the website tab, copy the URL, and paste it into a browser other than Chrome.

 

It was pretty easy to tell that we used a different speech service that time, right? If you look at the code again, you’ll see that it specified a female voice.

 

I’m not going to demo option 4 because it requires significantly more setup, but I’ll tell you about it. The problem with embedding the secret key in your web page is that anyone can view the source and find out what it is. Then they could access your speech service without going through your web page.

 

The solution is to retrieve a token using a secure backend that doesn’t expose the secret key. The token would expire, so it couldn’t be used to access your speech service from somewhere else. You’ll notice that the code here includes the secret key again. That’s because it’s just showing you what code to put in the secure backend. You’re not supposed to leave this part of the code in this web page.
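
A secure backend for the speech token could look roughly like the sketch below. It is not the code from the Microsoft sample. It assumes the Cognitive Services token endpoint and the Ocp-Apim-Subscription-Key header that the Bing Speech service used, so check both against your own resource before relying on them. The function names are hypothetical.

```python
import urllib.request

# Assumed token endpoint for the Bing Speech service; confirm it for your resource.
TOKEN_URL = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken"


def build_speech_token_request(subscription_key: str) -> urllib.request.Request:
    """Build the POST request that exchanges the secret key for a short-lived token."""
    return urllib.request.Request(
        TOKEN_URL,
        data=b"",  # empty request body
        method="POST",
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
    )


def fetch_speech_token(subscription_key: str) -> str:
    """Send the request and return the response text (assumed to be the token).

    Run this on the server only, so the key never reaches the browser.
    """
    request = build_speech_token_request(subscription_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

The web page would then ask your backend for a token and hand that token to the speech code, so the key itself stays on the server.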

 

There’s a similar problem with the secret key for the Direct Line channel that you had to add to the end of the URL in the browser. In order for your users to access your bot, you’d have to give them the secret key as part of the URL. To get around this problem, your code would have to get a token using a similar mechanism to what’s shown for the speech service token.
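
For the Direct Line secret, the same idea can be sketched like this. It is a minimal illustration that assumes the Direct Line 3.0 "generate token" endpoint and a JSON response containing a "token" field. The function names are hypothetical and none of this comes from the sample file.

```python
import json
import urllib.request

# Assumed Direct Line 3.0 endpoint for exchanging a secret for a token.
GENERATE_URL = "https://directline.botframework.com/v3/directline/tokens/generate"


def build_direct_line_token_request(secret: str) -> urllib.request.Request:
    """Build the POST request a backend sends, authenticated with the secret."""
    return urllib.request.Request(
        GENERATE_URL,
        data=b"",  # empty request body
        method="POST",
        headers={"Authorization": f"Bearer {secret}"},
    )


def extract_token(response_body: bytes) -> str:
    """Pull the token out of the JSON response body."""
    return json.loads(response_body)["token"]


def fetch_direct_line_token(secret: str) -> str:
    """Run on the server only; return a token that is safe to pass to the page."""
    request = build_direct_line_token_request(secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_token(response.read())
```

With this in place, the page receives only an expiring token, and the secret key no longer needs to appear in the URL.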

 

OK, just for completeness, option 5 is to use another speech service, so there isn’t much code here.

 

And that’s it for building our chatbot.

About the Author

Students: 22539
Courses: 43
Learning paths: 29

Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).