Enabling Speech

Overview
Difficulty: Intermediate
Duration: 23m
Students: 115
Rating: 5/5
Description

In this course, you will learn how to create a chatbot that answers support questions about specific products and services. You will also learn how to combine the Azure Bot Service with Azure QnA Maker, and how to add speech input and output to help customers on mobile devices and those with impaired sight.

This course requires some previous knowledge of Azure and coding.

Learning Objectives

  • Create and configure an Azure QnA Maker knowledge base
  • Create an Azure Bot Service chatbot that answers questions
  • Enable speech recognition and synthesis on an Azure chatbot

Intended Audience

  • Those interested in artificial intelligence services on Azure, especially chatbots

Prerequisites

  • Previous experience using Azure
  • Previous experience with writing code

Resources

The GitHub repository for this course is at https://github.com/cloudacademy/azure-chatbot.

Transcript

The last step to accomplish our goal is to get our chatbot to support speech input and output. First, I’ll show you a simple way to add speech, although it’s not suitable for production use, and then I’ll explain how you would use a more robust method.

The code I’m going to use for a speech-enabled version of the web chat is much longer than the other one, so I put it in the GitHub repository for this course. By the way, I didn’t write this code. It’s a sample from Microsoft. Download speech.html from the repository. Then go back into the App Service Editor, right-click in the files area, and select “Upload Files”. Select the speech.html file.

This code only works on some browsers, such as Chrome, Edge, and Safari. In one of the supported browsers, paste the app’s URL in the address bar. Then go to the Web Chat channel configuration and copy the first secret key. Now go back to the address bar in the browser, add “?s=” and then paste the secret key. You wouldn’t want to use this method in production because you’d have to reveal this key to all of your users, which would be a security problem.
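To make the security concern concrete, here is a minimal Python sketch of what the "?s=" step amounts to. The page URL and secret are hypothetical placeholders, and the helper names are invented for illustration; they are not part of the Microsoft sample. The point is only that anything placed in a query string can be read back by whoever sees the URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_chat_url(page_url: str, secret: str) -> str:
    """Append the Web Chat secret as the `s` query parameter."""
    return f"{page_url}?{urlencode({'s': secret})}"


def extract_secret(chat_url: str) -> str:
    """Read the secret back out of the URL, as any recipient of the link could."""
    return parse_qs(urlsplit(chat_url).query)["s"][0]


# Hypothetical values, for illustration only.
url = build_chat_url("https://my-bot.azurewebsites.net/speech.html", "EXAMPLE_SECRET")
print(url)
print(extract_secret(url))
```

Because the secret travels in the address bar, it also ends up in browser history and in any link a user shares, which is why this approach is limited to testing.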

First, let’s make sure it’s connected to the bot. Type “languages” again. Good, it’s working. Now let’s try speech input. Let’s ask it something more interesting, like how to remove a bot. When you click the microphone icon, your browser might ask you if you want to allow it to use your microphone. If it does, then click “Allow”. Great, it read the answer too.

Now, to add speech to your bot in a production-ready way, here’s what you’d need to do. First, to make sure it will work on all browsers, create an instance of the Speech service, which is part of Cognitive Services. Then create a Direct Line Speech channel in your bot and connect it to the Speech service you created. Next, make sure that your App Service has WebSockets enabled so that the bot can communicate with the Direct Line Speech channel. You also need to select “Enable Streaming Endpoint” in your bot’s configuration.

Finally, you need to add code to your bot that uses the Speech SDK to do speech recognition and speech synthesis. The code will have to authenticate with the Speech service first. To avoid having to expose the secret key for the Speech service, this code would need to retrieve a token from the Speech service. The token would expire after 10 minutes, so even if an attacker got hold of it, they couldn’t use it to access your Speech service after that.
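The token exchange described above can be sketched in Python as follows. This is not the course's code or the Speech SDK. It assumes the Speech service's REST token endpoint (`/sts/v1.0/issueToken`, called with the `Ocp-Apim-Subscription-Key` header), so check that against the current Azure documentation before relying on it. The `TokenCache` class, the one-minute refresh margin, and the region value are hypothetical choices made for illustration.

```python
import time
import urllib.request
from typing import Callable, Optional

TOKEN_LIFETIME_SECONDS = 10 * 60  # token lifetime stated in the transcript
REFRESH_MARGIN_SECONDS = 60       # assumption: refresh one minute before expiry


def fetch_speech_token(region: str, subscription_key: str) -> str:
    """Exchange the Speech resource key for a short-lived token (server side only)."""
    url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    request = urllib.request.Request(
        url,
        data=b"",  # empty POST body
        method="POST",
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


class TokenCache:
    """Reuse a token until it is close to expiring, then fetch a new one."""

    def __init__(self, fetch: Callable[[], str],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        age = self._clock() - self._fetched_at
        if self._token is None or age >= TOKEN_LIFETIME_SECONDS - REFRESH_MARGIN_SECONDS:
            self._token = self._fetch()
            self._fetched_at = self._clock()
        return self._token
```

In use, the server would build something like `TokenCache(lambda: fetch_speech_token("westus", key))` and hand only the result of `get()` to the browser, so the key itself never leaves the server.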

It takes quite a while to set all of this up, and it’s outside the scope of this course, so I won’t be demonstrating it. If you’re interested in learning more, you can read the tutorial at this URL.

And that’s it for building our chatbot.

About the Author
Guy Hummel
Azure and Google Cloud Content Lead
Students: 117834
Courses: 66
Learning Paths: 88

Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).