Assessing the Features of Android Devices with Kotlin and Java

Converting Speech to Text


In this course, we'll cover how to access the features of Android devices. You'll learn how to send an SMS, send an email, make a call, and convert speech to text on Android.

Intended Audience

This course is intended for anyone who wants to learn how to start building their own apps on Android.


Prerequisites

To get the most out of this course, you should have some basic knowledge of the fundamentals of Android.



All right. Hello everyone. So, we've got a doozy for you. In this video, we are going to learn how to convert speech into text, still as part of the 4th section, accessing the features of Android devices. So, guess what? We're going to reach the device's microphone, convert what is said into text, and then print this text within the app. This topic is appropriately called speech to text in the literature, right? If you want to go deeper into the subject, you can just search for speech to text. But I think if you're anything like me, you're just going to want to get into Android Studio and get to the code.

All right, so you see here, I created a new project for this lesson in Android Studio. Name of the project: Speech to text. We're going to create everything from scratch in this project, but I want to show you how the application works first before we go any further. So, there's one TextView, and in the TextView I tell the user what to do: "you need to press the microphone button to translate speech into text," okay? Then there's a, let's call it a microphone-looking image, here. When the user clicks on the microphone, Google's speech listener opens. What the user says is taken in, converted to text, and printed in the TextView in our application. Now, obviously, this is not really working here because this is a virtual device. If you try this application on a real device, you will see that it works. It may run on some virtual devices on your computer, because not all virtual devices have the same features. But if you do happen to have a device, go ahead and hook it up. Otherwise, let's get developing.

So, first I want to add a TextView component to the design area. The width of the TextView can be 350dp. I'm changing the text of the TextView to "Please tap microphone to speak." You can write whatever description you like here. We'll also increase the text size of the TextView to 20sp, and the text color will be black. I'll also add an image button, but I've got to do something first. I want to right-click the drawable folder and select the new 'Vector Asset' option. All right. It's in this section that I'll add the microphone icon as an XML file, which I will then use for the image button.

So, when you click on Clip Art here, you will see icons grouped by category, right? I'll choose the microphone. Now, we can also change the color of the microphone. For example, I can turn it blue. And let me just change the name of the icon too. I'll write "mic" for the name. When I click 'Next' and 'Finish,' an XML file is created under the drawable folder. All right, so I'll add an image button now. The source of the image button will be the XML file that I created. And that way, we get a mic-shaped button.

So, let's make the height and the width of this button 150dp. Also, I'll select the button and set the scaleType option to fitCenter. Okay, so that is the mic icon. Center the button, looks pretty good. Why don't we make the background tint white? All right, so now let's determine the constraints of these components. I want to first center these components horizontally, and then set the vertical constraints. The top constraint value of the image button can be 20dp. The top constraint value of the TextView can also be 20dp. Then finally, let's check out their IDs. The ID of the TextView is textView. The ID of the image button is imageButton. Cool. And there you go, that's the design part, all complete.

So, let's go to the Kotlin code. First, I'll define the components. The TextView is called resultText, because what I'm going to do is print the text of the speech in this TextView. And then in the onCreate method, I'll match the components with their IDs. And of course, I will add a ClickListener to the image button. All right, cool. So, nothing new here, right? You would probably be able to do all of this without my guidance, and that's cool. But now, we're going to get into the part where we translate the speech into text.
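The setup described above could look roughly like the sketch below. It is only an illustration: the class name MainActivity, the layout file activity_main, and the variable name micButton are assumptions, while the view IDs textView and imageButton are the ones checked in the design step.

```kotlin
import android.os.Bundle
import android.widget.ImageButton
import android.widget.TextView
import androidx.appcompat.app.AppCompatActivity

// Assumes this file lives in the app's own package, so the generated R class resolves.
class MainActivity : AppCompatActivity() {

    // TextView that will later show the recognized speech.
    private lateinit var resultText: TextView

    // Hypothetical name for the microphone button; the lesson does not name it.
    private lateinit var micButton: ImageButton

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main) // assumed default layout name

        // Match the components with the IDs from the layout.
        resultText = findViewById(R.id.textView)
        micButton = findViewById(R.id.imageButton)

        micButton.setOnClickListener {
            // The speech-to-text function is called from here once it exists.
        }
    }
}
```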

So, I'm creating a new function for this process. The name of the function is going to be convertSpeech, and it's going to be a function without parameters. We will use the Intent class to translate speech into text. So, first I'll create a new intent: val intent = Intent. Inside the parentheses I'll write RecognizerIntent.ACTION_RECOGNIZE_SPEECH. That way, the speech listener is going to be launched. Before it's launched, I need to define certain features of the speech listener. For this, I write intent.putExtra, and inside it RecognizerIntent.EXTRA_LANGUAGE_MODEL. Here I can define which model the speech listener will work on. After the comma, I'll write RecognizerIntent.LANGUAGE_MODEL_FREE_FORM. In other words, the speech listener will work in the free-form model, okay?

So, after typing intent.putExtra on the next line again, I'll write RecognizerIntent.EXTRA_LANGUAGE. Here, I define the language in which the converter will work, so after the comma I'll write Locale.getDefault(). Whatever the language of the device is, the speech listener will work in that language. For example, if the language of the device is English, the listener will listen for English and turn it into English text. If the language of the device is Turkish, it will listen for Turkish and produce Turkish text, you know what I'm saying?
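As a rough sketch, the intent described above can be built like this. The helper name buildSpeechIntent is made up for illustration (the lesson writes these lines directly inside convertSpeech), and the action constant is spelled RecognizerIntent.ACTION_RECOGNIZE_SPEECH in the Android SDK.

```kotlin
import android.content.Intent
import android.speech.RecognizerIntent
import java.util.Locale

// Hypothetical helper: returns the intent that opens the speech listener.
fun buildSpeechIntent(): Intent {
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)

    // Free-form model: general dictation rather than short search queries.
    intent.putExtra(
        RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
    )

    // Follows the lesson and passes the device locale. The Android reference
    // describes this extra as a language tag string, so
    // Locale.getDefault().toLanguageTag() is an alternative worth considering.
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())

    return intent
}
```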

So, the Intent is complete. But wait, we need to run this Intent. Here we need to start the Intent with a method other than startActivity. You see, after the user speaks into the microphone, what they say will be converted to text, and we need to take this text and display it in the TextView. So, we'll not only start the Intent but also follow the outcome of the Intent. At this point, there's something that I need to tell you. To track the result of an Intent, we used the startActivityForResult method before. But this method is now deprecated. Instead, there's ActivityResultLauncher, and this is the class we will use now. I just want to show you both methods, because that way you'll recognize either one, all right? So, we're going to use the startActivityForResult method first.

You might be able to guess your way through, but I'll show you this way. So, the method takes two parameters: the first is the Intent, the second is the requestCode. I'm writing 1 as the requestCode. You can write any number here. You only need to check for the same requestCode when getting the results. So, the speech was translated into text. Now, let's take the translated text and use it wherever we want. I'm going to do this by overriding the onActivityResult method outside the onCreate function. I'm writing an if condition first. All right. If requestCode == 1: because we wrote 1 as the requestCode above, I have to check for the same code. And if resultCode == RESULT_OK, that is, if the operation succeeded. And lastly, if data != null. Inside this if, we'll write the actions that we want to perform. The first thing we'll do is create a String ArrayList. This ArrayList keeps the translated speech as strings. I'll write speakResult as the name for the ArrayList, and I'll fill it from the data parameter. After the equals sign, I'll write data.getStringArrayListExtra, okay?

So, the parameter of this method will be RecognizerIntent.EXTRA_RESULTS. And now we've got to do a type conversion here, because we should get what the user said as an ArrayList. That's why I click on the red light bulb here and choose the cast option. Now, since the code here does not fit the screen, we can break it onto the next line just by pressing 'Enter' here. All right, so we've taken this data and assigned it to the String ArrayList.

Now we need to write the strings inside this ArrayList to the TextView component. The name of the TextView was resultText, so I'll print on resultText. All right: resultText.text, and after the equals sign, I'll write speakResult[0]. This way I'm writing the string at index zero, which is the first element, to the TextView, okay? So, printing the result is taken care of. As you can see, this is how the startActivityForResult method gets used. But now let's try to get the result of the Intent by using the up-to-date ActivityResultLauncher approach.
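Put together, the older approach walked through above might look like the following sketch. The class name, the layout name, and the SPEECH_REQUEST_CODE constant are assumptions for illustration. Where the lesson casts the extra and reads speakResult[0], this version uses a null-safe firstOrNull() so an empty result does not crash.

```kotlin
import android.content.Intent
import android.os.Bundle
import android.speech.RecognizerIntent
import android.widget.TextView
import androidx.appcompat.app.AppCompatActivity
import java.util.Locale

// Hypothetical activity showing only the deprecated startActivityForResult flow.
class LegacySpeechActivity : AppCompatActivity() {

    private lateinit var resultText: TextView

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main) // assumed layout name
        resultText = findViewById(R.id.textView)
    }

    @Suppress("DEPRECATION")
    private fun convertSpeech() {
        val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        intent.putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())
        // Start the intent and ask for its result under our request code.
        startActivityForResult(intent, SPEECH_REQUEST_CODE)
    }

    @Deprecated("Uses the deprecated onActivityResult callback")
    @Suppress("DEPRECATION")
    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        super.onActivityResult(requestCode, resultCode, data)
        // Same request code as above, a successful result, and non-null data.
        if (requestCode == SPEECH_REQUEST_CODE && resultCode == RESULT_OK && data != null) {
            val speakResult: ArrayList<String>? =
                data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            // Show the first recognized phrase, or nothing if the list is empty.
            resultText.text = speakResult?.firstOrNull() ?: ""
        }
    }

    companion object {
        // Any number works as long as both places use the same one.
        private const val SPEECH_REQUEST_CODE = 1
    }
}
```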

So, first of all, we'll create an object of the ActivityResultLauncher class in the global area. Here, I'll write lateinit var activityResultLauncher: ActivityResultLauncher. We also need to specify the type parameter of this class. The type parameter is the input the launcher is started with, and we start it with an Intent, so its type will be Intent. Now we'll need to register it in the onCreate method. Now, this is important. You must register the ActivityResultLauncher object early, in the onCreate method; otherwise, your application will not work. So, let's get to it then.

So, I'm typing activityResultLauncher =, and after the equals sign, I'll write registerForActivityResult to register the object. Notice this method takes two parameters. The first parameter is an ActivityResultContract. If you type StartActivityForResult here and press 'Enter', Android Studio completes it to ActivityResultContracts.StartActivityForResult(), and this is a standard contract. The second parameter is a callback. We will get the returned result through this parameter. Therefore, I'll write ActivityResultCallback here and press 'Enter'. As you can see, the callback has been created.

Now, here we can capture the result using the it keyword. We can even name the lambda parameter ourselves. So, I'll write result and put the lambda arrow, ->. All right. Now we can get the result of the Intent just by using the name result. First, let's get the result code and data. Here I am writing val resultCode = result.resultCode. On the next line, I'll write val data = result.data. Now, let's check this data again using an if statement: if resultCode == RESULT_OK and data is not null, we can take the incoming data and use it as we want. The action that we'll do here is the same as what we did in the onActivityResult method. That is, we're just transferring the data to an ArrayList and printing it to the TextView component, right? That's why I'm just copying the code that I wrote there and pasting it here.
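The registration and callback described above can be sketched as follows. The class and layout names are assumptions, and the trailing lambda stands in for an explicit ActivityResultCallback object. As in the previous approach, firstOrNull() is a null-safe substitute for the lesson's speakResult[0].

```kotlin
import android.content.Intent
import android.os.Bundle
import android.speech.RecognizerIntent
import android.widget.TextView
import androidx.activity.result.ActivityResultLauncher
import androidx.activity.result.contract.ActivityResultContracts
import androidx.appcompat.app.AppCompatActivity

class MainActivity : AppCompatActivity() {

    private lateinit var resultText: TextView

    // The type parameter is the launcher's input type: we launch it with an Intent.
    private lateinit var activityResultLauncher: ActivityResultLauncher<Intent>

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main) // assumed layout name
        resultText = findViewById(R.id.textView)

        // Register in onCreate, before the activity is started.
        activityResultLauncher = registerForActivityResult(
            ActivityResultContracts.StartActivityForResult()
        ) { result ->
            val resultCode = result.resultCode
            val data = result.data

            if (resultCode == RESULT_OK && data != null) {
                val speakResult =
                    data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
                // Show the first recognized phrase, or nothing if the list is empty.
                resultText.text = speakResult?.firstOrNull() ?: ""
            }
        }
    }
}
```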

All right. So, finally we'll need to launch the ActivityResultLauncher. That means I've got to write activityResultLauncher.launch, with the Intent object in the parentheses, and boom, just like that. That's it. You now know both methods. Now, in our course, we're always going to use the second one, the current method, from here on out. But you are likely to see the older method in different sources, okay? So, it's useful to know it as well.
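A minimal sketch of the launch step is shown below, written as a standalone function so it is self-contained. The function name launchSpeechRecognizer and the onRecognizerMissing parameter are hypothetical; in the lesson the launch call sits inside the parameterless convertSpeech function. The try/catch is an addition that is not in the lesson: launch can throw ActivityNotFoundException when no speech recognizer is installed, which can happen on some virtual devices.

```kotlin
import android.content.ActivityNotFoundException
import android.content.Intent
import android.speech.RecognizerIntent
import androidx.activity.result.ActivityResultLauncher
import java.util.Locale

// Hypothetical helper: builds the speech intent and launches it.
fun launchSpeechRecognizer(
    launcher: ActivityResultLauncher<Intent>,
    onRecognizerMissing: () -> Unit = {}
) {
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    intent.putExtra(
        RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
    )
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())

    try {
        // The result arrives in the callback registered with the launcher.
        launcher.launch(intent)
    } catch (e: ActivityNotFoundException) {
        // No app on this device can handle the speech recognition intent.
        onRecognizerMissing()
    }
}
```

From the button's ClickListener this could be called as launchSpeechRecognizer(activityResultLauncher) { resultText.text = "Speech recognition is not available on this device" }, where the message text is only an example.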

So, now let's just continue from where we left off. All right, so I want to comment out the code that I wrote for the old method. And now let's call the convertSpeech function inside the button's ClickListener. Now we can run the application. All right, so see, the speech listener opens when you press the microphone in the application. Of course, like I said, it doesn't actually recognize speech on the virtual device. It just shows you that the flow is working. If you test on a real device, however, you will see it works. So, I hope you do that. Okay, so that's the end of this lesson. Have fun with that, okay? Go back through it and have some more fun. I'll see you in the next video.


About the Author

Mehmet graduated from the Electrical & Electronics Engineering Department of the Turkish Military Academy in 2014 and then worked in the Turkish Armed Forces for four years. Later, he decided to become an instructor to share what he knew about programming with his students. He’s currently an Android instructor, is married, and has a daughter.
