This course introduces machine learning on Google Cloud Platform.
Learning Objectives
- What machine learning actually means
- The types of problems machine learning can solve
- The basics of how machine learning works
Intended Audience
- Anyone interested in machine learning
Prerequisites
- A basic understanding of computers
Alright, in this lesson I am going to provide a practical demonstration of Machine Learning. Now I have downloaded some code for building an image classifier. Specifically, I can use this to build a model that will determine if a photo contains either a dog or a cat.
Now, this demo is going to focus more on using the code, not writing it. Creating your own models from scratch takes some pretty advanced coding skills and a lot of math.
Alright, so here is the package that I downloaded. I am not going to provide the link to the GitHub repo because the code was actually broken. I had to make a bunch of changes to make it work. But this is not unexpected. Machine learning is evolving very quickly. Libraries and frameworks are constantly updated, causing things to break. Even if I shared my fixed code with you, by the time you tried to use it, something else would probably be broken.
Notice this code is all written in Python. Python is a very popular language for machine learning. You can see here is a script for training. This is what will build my model. And here is a script for predicting. That is, it will use the model after I build it, to classify new images. So I need to run the train script first. And then I can use the predict script to tell me if a photo contains either a cat or a dog.
Now, these scripts require images to work. For example, train.py requires a bunch of images of cats and dogs to learn from. It is also going to require some photos to use for testing and validation. Now I have already downloaded a bunch of photos and placed them in the “catsdogs” directory. Let me show you that. Alright, so here in the sample directory, I have a training directory and a validation directory. The training directory has a set of cat photos. And a set of dog photos. These are the images that the model will learn patterns from.
The valid directory also has its own set of cat and dog photos to use for validation. So the script will read in photos from “train”, a small batch at a time, so that the model can learn. And then it will test the accuracy of the model with some photos from “valid”. This process will repeat over and over again. Each full pass through the training photos is called an epoch. And hopefully, after each epoch, we will see the accuracy rate rise as the model improves.
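Before training, it can help to confirm that the train and valid directories described above actually contain photos for each class. Here is a minimal sketch, assuming each split holds one sub-directory per class. The helper name `count_images` and the list of image suffixes are made up for illustration and are not part of the downloaded package.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust to match your photos.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Return {split: {class_name: image_count}} for every split under root."""
    counts = {}
    for split_dir in sorted(Path(root).iterdir()):
        if not split_dir.is_dir():
            continue
        # Each sub-directory of a split is treated as one class.
        counts[split_dir.name] = {
            class_dir.name: sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts
```

Calling `count_images("catsdogs")` would return one entry per split, with a photo count for each class folder it finds.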
Now before I run the scripts, let’s take a peek inside. So inside of “train”, you will notice that we are using a library called Keras. Keras is like a high-level interface for the TensorFlow library. TensorFlow is a very popular, open-source software library for machine learning. It gives you access to many powerful features, and it was developed by Google. Keras makes TensorFlow a little easier to use and handles some of the complexity for you.
So here is where it imports the needed libraries. Here are some defined variables. Notice the generated model will be saved to a file called “model-resnet50-final.h5”. Here is where it defines the training and validation batches. So this will load in my cat and dog photos from the “train” directory. And this will load in the validation photos from “valid”.
Then if I scroll down a little bit, you can see that this script is actually using a pre-built model called ResNet50. ResNet50 is a residual neural network that is 50 layers deep. It is really good at classifying photos. So because I am not starting from scratch, getting a working model is going to be a lot easier. I am not gonna have to do a lot of tweaking with the algorithm. Plus, I am only going to need about 200 photos, and the whole training process shouldn’t take more than ten minutes.
These few lines of code are for tweaking some values to better fit the use case, but the bulk of the work is actually done here. ResNet50 will look at our cat and dog photos, and learn how to tell the two apart. After it has figured everything out, it will then save the model to disk.
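To give a rough idea of what a training script like this can look like, here is a minimal sketch using Keras with TensorFlow 2.x. It is not the actual train.py from the lesson. The image size, batch size, epoch count, optimizer, and the choice to freeze the ResNet50 layers are all assumptions; only the model file name and the “train” and “valid” directory names come from the walkthrough above.

```python
from pathlib import Path

# Assumed settings; only MODEL_PATH and the directory names come from the lesson.
DATA_DIR = Path("catsdogs")
IMAGE_SIZE = (224, 224)  # default input size for ResNet50
BATCH_SIZE = 32
EPOCHS = 5
MODEL_PATH = "model-resnet50-final.h5"


def load_batches(split):
    """Load batches of (image, one-hot label) pairs from 'train' or 'valid'."""
    import tensorflow as tf  # imported here so the file loads without TensorFlow

    dataset = tf.keras.utils.image_dataset_from_directory(
        str(DATA_DIR / split),
        label_mode="categorical",
        image_size=IMAGE_SIZE,
        batch_size=BATCH_SIZE,
    )
    # Apply the same pixel preprocessing ResNet50 was originally trained with.
    preprocess = tf.keras.applications.resnet50.preprocess_input
    return dataset.map(lambda images, labels: (preprocess(images), labels))


def build_model(num_classes=2):
    """Put a small new classifier on top of an ImageNet-trained ResNet50."""
    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        weights="imagenet",
        include_top=False,  # drop the original 1000-class output layer
        pooling="avg",
        input_shape=IMAGE_SIZE + (3,),
    )
    base.trainable = False  # assumption: keep the pre-trained layers fixed

    inputs = tf.keras.Input(shape=IMAGE_SIZE + (3,))
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
    )
    return model


def train():
    """Train for a few epochs, then save the model to disk."""
    model = build_model()
    model.fit(
        load_batches("train"),
        validation_data=load_batches("valid"),
        epochs=EPOCHS,
    )
    model.save(MODEL_PATH)
```

Calling `train()` would download the ImageNet weights the first time it runs, so it needs an internet connection as well as the photo directories.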
Let’s also take a look at predict.py. First, it imports the Keras libraries. I will be passing in file names, so it has to parse those from the command line. Here is where it will open the model file. And then it just loops through the list of filenames and it will start making predictions based on the model. Pretty simple.
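The steps just described can be sketched roughly like this. Again, this is not the lesson's actual predict.py. It assumes TensorFlow 2.x, a model with two softmax outputs, and the class order cat then dog. The names `parse_args`, `predict`, and `CLASS_NAMES` are made up for illustration.

```python
import argparse

MODEL_PATH = "model-resnet50-final.h5"  # file name from the lesson
CLASS_NAMES = ["cat", "dog"]            # assumed output order of the model
IMAGE_SIZE = (224, 224)                 # assumed input size


def parse_args(argv=None):
    """Read one or more image file names from the command line."""
    parser = argparse.ArgumentParser(description="Classify photos as cat or dog.")
    parser.add_argument("images", nargs="+", help="image files to classify")
    return parser.parse_args(argv)


def predict(filenames):
    """Open the saved model and print a prediction for each file."""
    import numpy as np
    import tensorflow as tf  # imported here so the file loads without TensorFlow

    model = tf.keras.models.load_model(MODEL_PATH)
    preprocess = tf.keras.applications.resnet50.preprocess_input
    for name in filenames:
        image = tf.keras.utils.load_img(name, target_size=IMAGE_SIZE)
        # The model expects a batch, so add a leading batch dimension of 1.
        batch = np.expand_dims(tf.keras.utils.img_to_array(image), axis=0)
        probabilities = model.predict(preprocess(batch), verbose=0)[0]
        summary = ", ".join(
            f"{label}: {p:.1%}" for label, p in zip(CLASS_NAMES, probabilities)
        )
        print(f"{name} -> {summary}")


# Typical entry point: predict(parse_args().images)
```

With that entry point in place, the script would be run with the image file names as arguments.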
Now that you understand what is going to happen, let’s get started. First I will run the train script to build the model. It spits out a few warnings about not finding any attached GPUs. But then it starts going through the “epochs”. By breaking training up into epochs, I can verify that I am actually making progress. Fully training a model usually takes a significant amount of time to complete. But it depends on how much CPU and RAM your machine has. You can speed things up by adding one or more GPUs (or graphics cards). Google also has TPUs (or tensor processing units) that you can use to further accelerate training. Of course, all this costs money. So you have to figure out if speeding things up is worth the additional cost. Because everything is broken up into epochs, it allows me to abort the training early if necessary.
Training on this machine is going fairly quickly, but I am still going to speed up the video a little so you don’t get bored. You can see each epoch shows the loss and accuracy for the training data. As well as the loss and accuracy for the validation data. Basically, you want the loss to get lower and the accuracy to get higher. It looks like my model is currently over 80% accurate. Now it’s over 90%. And we are done.
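To see where per-epoch numbers like these come from, here is a toy sketch in plain Python. It has nothing to do with ResNet50 or photos. It trains a tiny one-input model on made-up data, and after each full pass over the training set it reports the loss and accuracy on both the training and validation data. The data, learning rate, and epoch count are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def evaluate(data, w, b):
    """Return (average log loss, accuracy) of the model on data."""
    loss, correct = 0.0, 0
    for x, y in data:
        p = sigmoid(w * x + b)
        p = min(max(p, 1e-12), 1 - 1e-12)  # avoid log(0)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
        correct += int((p >= 0.5) == (y == 1))
    return loss / len(data), correct / len(data)


def train(train_data, valid_data, epochs=10, learning_rate=0.5):
    w, b = 0.0, 0.0
    history = []
    for epoch in range(1, epochs + 1):
        # One epoch is one full pass over the training data.
        for x, y in train_data:
            error = sigmoid(w * x + b) - y
            w -= learning_rate * error * x
            b -= learning_rate * error
        loss, acc = evaluate(train_data, w, b)
        val_loss, val_acc = evaluate(valid_data, w, b)
        history.append((loss, acc, val_loss, val_acc))
        print(
            f"Epoch {epoch}: loss={loss:.3f} acc={acc:.2f} "
            f"val_loss={val_loss:.3f} val_acc={val_acc:.2f}"
        )
    return history


def make_data(n, rng):
    """Made-up data: the label is 1 when x is positive, otherwise 0."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 1 if x > 0 else 0))
    return data


rng = random.Random(0)
history = train(make_data(200, rng), make_data(50, rng))
```

The pattern to look for is the same as in the real training run: the loss should trend down and the accuracy should trend up as the epochs go by.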
It appears that my model ended up being about 97% accurate when classifying my validation images. So it seems like it should be pretty good. Well, let’s verify that by running a few predictions. Now I have downloaded a couple of images to test out my new model.
Of course, using photos that are too obvious would be a bit boring. So first, I downloaded this picture of a cat hiding under a blanket. You can clearly see the face, but the rest of the body is hidden and the furry blanket on top might confuse the model. Next, I also downloaded a picture of a dog, but this one is a pug. So to me it sort of looks like a cat. Let’s see if my new model can correctly identify both images.
I am going to start with the cat image first. Alright, it says that there is a 69.8% chance that it is a cat. And only a 30.2% chance it is a dog. That seems fairly reasonable to me. Remember, it only was able to see the face. It wasn’t able to look at the body shape, paws or tail. Ok, next let’s try the dog picture. Well here it is 98.9% sure that it is a dog. And only gives it a 1.1% chance of being a cat.
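Notice that the two percentages for each photo add up to 100%. A common way for a classifier to produce numbers like that is a softmax function, which turns raw scores into probabilities. Here is a minimal sketch, assuming the model ends in a softmax layer. The raw scores below are made up for illustration and are not taken from my model.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the highest score keeps math.exp from overflowing.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up raw scores for one photo; a real model would compute these.
scores = {"cat": 1.2, "dog": 0.4}
probabilities = dict(zip(scores, softmax(list(scores.values()))))

for label, p in probabilities.items():
    print(f"{label}: {p:.1%}")
```

The class with the larger raw score always gets the larger probability, and the gap between the scores controls how confident the prediction looks.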
So that seems pretty good. Especially considering the low training time and small number of photos I used. I would say this model is a success. Now let’s say my model was bad at predicting my final pictures. At that point, I might need to go into the code and tweak a few things. I might also have to change some of my training and validation photos. You need a good amount of training data so that the model can learn everything it needs to know. You also need a good amount of testing data so that you can determine that the model is accurate. Now, if in the future there were new breeds of dogs or cats, I might need to retrain this model with new pictures to keep it accurate.
So you should now have a good idea of how machine learning works in practice. Obviously, there is a lot of “magic” happening inside the code. And you should be aware that normally it does not work on the first try. You usually have to change a lot of things and experiment a bit to get something that actually works.
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.