Training Your First Neural Network
Machine learning is a hot topic these days and Google has been one of the biggest newsmakers. Google’s machine learning is being used behind the scenes every day by millions of people. When you search for an image on the web or use Google Translate on foreign language text or use voice dictation on your Android phone, you’re using machine learning. Now Google has launched AI Platform to give its customers the power to train their own neural networks.
This is a hands-on course where you can follow along with the demos using your own Google Cloud account or a trial account.
- Describe how an artificial neural network functions
- Run a simple TensorFlow program
- Train a model using a distributed cluster on AI Platform
- Increase prediction accuracy using feature engineering and hyperparameter tuning
- Deploy a trained model on AI Platform to make predictions with new data
- The GitHub repository for this course is at https://github.com/cloudacademy/aiplatform-intro.
- December 20, 2020: Completely revamped the course due to Google AI Platform replacing Cloud ML Engine and the release of TensorFlow 2.
- November 16, 2018: Updated 90% of the lessons due to major changes in TensorFlow and Google Cloud ML Engine. All of the demos and code walkthroughs were completely redone.
TensorFlow is open-source software and its documentation is at tensorflow.org. One thing to keep in mind when you’re looking at TensorFlow examples is that TensorFlow has several different APIs. The low-level TensorFlow API gives you complete flexibility to build neural networks, but it requires a lot of coding and it’s harder to understand. Fortunately, there are quite a few high-level APIs, such as Keras and tf.estimator, that greatly simplify many tasks, require much less code, and are easier to understand. In this course, we’ll be using the Keras API.
To get started, we’re going to go through an example from the TensorFlow website. It creates a neural network that models a classic machine learning data set known as the Iris flower data set, which contains the lengths and widths of the petals and sepals of 150 irises. There are 50 samples each of three different species of irises: Iris setosa, Iris versicolor, and Iris virginica. Back in 1936, a man named Ronald Fisher developed a statistical model to distinguish the species from each other based on these four measurements.
This is a slightly different type of problem than the home value example I gave you earlier. In that example, the model needed to predict a number, the value of a given home. This type of model is called a regression model. With the iris example, the model needs to predict the species of a given iris rather than a number. This is called a classification model.
Okay, so let’s go ahead and build a classification model. First, we have to install TensorFlow, which is a Python library. You probably already have Python installed, but you’ll want to check the version. You’ll need Python 3.7 or higher. You can see which version you have installed by running “python3 -V”.
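If you'd rather check the version from inside Python, here's a minimal sketch. The helper name `python_is_new_enough` is just an illustration and isn't part of the course code.

```python
import sys

# Minimum interpreter version mentioned in this lesson.
MIN_VERSION = (3, 7)


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


# Report the running interpreter's version and whether it qualifies.
status = "OK" if python_is_new_enough() else "too old"
print("Python", sys.version.split()[0], status)
```

Comparing tuples rather than strings matters here, because a string comparison would rank "3.10" below "3.7".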
Okay, now there are a couple of different ways to install TensorFlow. If you want to keep your TensorFlow development isolated from the rest of your Python environment, then you can install it in a virtual Python environment. That way, you can install different versions of Python libraries than what you have in your native Python environment and won’t have to worry about the libraries needed for TensorFlow interfering with the libraries needed for your other Python applications or vice versa.
That’s the way I’m going to do it. To install the virtual environment, we need to use pip3, the Python package installer. Before using it, you should upgrade pip3 itself to the latest version with a command such as “pip3 install --upgrade pip”. You can copy the exact command from the readme file in the GitHub repository for this course, if you’d like.
Now, install the virtualenv package with a command such as “pip3 install --upgrade virtualenv”. If you already have virtualenv installed, it’ll upgrade it. To create your virtual environment, type “virtualenv” and then what you want to call the environment. I’ll call it “mlenv”. Now you can go into that virtual environment by typing “source mlenv/bin/activate”. You can tell that we’re in that environment now because the prompt starts with “(mlenv)”. If you need to exit the virtual environment at some point, type “deactivate”.
Alright, we’re finally ready to install TensorFlow. All you have to do is type “pip3 install tensorflow==2.2”. This installs TensorFlow version 2.2, which isn’t the latest version, but it’s the version that the examples in this course work with.
The installation will take a while, so now let’s have a look at the sample code for the Iris dataset in the GitHub repository. The code does five things: import and parse the data, create the model, train the model by running lots of data through it, evaluate the accuracy, and use the trained model to make predictions. I’ll show you which parts of the code do these things.
First, there are a few import statements, one of which is for TensorFlow. Then there’s a short section for handling command-line arguments. I’ve only added one argument to the script. It’s called “job-dir”, and you use it to tell the script where to save the model after it’s trained.
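Here's a rough sketch of how a single “job-dir” argument could be handled with Python's standard argparse module. The script in the repository may do it differently. The `parse_args` helper and the default value of “export” are assumptions for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Train an iris classifier")
    parser.add_argument(
        "--job-dir",
        default="export",  # hypothetical default, not taken from the course
        help="Directory in which to save the trained model",
    )
    return parser.parse_args(argv)


# argparse turns the dash in "--job-dir" into an underscore.
args = parse_args(["--job-dir", "export"])
print(args.job_dir)
```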
Next, we download the training dataset, which is in a csv file. This is actually a subset of the 150 flowers in the full dataset. The training dataset contains 80% of the iris samples. The test dataset contains the other 20%. You typically want to train your model using the majority of the data, usually 70-80% of it, and then evaluate its accuracy based on some data that it hasn’t seen before, which is why you hold back 20-30% for that purpose.
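The files used in this lesson are already split for you, but if you ever need to do it yourself, the idea looks roughly like this. It's a sketch only. The `split_dataset` name, the fixed seed, and the integers standing in for the 150 flowers are all assumptions.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it into train and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cutoff = round(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


# Stand-ins for the 150 iris samples.
samples = list(range(150))
train, test = split_dataset(samples)
print(len(train), len(test))
```

Shuffling before splitting keeps one species from ending up entirely in the test set when the source file is sorted by species.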
Once we’ve downloaded the training file, we define what’s in each column. The first four columns are the lengths and widths of the sepals and petals, which are the features of each flower. The fifth column says which of the three iris species each flower is. We need this so we can compare the model’s guess with the correct classification. This correct answer is known as the label. In this dataset, the label is a 0, a 1, or a 2, for Iris setosa, Iris versicolor, or Iris virginica, respectively.
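To make the column layout concrete, here's a small sketch that separates the features from the label in each row. The column names, their order within the first four columns, and the two sample rows are illustrative assumptions. The course script reads the real CSV files with TensorFlow instead.

```python
import csv
import io

# Assumed names and order for the four feature columns plus the label column.
COLUMN_NAMES = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
# Label 0, 1, or 2 maps to these species, as described in the lesson.
SPECIES = ["Iris setosa", "Iris versicolor", "Iris virginica"]


def parse_row(row):
    """Split one CSV row into (features, label)."""
    features = [float(value) for value in row[:4]]
    label = int(row[4])
    return features, label


# Two illustrative rows, not necessarily rows from the course's files.
sample_csv = "6.4,2.8,5.6,2.2,2\n5.0,2.3,3.3,1.0,1\n"
for row in csv.reader(io.StringIO(sample_csv)):
    features, label = parse_row(row)
    print(dict(zip(COLUMN_NAMES[:4], features)), "->", SPECIES[label])
```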
Now that we’ve defined everything, we can read the data into a TensorFlow Dataset. Unfortunately, the way that it structures this dataset is not quite what we need to run it through the model, so we need to run this function on it to turn it into what we need. The details aren’t important because this example is just intended to give you an overview of how TensorFlow programs work. Then we go through the same steps to read in the test dataset, which contains the other 20% of the iris dataset.
Okay, now we can create the model. Amazingly, this is all you need to construct a neural network. That’s because we’re using the high-level Keras API. If we were using TensorFlow’s low-level API, it would take a lot more code.
Instead of having to write the code to build a neural network, we can just tell it to build one for us. First, we say that it’s a Sequential model. This means that it will contain a linear sequence of layers. In this case, we’ll have four layers. The first one is the InputLayer. It contains the four features of the flowers, so it contains four nodes, one for each feature.
The remaining layers are Dense layers. This means that every node in the previous layer is connected to every node in this layer. For example, the second and third layers have ten nodes each, so there are 100 connections between them, although it doesn’t show all of the nodes and connections in this diagram. The second and third layers are known as hidden layers, which is a deep neural network concept I’ll explain later. The final layer has three nodes, one for each species of iris. This is known as the output layer. When you run a particular flower through this neural network, each of the three nodes in the output layer will generate a probability that the flower is that particular species.
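If you want to check the arithmetic on those connections, here's a quick sketch in plain Python. It isn't Keras code. The parameter count assumes each Dense layer also has one bias per node, which is the Keras default, though I haven't shown that the course script leaves it on.

```python
# Nodes per layer: input, two hidden layers, output.
LAYER_SIZES = [4, 10, 10, 3]


def connections_between_layers(layer_sizes):
    """Fully connected layers: every node links to every node in the next layer."""
    return [a * b for a, b in zip(layer_sizes, layer_sizes[1:])]


def trainable_parameters(layer_sizes):
    """Weights plus one bias per node in each Dense layer (assumed default)."""
    return sum(a * b + b for a, b in zip(layer_sizes, layer_sizes[1:]))


print(connections_between_layers(LAYER_SIZES))  # [40, 100, 30]
print(trainable_parameters(LAYER_SIZES))        # 193
```

The middle number, 100, is the count of connections between the two hidden layers mentioned above.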
Now we need to configure the model. We need to tell it how to calculate the loss, how to adjust the weights, and what metrics to track. To calculate the loss, we’re going to use something called SparseCategoricalCrossentropy. That’s quite a mouthful, so let’s break it down. Cross-entropy is a common loss function for classification models. Categorical cross-entropy is what you need to use when you’re classifying into more than two categories. Since we’re classifying into three categories of irises, we need to use the categorical version. Sparse categorical cross-entropy means that the correct category (the label) is represented by a single integer rather than a one-hot vector. In our case, that would be 0, 1, or 2, which each represent a different species of iris.
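To see what the “sparse” part changes, here's a plain-Python sketch of the loss for a single flower. It assumes the model's output is already a list of probabilities. The function names and the example probabilities are made up for illustration, and this isn't how Keras implements it internally.

```python
import math


def sparse_categorical_crossentropy(label, probabilities):
    """Loss when the label is a single integer index (0, 1, or 2)."""
    return -math.log(probabilities[label])


def one_hot(label, num_classes):
    """Turn an integer label into a one-hot vector, e.g. 1 -> [0, 1, 0]."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def categorical_crossentropy(one_hot_label, probabilities):
    """Same loss, but the label is given as a one-hot vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probabilities) if t > 0)


# Hypothetical model output for one flower whose true label is 1 (versicolor).
probabilities = [0.1, 0.7, 0.2]
sparse_loss = sparse_categorical_crossentropy(1, probabilities)
dense_loss = categorical_crossentropy(one_hot(1, 3), probabilities)
print(round(sparse_loss, 4), round(dense_loss, 4))
```

Both versions give the same number. The sparse form just saves you from converting every label into a one-hot vector first.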
We also need to tell it how to adjust the weights after every pass. We do that by specifying an optimizer. In this example, we’re using one called Adam. I won’t go into how it works, but I should mention that it often converges more quickly than plain stochastic gradient descent, which is an optimizer that you’ll probably hear a lot about as you’re exploring machine learning.
Okay, now we can finally train the model. All we have to do is call the fit method and pass it the training dataset and the number of epochs. This is the number of times to run the whole training dataset through the model. So, we’re going to run all of the data through the model 200 times. For each of these 200 epochs, it runs all of the flowers through the model in batches of 32. After each batch, it calculates the average loss, updates the weights, and keeps track of its accuracy.
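Here's a rough sketch of the arithmetic behind epochs and batches. It assumes 120 training samples (80% of the 150 flowers) and that the final, smaller batch is kept rather than dropped. The helper names are just for illustration.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover every sample once."""
    return math.ceil(num_samples / batch_size)


def iter_batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


NUM_TRAIN = 120   # assumed: 80% of 150 flowers
BATCH_SIZE = 32
EPOCHS = 200

steps = batches_per_epoch(NUM_TRAIN, BATCH_SIZE)
batch_lengths = [len(b) for b in iter_batches(list(range(NUM_TRAIN)), BATCH_SIZE)]
print(steps, batch_lengths)   # 4 [32, 32, 32, 24]
print(steps * EPOCHS)         # 800 weight updates in total
```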
After it runs all of the data through 200 times, the model is trained, and it’s time to evaluate its effectiveness. We do this by running the test dataset through the model. The test data is the 20% of the original data that we held back, so the model hasn’t seen it before. This time, we run all of the test data through the model once and then print the overall accuracy of all of its predictions.
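Accuracy here is just the fraction of test flowers the model classified correctly. Here's a minimal sketch with made-up labels. The 30 test samples follow from holding back 20% of 150 flowers, and the two wrong guesses are hypothetical.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Hypothetical test set: 10 flowers of each species.
true_labels = [0] * 10 + [1] * 10 + [2] * 10
predicted_labels = list(true_labels)
predicted_labels[12] = 2   # a versicolor mistaken for virginica
predicted_labels[25] = 1   # a virginica mistaken for versicolor

print(f"{accuracy(predicted_labels, true_labels):.1%}")   # 93.3%
```

With only 30 test flowers, each wrong guess costs about 3.3 percentage points.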
If it’s a good score, then we can be satisfied with our model and use it to classify new irises in the future. The script gives an example of how you can ask the model for predictions on new flowers. It runs the model on three new irises. Since there are only three new iris samples to classify, it just hardcodes the data in the script.
This is for demonstration purposes only. You wouldn’t normally have code to ask for predictions in the same script that trained the model. Instead, you would save the trained model and load it as a prediction service. The code to save it is really simple. It just needs to know where to save it, which we specify using the job-dir argument on the command line. Later, I’ll show you how to turn the saved model into a prediction service.
Okay, now that you’ve seen what the code is supposed to do, it’s time to run it. Go to the base of the GitHub repository. You can either download the repository as a zip file or do a “git clone” if you have git installed on your computer. I’m going to do a “git clone”. Then go into “aiplatform-intro/iris/trainer”. If you downloaded the zip file, then the base directory will have a “-master” at the end of it. The script we just went through is in a file called iris.py, so run it with “python3 iris.py --job-dir export”.
It prints the loss and accuracy for every epoch. You can see how the accuracy gets better over time. Then it says that the accuracy it achieved on the test data was 93.3%, which is really good. Notice that the accuracy on the test data is slightly lower than the final accuracy on the training data. This isn’t too surprising. What you need to watch out for is if the accuracy on the test data is dramatically lower than on the training data. That’s usually a sign of overfitting, which means that the model essentially memorized the training data and doesn’t generalize to new data. There isn’t an overfitting problem with this model, though.
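If you want to turn that comparison into a quick check, it could look something like this. The 10-point threshold is an arbitrary assumption, not a rule from this course, and the training accuracy values are hypothetical.

```python
def accuracy_gap(train_accuracy, test_accuracy):
    """How much worse the model does on unseen data than on training data."""
    return train_accuracy - test_accuracy


def looks_overfit(train_accuracy, test_accuracy, max_gap=0.10):
    """Flag a model whose test accuracy trails training by more than max_gap."""
    return accuracy_gap(train_accuracy, test_accuracy) > max_gap


print(looks_overfit(0.97, 0.933))   # small gap, so not flagged
print(looks_overfit(0.99, 0.70))    # large gap, so flagged
```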
Finally, it gives the predictions for the three new iris samples. It shows the likelihood for each of the three types of irises. The one with the highest number is the model’s best guess as to which type of iris it is. Its predictions for the three new flowers are setosa, versicolor, and virginica, which is what’s expected for these examples, so the model worked.
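Picking the best guess from those likelihoods just means finding the largest one. Here's a small sketch. The probabilities are invented for illustration and aren't the script's actual output.

```python
SPECIES = ["Iris setosa", "Iris versicolor", "Iris virginica"]


def best_guess(probabilities):
    """Return the species with the highest probability, plus that probability."""
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return SPECIES[index], probabilities[index]


# Hypothetical outputs for three new flowers.
example_outputs = [
    [0.98, 0.02, 0.00],
    [0.03, 0.90, 0.07],
    [0.01, 0.24, 0.75],
]
for probabilities in example_outputs:
    print(best_guess(probabilities))
```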
Great. You’ve successfully trained a neural network to classify irises. In the next lesson, I’ll explain deep neural networks.
Guy launched his first training website in 1995 and he’s been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).