Training Your First Neural Network
Machine learning is a hot topic these days and Google has been one of the biggest newsmakers. Recently, Google’s AlphaGo program beat the world’s No. 1 ranked Go player. That’s impressive, but Google’s machine learning is being used behind the scenes every day by millions of people. When you search for an image on the web or use Google Translate on foreign language text or use voice dictation on your Android phone, you’re using machine learning. Now Google has launched Cloud Machine Learning Engine to give its customers the power to train their own neural networks.
If you look in Google’s documentation for Cloud Machine Learning Engine, you’ll find a Getting Started guide. It walks through the various things you can do with ML Engine, but it says that you should already have experience with machine learning and TensorFlow. Those are two very advanced subjects, which normally take a long time to learn, but I’m going to give you enough of an overview that you’ll be able to train and deploy machine learning models using ML Engine.
This is a hands-on course where you can follow along with the demos using your own Google Cloud account or a trial account.
Learning Objectives
- Describe how an artificial neural network functions
- Run a simple TensorFlow program
- Train a model using a distributed cluster on Cloud ML Engine
- Increase prediction accuracy using feature engineering and both wide and deep networks
- Deploy a trained model on Cloud ML Engine to make predictions with new data
The GitHub repository for this course is at https://github.com/cloudacademy/mlengine-intro.
Nov. 16, 2018: Updated 90% of the lessons due to major changes in TensorFlow and Google Cloud ML Engine. All of the demos and code walkthroughs were completely redone.
About the Author
Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).
Machine learning is a hot topic these days. It’s constantly in the news. Here’s a small sample of some recent headlines.
It may be hard to believe that machine learning could be transforming everything from cybersecurity to customer service to music, but its benefits really are being explored in nearly every industry imaginable. All of this excitement makes machine learning sound almost magical. But if you look under the hood, it’s actually a rather simple concept, although implementing it can be very complex.
The idea is that you feed lots of real-world data into a program and the program tries to make generalizations about the data. It then uses these generalizations to make predictions when it’s given new data. For example, after looking at lots of x-rays of patients with and without cancer, it can then analyze an x-ray of a new patient and predict whether or not the patient has cancer. Or, for a less serious example, it can look at movies you’ve watched in the past and predict which new movies you would like to watch now.
Quite often, machine learning is used for a task that doesn’t really sound like prediction, but it still is, in a way. For example, it could be used to look at a picture and say whether or not the picture contains a cat. That sounds more like identifying or classifying an object, but from the machine’s point of view, it’s a prediction because it doesn’t know for certain whether or not the picture contains a cat. Maybe “guess” is a better word for what it’s doing in this case.
Okay, so you feed the program data and it tries to be a good guesser. So, how does it do that? There are many different approaches, but the one we’re going to focus on in this course is the neural network.
Suppose you work in the property tax department of a city and you want to create a machine learning model that will predict how much a particular house would sell for, so you can make an accurate assessment of the home’s value. First, you have to decide which “features” of the home are important. Square footage is obviously important, but there are many other factors that will affect its value, such as the age of the home, how close it is to schools, whether it has a swimming pool or not, etc.
You should also decide which features are not important. For example, you might think that the color of the house doesn’t matter. However, some studies suggest it can have a minor impact on selling price, so deciding which features to leave out can be trickier than it seems.
You might want to include every feature, just in case it helps with the prediction, but the more features you use, the longer it will take for the neural network to experiment with them all.
Alright, so you’ve chosen your features. Now what does a neural network actually do with them? The simplest neural network works like this. Each feature of the house is assigned to a neuron (also known as a node). Then you apply a weight to each of the nodes, which says how important that feature is when determining the house price. You could make an initial guess for each weight or you could just assign random weights.
After that, the network inputs the first example home’s features and multiplies each feature by its weight. Then the output node adds everything up, and compares its prediction to the actual selling price of this home. Then it does the same thing with a bunch of other example houses and calculates the average error for all of them. Now, it tries to adjust the weights so it will get a lower error.
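To make that concrete, here’s a minimal sketch in plain Python (not TensorFlow) of a single pass. The example homes, the choice of features, and the starting weights are all made up for illustration. Measuring the error as the average absolute difference is also an assumption, since the lesson doesn’t name a specific error formula.

```python
# Hypothetical example homes: ([square feet, age in years, has pool], sale price).
# All numbers are invented for illustration.
homes = [
    ([1500.0, 10.0, 0.0], 250_000.0),
    ([2000.0, 5.0, 1.0], 340_000.0),
    ([1200.0, 30.0, 0.0], 180_000.0),
]

# One weight per feature. These are an initial guess, not learned values.
weights = [150.0, -1000.0, 20_000.0]


def predict(features, weights):
    """Output node: multiply each feature by its weight and add everything up."""
    return sum(f * w for f, w in zip(features, weights))


def average_error(homes, weights):
    """Average absolute difference between predicted and actual prices."""
    errors = [abs(predict(features, weights) - price) for features, price in homes]
    return sum(errors) / len(errors)


print(predict(homes[0][0], weights))
print(average_error(homes, weights))
```

With these made-up weights, the first home is predicted at $215,000 against an actual price of $250,000, and the average error across the three homes is $30,000. That average is the number the network then tries to push down.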
Here’s the whole process. It sets the initial weights and runs a batch of homes through them. Then it adjusts the weights to try to minimize the error, and runs another batch of homes through the new weights to see what happens to the error. It keeps going through this loop as many times as you tell it to, and hopefully, it will come up with a combination of weights that minimizes the average error. By the way, you’ll usually see the average error referred to as the loss, but it’s the same thing.
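Here’s a rough sketch of that loop in plain Python. The lesson doesn’t say how the weights get adjusted, so this sketch assumes gradient descent on the average squared error, which is one common approach. The homes, the learning rate, and the number of steps are hypothetical values chosen so the example settles quickly.

```python
# Hypothetical homes: features are [thousands of square feet, bedrooms],
# prices are in thousands of dollars. All values are invented.
homes = [
    ([1.0, 2.0], 140.0),
    ([2.0, 3.0], 260.0),
    ([1.5, 1.0], 170.0),
    ([3.0, 4.0], 380.0),
]


def predict(features, weights):
    """Weighted sum of the features (no bias in this sketch)."""
    return sum(f * w for f, w in zip(features, weights))


def loss(homes, weights):
    """Average squared error over the batch."""
    return sum((predict(x, weights) - y) ** 2 for x, y in homes) / len(homes)


def train(homes, weights, learning_rate=0.05, steps=5000):
    """Repeat: run the batch, measure the error, nudge the weights to reduce it."""
    weights = list(weights)
    for _ in range(steps):
        # Gradient of the loss with respect to each weight.
        gradients = [0.0] * len(weights)
        for x, y in homes:
            error = predict(x, weights) - y
            for j, feature in enumerate(x):
                gradients[j] += 2 * error * feature / len(homes)
        # Move each weight a small step in the direction that lowers the loss.
        weights = [w - learning_rate * g for w, g in zip(weights, gradients)]
    return weights


initial_weights = [0.0, 0.0]
trained_weights = train(homes, initial_weights)
print(loss(homes, initial_weights), loss(homes, trained_weights))
print(trained_weights)
```

The prices in this toy data were generated as 100 times the first feature plus 20 times the second, so the loop should end up very close to those two weights. Real data is never this tidy, which is why the loss usually levels off above zero.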
Once it’s done going through this process, it checks its final weights against a new set of homes to verify the accuracy of this model.
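That final check might look something like the sketch below. The final weights and the held-out homes are hypothetical, and using the average absolute error as the accuracy measure is an assumption. The key point is that none of these homes were used while adjusting the weights.

```python
# Hypothetical final weights for [thousands of square feet, bedrooms].
final_weights = [100.0, 20.0]

# Homes the model never saw during training (invented values).
# Prices are in thousands of dollars.
test_homes = [
    ([2.5, 3.0], 300.0),
    ([1.0, 1.0], 130.0),
    ([4.0, 5.0], 510.0),
]


def predict(features, weights):
    """Weighted sum of the features."""
    return sum(f * w for f, w in zip(features, weights))


def mean_absolute_error(homes, weights):
    """Average absolute gap between predicted and actual prices."""
    return sum(abs(predict(x, weights) - y) for x, y in homes) / len(homes)


test_error = mean_absolute_error(test_homes, final_weights)
print(test_error)
```

If the error on the new homes is much higher than the error on the homes used for training, that’s a sign the model’s generalizations don’t carry over to new data.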
Most explanations of neural networks are far more complicated than this, and of course, most neural networks really are more complicated than this, but the basic idea is still the same.
I should point out one more detail, though. Most neural networks include something called “bias”. With bias, the output of a node is not just the sum of each value multiplied by its weight. It’s that sum plus a constant, the bias. Why would you need to add a bias? Well, let’s say that we had a really simple model that only looks at the square footage feature. Suppose that 1,000-square-foot homes in a city are worth $200,000 and 10,000-square-foot homes are worth $1.1 million.
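If you tried to model this relationship with just a weight times the square footage, it wouldn’t work, because that line has to pass through the origin (0, 0) and no single weight fits both prices: $200 per square foot matches the smaller home but predicts $2 million for the larger one. To get the correct line on the graph, you have to add a constant. That constant is the bias in the equation.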
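You can work out the weight and bias for this two-home example by hand. The short sketch below does the arithmetic in Python, using only the two prices given above. The function names are just for illustration.

```python
# The two data points from the example: (square feet, price in dollars).
small_home = (1_000, 200_000)
large_home = (10_000, 1_100_000)

# Slope of the line through both points: the weight on square footage.
weight = (large_home[1] - small_home[1]) / (large_home[0] - small_home[0])

# The constant that shifts the line so it passes through both points: the bias.
bias = small_home[1] - weight * small_home[0]


def price_with_bias(square_feet):
    """Weight times the feature, plus the bias."""
    return weight * square_feet + bias


def price_without_bias(square_feet, w):
    """Weight times the feature only, so the line is forced through (0, 0)."""
    return w * square_feet


print(weight, bias)
print(price_with_bias(1_000), price_with_bias(10_000))
print(price_without_bias(10_000, 200.0))
```

The weight comes out to $100 per square foot and the bias to $100,000, which reproduces both prices. Without the bias, a weight of $200 per square foot fits the small home but overshoots the large one by $900,000.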
When the machine learning algorithm adjusts the weights to try to reduce the average error, it also adjusts the bias.
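As a sketch of what that looks like, the loop below fits the two homes from the square footage example and updates the bias right alongside the weight. Gradient descent on the average squared error is an assumed method here, not something the lesson specifies. The units are rescaled to thousands of square feet and thousands of dollars, and the learning rate and step count are hypothetical values picked to keep the loop stable.

```python
# The two homes from the example, rescaled to keep the numbers small:
# (thousands of square feet, price in thousands of dollars).
homes = [(1.0, 200.0), (10.0, 1100.0)]


def train(homes, learning_rate=0.01, steps=20000):
    """Adjust the weight and the bias together to reduce the average squared error."""
    weight, bias = 0.0, 0.0
    n = len(homes)
    for _ in range(steps):
        # Prediction error for each home with the current weight and bias.
        errors = [(weight * x + bias) - y for x, y in homes]
        # The weight's gradient scales each error by the feature value;
        # the bias's gradient is just the average error (times 2).
        weight_gradient = sum(2 * e * x for e, (x, _) in zip(errors, homes)) / n
        bias_gradient = sum(2 * e for e in errors) / n
        weight -= learning_rate * weight_gradient
        bias -= learning_rate * bias_gradient
    return weight, bias


weight, bias = train(homes)
print(weight, bias)
```

In these rescaled units, both values should settle near 100, which corresponds to $100 per square foot and a $100,000 bias.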
Okay, now we’re ready to build an actual neural network using TensorFlow in the next lesson.