Training Your First Neural Network
Scaling Up with ML Engine
Machine learning is a hot topic these days and Google has been one of the biggest newsmakers. Recently, Google’s AlphaGo program beat the world’s No. 1 ranked Go player. That’s impressive, but Google’s machine learning is being used behind the scenes every day by millions of people. When you search for an image on the web or use Google Translate on foreign language text or use voice dictation on your Android phone, you’re using machine learning. Now Google has launched Cloud Machine Learning Engine to give its customers the power to train their own neural networks.
If you look in Google’s documentation for Cloud Machine Learning Engine, you’ll find a Getting Started guide. It gives a walkthrough of the various things you can do with ML Engine, but it says that you should already have experience with machine learning and TensorFlow first. Those are two very advanced subjects, which normally take a long time to learn, but I’m going to give you enough of an overview that you’ll be able to train and deploy machine learning models using ML Engine.
This is a hands-on course where you can follow along with the demos using your own Google Cloud account or a trial account.
- Describe how an artificial neural network functions
- Run a simple TensorFlow program
- Train a model using a distributed cluster on Cloud ML Engine
- Increase prediction accuracy using feature engineering and both wide and deep networks
- Deploy a trained model on Cloud ML Engine to make predictions with new data
- The GitHub repository for this course is at https://github.com/cloudacademy/mlengine-intro.
- Nov. 16, 2018: Updated 90% of the lessons due to major changes in TensorFlow and Google Cloud ML Engine. All of the demos and code walkthroughs were completely redone.
In the last lesson, we used the DNNClassifier to create a deep neural network. So, what makes a deep neural network different from a regular one and why would you need to use it? Although “deep learning” sounds like it must be a complex, almost mystical concept, really it just means that the neural network has more than three layers, like this.
The layers in between the outside ones are called hidden layers because their inputs and outputs are not visible. Hidden layers are useful because they allow the network to combine features to recognize higher-level patterns.
Returning to our home value estimator, suppose that in some neighborhoods, older homes are more highly valued than newer homes, but in other neighborhoods, newer homes are more highly valued than older homes. In a two-layer neural network, all of the features are independent of each other, so there’s no way to combine a home’s age and its neighborhood to determine how much the value should increase or decrease.
Well, actually there is a way, but it requires a lot of insight on the part of the person building the network. You could create new features that are combinations of the original features and include them at the input layer. We’ll go over this approach in another lesson.
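To make that manual approach concrete, here is a minimal sketch of crossing a home's neighborhood with an age bucket to form one combined feature. The feature names and bucket boundaries are made up for illustration, not taken from the course code; in TensorFlow, `tf.feature_column.crossed_column` automates this idea (using hashing).

```python
# A minimal sketch of manual feature crossing for the home value example.
# The feature names and age buckets are illustrative assumptions,
# not part of the course's actual code.

def age_bucket(age):
    """Coarsen a home's age into a small set of buckets."""
    if age < 10:
        return "new"
    elif age < 40:
        return "mid"
    else:
        return "old"

def cross_features(home):
    """Combine neighborhood and age bucket into one crossed feature,
    so even a model with no hidden layers can learn a separate
    weight for each neighborhood/age combination."""
    return f"{home['neighborhood']}_x_{age_bucket(home['age'])}"

homes = [
    {"neighborhood": "riverside", "age": 75},
    {"neighborhood": "hillcrest", "age": 5},
]
crossed = [cross_features(h) for h in homes]
print(crossed)  # ['riverside_x_old', 'hillcrest_x_new']
```

The downside, as noted above, is that you have to know in advance which combinations matter.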
The great thing about deep networks is that they can often discover these relationships for you. That’s because each node in a hidden layer combines the outputs from all of the nodes in the previous layer in different ways.
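As a rough sketch of what "combines the outputs from all of the nodes in the previous layer" means, here is the computation a single hidden-layer node performs: a weighted sum of every previous-layer output plus a bias, passed through an activation function (ReLU here). The weights and inputs are made-up numbers purely for illustration.

```python
# One hidden-layer node: it combines ALL outputs of the previous layer.
# Weights, bias, and inputs below are made-up illustrative numbers.

def relu(x):
    """Rectified linear unit: a common activation function."""
    return max(0.0, x)

def hidden_node(prev_outputs, weights, bias):
    """Weighted sum of every previous-layer output, plus a bias,
    passed through a ReLU activation."""
    total = sum(o * w for o, w in zip(prev_outputs, weights))
    return relu(total + bias)

prev_layer = [0.5, 0.2, 0.9]  # outputs from the previous layer
node_out = hidden_node(prev_layer, [0.4, -0.6, 0.3], 0.1)
print(node_out)  # approximately 0.45
```

Because each hidden node gets its own set of weights, different nodes can emphasize different combinations of the previous layer's outputs, which is what lets them act as feature detectors.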
The result is that a node in a hidden layer can potentially become a “feature detector”. There’s a nice illustration of this on the deeplearning4j website. It shows a neural network that’s trying to detect a face in an image. In the first layer, it detects edges. In the second layer, it combines these edges to detect simple shapes. In the third layer, it combines the simple shapes to detect facial shapes. If you had a network with no hidden layers, then it would have to try to detect a face based on only the individual pixels in the image, which would be much more difficult.
The beauty of this is that it discovers these features itself without any guidance from the person building the neural network. This is what can make deep learning seem almost magical.
Now, going back to the iris classifier, let’s have a look at how its hidden layers were defined. The “hidden_units” argument is set to 10, 20, 10, which means there are 10 nodes in the first hidden layer, 20 in the second, and 10 in the third. The network looks like this. I didn’t draw the lines in between the nodes because it would have been 470 lines. I hope you don’t mind.
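You can verify that count of 470 yourself: the iris classifier has 4 input features and 3 output classes, so with hidden layers of 10, 20, and 10 nodes, the number of fully connected edges is the sum of the products of each pair of adjacent layer sizes.

```python
# Count the fully connected edges in the iris network:
# 4 inputs -> 10 -> 20 -> 10 hidden nodes -> 3 output classes.
layer_sizes = [4, 10, 20, 10, 3]
edges = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
print(edges)  # 470  (4*10 + 10*20 + 20*10 + 10*3)
```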
So, how much of a difference do the hidden layers make? Well, the easiest way to find out is to remove the hidden layers and see how much lower the accuracy is. Change the DNNClassifier to the LinearClassifier. The two classifiers are essentially the same except that LinearClassifier doesn’t have any hidden layers. Then remove the “hidden_units” line. Now run it again.
The accuracy is exactly the same. How can that be? Well, this particular classification problem is so simple that it doesn’t require hidden layers to discover new features. That’s not usually the case with more complex classification problems, but it does show that hidden layers won’t always improve your models.
And that’s it for this lesson.
About the Author
Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).