Learning Curves Continued
1h 2m

Build on what you learned about the principles of recurrent neural networks, and how they can solve problems involving sequences, with this cohesive course on Improving Performance. Learn to improve the performance of your neural networks by starting with learning curves, which help you ask the right questions: do you need more data, or do you need to build a better model?

Further into the course, you will explore the fundamentals of batch normalization, dropout, and regularization.

This course also touches on data augmentation, which lets you build new data from your starting training data, and culminates in hyperparameter optimization, a tool that helps you decide how to tune the external parameters of your network.

This course is made up of 13 lectures and three accompanying exercises. This Cloud Academy course is in collaboration with Catalit.

Learning Objectives

  • Learn how to improve the performance of your neural networks.
  • Learn the skills necessary to make executive decisions when working with neural networks.

Intended Audience


Hello and welcome back. In this video, we're going to plot some learning curves for a dataset of digits. It's not the MNIST dataset; we're going to use a smaller dataset of digits so that our algorithms run faster and we can repeat the training many times. So, let's go and do it. We start by loading the usual packages and then we import the digits dataset from sklearn. The load_digits function loads this dataset and, as you can see, these are again black and white images, and the shape is eight by eight, so let's plot a few of them. As you can see, each one is already arranged as a vector of 64 values, but if we reshape them into eight-by-eight images, as you should be very familiar with by now, we see that we're plotting the digits one, two, three, four, five, six, seven.
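The loading and reshaping step described above can be sketched roughly as follows. This is a minimal sketch, not the notebook from the video: the helper name plot_first_digits and the choice of which digits to show are assumptions made for illustration.

```python
from sklearn.datasets import load_digits

# Load the small digits dataset that ships with scikit-learn.
digits = load_digits()
X, y = digits.data, digits.target

# Each row of X is a flat vector of 64 pixel values.
print(X.shape)

# Reshape every row back into an 8x8 image.
images = X.reshape(-1, 8, 8)
print(images.shape)


def plot_first_digits(n=8):
    """Hypothetical helper: show the first n digits side by side."""
    # Imported here so the rest of the sketch runs without matplotlib.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, n, figsize=(n, 1.5))
    for ax, image, label in zip(axes, images, y):
        ax.imshow(image, cmap="gray")
        ax.set_title(str(label))
        ax.axis("off")
    plt.show()
```

Calling plot_first_digits() in a notebook should display a row of small images with their labels as titles.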

So, it's very similar to the MNIST dataset, only the resolution of this dataset is eight by eight pixels instead of 28 by 28, which will make all of our training algorithms run much faster. We load some very familiar classes from Keras and we are good to go. We clear the session and build our first model. Our first model is going to be a fully connected model with 64 input nodes, 16 nodes in the first inner layer, and 10 output nodes with a softmax. Also, notice that we want to store the initial weights. We do this because we are going to run the training multiple times with different train sizes for the learning curve, and we want to make sure that the model gets reinitialized with the exact same weights each time. So, we get the weights from the model and we store them in a variable called initial_weights. We convert our labels to categorical and we perform our usual train_test_split with a test size of 30%. Then we create an array of train sizes, and the way we do it is by multiplying the length of X_train, the training dataset, by a linear space of four regularly spaced points between 10% and 99.9% of the total number of points. Then we convert it to integers and we have our four train sizes.
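As a rough illustration of the split and the train sizes described above, here is a minimal sketch that uses only NumPy and scikit-learn. The one-hot encoding with np.eye is a stand-in for the Keras categorical conversion, and random_state=0 is an assumption added for repeatability; neither is taken from the video.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# One-hot encode the 10 digit classes (stand-in for Keras to_categorical).
y_cat = np.eye(10)[y]

# Set aside 30% of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y_cat, test_size=0.3, random_state=0  # random_state is an assumption
)

# Four regularly spaced fractions between 10% and 99.9% of the training set,
# truncated to whole numbers of samples.
train_sizes = (len(X_train) * np.linspace(0.1, 0.999, 4)).astype(int)

print(len(X_train))
print(train_sizes)
```

The upper bound is 99.9% rather than 100%, presumably because the later split of the training set has to leave at least one sample out.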

So, just for comparison, X_train has 1,257 points, so we're going to train with 125 points, 502 points, 879 points, and then 1,255. The next step is all enclosed in this loop. I'll read the content of this loop step by step, and I'll make the window a little smaller so that you can read more easily. So, what are we doing? For train_size in train_sizes, and remember that train_sizes are these four, we perform a second train_test_split of the training set with the train_size equal to the train_size we've chosen. We call the fraction of the features that we're going to use X_train_fraction and the fraction of the labels y_train_fraction; we don't need the test parts of this split, so we discard them. But just to be clear, we performed the first train_test_split and set aside a part of our data, and now in the loop we're going to perform other train_test_splits with increasing fractions of the training set. So, we perform the train_test_split, we set the weights, we fit our model with the EarlyStopping callback and 300 epochs maximum, then we evaluate the model on the training fraction and we append the score to our initially empty list of train_scores. Then we also evaluate the model on X_test and y_test and we append the score to the test_scores list.
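Putting the steps above together, the loop might look something like the sketch below. It assumes TensorFlow's bundled Keras is installed. The ReLU activation, the Adam optimizer, the EarlyStopping settings (monitor="loss", patience=1), and random_state=0 are all assumptions, since the video does not state them.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

X, y = load_digits(return_X_y=True)
y_cat = to_categorical(y, 10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y_cat, test_size=0.3, random_state=0  # random_state is an assumption
)

# 64 inputs -> 16 hidden nodes -> 10 softmax outputs.
K.clear_session()
model = Sequential([
    Input(shape=(64,)),
    Dense(16, activation="relu"),  # activation is an assumption
    Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",  # optimizer is an assumption
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Remember the starting weights so every run begins from the same point.
initial_weights = model.get_weights()

train_sizes = (len(X_train) * np.linspace(0.1, 0.999, 4)).astype(int)
train_scores, test_scores = [], []

for train_size in train_sizes:
    # Keep only train_size samples of the training set; discard the rest.
    X_train_frac, _, y_train_frac, _ = train_test_split(
        X_train, y_train, train_size=int(train_size), random_state=0
    )

    # Reset the layer weights (optimizer state is not reset by this call).
    model.set_weights(initial_weights)

    model.fit(X_train_frac, y_train_frac, epochs=300, verbose=0,
              callbacks=[EarlyStopping(monitor="loss", patience=1)])

    # evaluate returns [loss, accuracy]; keep the accuracy.
    train_scores.append(model.evaluate(X_train_frac, y_train_frac, verbose=0)[-1])
    test_scores.append(model.evaluate(X_test, y_test, verbose=0)[-1])
```

Plotting train_scores and test_scores against train_sizes then gives the two learning curves.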

So, let's do it. It'll take a while. So, training is done. Let's look at the results. This is our test score as a function of the amount of data we're using for training, and this is our training score. As you can see, the training score is already perfect, it's 100%, but the test score goes up significantly as we go from the first small train size to the larger 500-point size. So, there's probably still room for improvement, and if I had to make a choice between improving the model and getting more data, I would opt for getting more data in this case, because the test curve still seems to be rising. So, this is how we use a learning curve. It's a very useful tool, so make sure you use it. Sklearn also provides a learning curve function that you may want to experiment with. It does not play very well with Keras and that's why we didn't use it this time, but check it out and tell us what you found. Thank you for watching and see you in the next video.
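For anyone who wants to try the sklearn function mentioned above, here is a minimal sketch that uses it with a plain scikit-learn estimator instead of a Keras model. The logistic regression pipeline, the three-fold cross-validation, and the train size fractions are assumptions chosen for illustration, not something shown in the video.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)

# Stand-in estimator: scaled logistic regression rather than a neural network.
estimator = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# learning_curve handles the repeated splitting, fitting, and scoring.
sizes, train_scores, test_scores = learning_curve(
    estimator, X, y,
    train_sizes=np.linspace(0.1, 1.0, 4),
    cv=3,
    shuffle=True,
    random_state=0,
)

# One row per train size, one column per cross-validation fold.
mean_train = train_scores.mean(axis=1)
mean_test = test_scores.mean(axis=1)

for n, tr, te in zip(sizes, mean_train, mean_test):
    print(f"{n:5d} samples  train={tr:.3f}  test={te:.3f}")
```

Unlike the manual loop, this version scores each train size on several cross-validation folds, so the curves are averages rather than single runs.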

About the Author

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.