Exercise 1: Solution
1h 19m

This course introduces convolutions and the convolutional neural networks used in Data and Machine Learning. It starts with the concept of a tensor, which is essential for everything that follows.

Learn to apply these networks to the right kind of data, such as images. Images store their information in pixels, but you will discover that it is not the value of each pixel that matters.

Learning Objectives

  • Understand how convolutional neural networks are essential to the fundamentals of Data and Machine Learning.

Intended Audience


Hey guys, welcome to the solutions of the exercises. In exercise one, we are asked to extend our MNIST model with a deeper architecture. So we load some standard things. By now you're very familiar with all of them, so we won't go through them. And here's how we are going to build the model. First of all, we load MNIST, reshape it, and rescale it. Remember, the shape of X_train is this, so we need to reshape it so that it's a tensor of order four. We also rescale it to be between zero and one. And we change our labels to categorical with 10 classes. So this, we've all seen before. So here is the model. First, we build a 2D convolutional layer with 32 filters, three by three. This time, I put the activation here, so I can show you how it's done this way. And we stack it with a two by two max pooling layer.
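A minimal sketch of the preprocessing described above, under some assumptions: the channel axis goes last (the Keras default), the pixel values are integers from 0 to 255, and the one-hot encoding is done with plain NumPy as a stand-in for Keras's `to_categorical`. The `preprocess` helper and the random stand-in data are hypothetical, not the code shown in the video.

```python
import numpy as np


def preprocess(images, labels, num_classes=10):
    """Reshape to an order-4 tensor, rescale to [0, 1], one-hot encode labels."""
    # (N, 28, 28) -> (N, 28, 28, 1): add the single grayscale channel axis.
    x = images.reshape(-1, 28, 28, 1).astype("float32")
    # Pixels are stored as integers in 0..255, so divide to land in [0, 1].
    x /= 255.0
    # Row k of the identity matrix is the one-hot vector for class k.
    y = np.eye(num_classes, dtype="float32")[labels]
    return x, y


# Stand-in data with MNIST's layout (random pixels, not real digits).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 1, 7])

x, y = preprocess(fake_images, fake_labels)
print(x.shape, y.shape)  # (5, 28, 28, 1) (5, 10)
```

With the real data, the same function would be applied to both the training and the test arrays.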

Notice that since this was the first layer, I had to specify the input shape. Then, we stack on top of this pancake another little stack, with 64 three by three filters, the same activation, relu, and a max pooling layer. So, after the max pooling, we have the flatten layer, the dense, fully connected layer, and the final output of 10. So, essentially, it's the same model we've built before; the only change is that we've introduced an additional convolutional layer and an additional max pooling layer. So, our model will look like this. We have a first convolutional layer and a second convolutional layer. Notice that we have a lot more parameters in the second one because we are going from 32 filters to 64 filters, whereas before, in the first layer, we were going from one input channel to 32 filters. Then we have a second pooling layer and, finally, a flatten, which takes us to 1600. After that we have two fully connected steps, ending with a final output of 10. Notice that the total number of parameters of this network is actually lower than that of the network that didn't have this extra layer, which is expected, as you've seen when we counted the parameters. Each convolutional layer has fewer free parameters than a corresponding fully connected layer.
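The shapes and parameter counts mentioned above can be checked by hand. This is a rough sketch, assuming unpadded convolutions with stride 1 and one bias per filter (the Keras defaults). The hidden layer width of 128 is a hypothetical value, since the transcript does not state it, and the helper names are made up for illustration.

```python
def conv_output(size, kernel=3):
    """Spatial size after an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """kernel * kernel * in_channels weights per filter, plus one bias each."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """One weight per input per unit, plus one bias per unit."""
    return inputs * units + units


size = 28
size = pool_output(conv_output(size))  # 28 -> 26 -> 13
size = pool_output(conv_output(size))  # 13 -> 11 -> 5
flattened = size * size * 64

first_conv = conv_params(in_channels=1, filters=32)
second_conv = conv_params(in_channels=32, filters=64)

hidden_units = 128  # hypothetical width, not given in the transcript
# Dense layer right after one conv/pool stack (13 x 13 x 32 inputs).
shallow_dense = dense_params(13 * 13 * 32, hidden_units)
# Dense layer after two conv/pool stacks (1600 inputs).
deep_dense = dense_params(flattened, hidden_units)

print(flattened)      # 1600
print(first_conv)     # 320
print(second_conv)    # 18496
print(shallow_dense)  # 692352
print(deep_dense)     # 204928
```

Under these assumptions, the extra convolutional layer costs 18,496 parameters, but the second pooling step shrinks the input to the dense layer enough to save far more than that.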

So, this network is more powerful with fewer parameters. So I'm running the fit, and as you can see, the model is running. It's fitting. I've run it for just two epochs, with a validation split of 0.3 and a batch size of 128, and when it's done, we'll check the performance. So the model is done, and let's check the test set. The performance on the test set is hopefully higher than what we got before. Yes, we get to 98.24% accuracy on the test set, which is the highest score I got with this model. So, very good. If we make the model more complex, or we train it for longer, we are probably going to be able to push it even higher. So I'll let you do that and see how high you can get. Feel free to post it on the forum, brag about your score, or ask for feedback if you got a lower score. Thank you for watching, see you in the next video for exercise two.
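For reference, the whole solution could look roughly like the sketch below. The layer sizes, the two epochs, the batch size of 128, and the validation split of 0.3 come from the walkthrough. The dense layer width, the dense activations, the loss, and the optimizer are not stated there, so the values used here are assumptions, and the function names are hypothetical.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(dense_units=128):
    """Two conv/pool stacks, then flatten, a dense layer, and a 10-way output."""
    model = keras.Sequential([
        # First layer, so the input shape is specified here.
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        # dense_units and both dense activations are assumptions.
        layers.Dense(dense_units, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    # Loss and optimizer are assumptions; one-hot labels pair naturally
    # with categorical cross-entropy.
    model.compile(loss="categorical_crossentropy",
                  optimizer="rmsprop",
                  metrics=["accuracy"])
    return model


def train_and_evaluate():
    """Load MNIST, train for two epochs, and return the test accuracy."""
    (X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
    # Order-4 tensors scaled to [0, 1].
    X_train = X_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
    X_test = X_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0
    # Labels as categorical with 10 classes.
    y_train_cat = keras.utils.to_categorical(y_train, 10)
    y_test_cat = keras.utils.to_categorical(y_test, 10)

    model = build_model()
    model.summary()
    model.fit(X_train, y_train_cat,
              batch_size=128, epochs=2, validation_split=0.3)
    loss, accuracy = model.evaluate(X_test, y_test_cat)
    return accuracy
```

Calling `train_and_evaluate()` downloads MNIST on first use and trains the model. The accuracy will vary from run to run and with the assumed settings, so it will not necessarily match the score quoted above.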

About the Author

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at University of Padua and Université de Paris VI and graduated from Singularity University summer program of 2011.