Learn about the importance of gradient descent and backpropagation, under the umbrella of Data and Machine Learning, from Cloud Academy.
From the internals of a neural net to solving problems with neural networks, this course expertly covers the essentials needed to succeed in machine learning.
Learning Objectives
- Understand the importance of gradient descent and backpropagation
- Be able to build your own neural network by the end of the course
Prerequisites
- It is recommended to complete the Introduction to Data and Machine Learning course before starting.
Hey guys, welcome back. In this video, I'm going to show you how to visualize the activations of the inner layers. This is something that is very powerful: you can use it as a way to understand what your model is doing, but you can also use it for dimensionality reduction of your data set, for example. We're still dealing with the banknotes data set, so we have four features in input. Notice that this time we are building a deep model: we have two nodes in the first inner layer and then our output node. We fit the model for 20 epochs with verbose equal to 1 and a validation split of 0.3. Our validation accuracy is improving, and when we print our final score it's 99% on the test set. So, our model has 13 parameters and two layers: one inner layer that connects the four inputs to two inner nodes, and a second layer that connects those two inner nodes to the output node. Both of these are Dense layers.
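As a reference, here is a minimal sketch of what a model like the one described might look like, assuming the standard Keras Sequential API; variable names such as X_train and y_train, and the choice of optimizer, are assumptions rather than the exact course notebook:

```python
# Hypothetical sketch: 4 input features -> 2-node inner layer -> 1 output node.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(2, input_shape=(4,), activation='relu'))   # inner layer: 4*2 + 2 = 10 parameters
model.add(Dense(1, activation='sigmoid'))                   # output layer: 2*1 + 1 = 3 parameters
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

model.summary()  # 13 parameters in total
model.fit(X_train, y_train, epochs=20, verbose=1, validation_split=0.3)
```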
Now, what we are going to do is take the first layer, this guy, assign its input to a variable called "inp", and do the same with the output of that layer. So basically we're taking the four inputs, and the output is going to be the result of the "relu" activation function that we defined here, and we assign those two to variables. These are now TensorFlow tensors. Remember, the backend takes care of instantiating these, so all the objects are actually TensorFlow operations and tensors: the input tensor has shape (however many points we want, 4 input features), and the output tensor is "dense_1/Relu" (the 0 stands for the tensor that comes out of that operation) and it has two values. And here's where the magic is happening, this is the trick: we define this features function using K, which stands for backend, as a function between our input and our output. So we've defined this function; it's a Keras backend function, here backed by TensorFlow.
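In code, this step looks roughly like the following (the names out and features_function are assumptions about the notebook's variable names):

```python
import keras.backend as K

# Input and output tensors of the first Dense layer.
inp = model.layers[0].input    # shape (None, 4): the four input features
out = model.layers[0].output   # "dense_1/Relu:0", shape (None, 2): the relu activations

# K.function builds a backend (TensorFlow) function mapping inputs to outputs.
features_function = K.function([inp], [out])
```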
And the cool thing is that now we can apply this function to any data. So we apply this function to X_test and see what happens: this generates a nested array, of which we only care about the first element. This is an array with as many rows as our input data set, so 412 for the test set, and two values per row. In other words, we're reading the values of the two nodes in the inner layer. We store that into "features", and what we can see is how our network is learning internally to separate the banknotes of one type from the other type. This is pretty cool: in only five epochs our network learned to separate these. Let's see: I take the features, first column and second column, and I plot them, coloring the points by their labels. Okay, great. I'm going to reset the model and build an even deeper model now: it's going to have three nodes in the first inner layer, then two nodes in the second inner layer, and then one node in the output layer.
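A sketch of this step, assuming X_test and y_test are the test features and labels (a flat array of 0/1 classes) from the notebook:

```python
import matplotlib.pyplot as plt

# Apply the backend function to the test set; it returns a list, and we keep
# the first element: an array of shape (412, 2) with the activations of the
# two inner nodes for every test point.
features = features_function([X_test])[0]

# Scatter the two inner-node activations, colored by the true label.
plt.scatter(features[:, 0], features[:, 1], c=y_test, cmap='coolwarm')
plt.xlabel('Node 1 activation')
plt.ylabel('Node 2 activation')
plt.show()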
And I'm going to again take the input of the first layer of this model and the output of the second layer, this layer here. So my function is again going to be a function between four input features and two output features, only this time the function will go through the inner layer with three nodes as well. I define the features function, and then I loop through the epochs. See, I'm doing just one epoch of fitting at a time, because I want to see the evolution of the training, how the inner layer is changing its understanding of the features. And I'll print the test accuracy as the title, so let's see. In this case, the way the model understood the features at that inner layer, at the beginning the two classes were overlapping, but over time... epoch three, epoch four, epoch five... epoch six, epoch seven, epoch eight... the model is basically learning to separate the two really well. Now, if we re-run this, the figures will look different because of the random initialization, so let's reinitialize the model.
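A rough sketch of the deeper model and the epoch-by-epoch visualization loop, again with the same assumed variable names and optimizer:

```python
# Deeper model: 4 inputs -> 3 nodes -> 2 nodes -> 1 output.
model = Sequential()
model.add(Dense(3, input_shape=(4,), activation='relu'))
model.add(Dense(2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# Function from the 4 input features to the 2 activations of the second inner layer.
inp = model.layers[0].input
out = model.layers[1].output
features_function = K.function([inp], [out])

for epoch in range(1, 9):
    # Train for a single epoch, then look at how the inner representation has moved.
    model.fit(X_train, y_train, epochs=1, verbose=0)
    test_acc = model.evaluate(X_test, y_test, verbose=0)[1]
    features = features_function([X_test])[0]
    plt.scatter(features[:, 0], features[:, 1], c=y_test, cmap='coolwarm')
    plt.title('Epoch: {}, Test accuracy: {:.3f}'.format(epoch, test_acc))
    plt.show()
```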
Okay, so it's running again and this will look different. Hopefully, and... yeah! Again, the model is learning to separate the two classes in this inner space. I'll do it once more, just to see that it reaches a different conclusion again, but it's still going to learn. So, in other words, the representation at the inner layers is useful to see how the model is understanding our data. Alright, our model has learned to really separate one class from the other, and as you can see, it does this consistently, over and over again. So I invite you to change the model: change the number of nodes, change the number of layers, and again define a new function that reads the values out of the layer with two nodes, then plot them and see what happens.
See what the results look like. I hope you had fun with this little trick for visualizing the activations of inner layers. We're going to be using it in the exercises, so thank you for watching and see you in the next video.
I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.