Batch Normalization
1h 2m

Build on what you learned about the principles of recurrent neural networks, and how they can solve problems involving sequences, with this cohesive course on Improving Performance. Learn to improve the performance of your neural networks, starting with learning curves that help you answer the right questions, such as whether you need more data or a better model.

Further into the course, you will explore the fundamentals of batch normalization, dropout, and regularization.

This course also touches on data augmentation, which lets you build new data from your starting training data, and culminates in hyperparameter optimization, a tool that helps you decide how to tune the external parameters of your network.

This course is made up of 13 lectures and three accompanying exercises. This Cloud Academy course is in collaboration with Catalit.

Learning Objectives

  • Learn how to improve the performance of your neural networks.
  • Learn the skills necessary to make executive decisions when working with neural networks.

Intended Audience


Hello and welcome to this video on batch normalization. In this video we will introduce batch normalization, which is a very powerful technique to regularize our models. Batch normalization is a technique that reduces the chances of overfitting by rescaling the features between one layer and the next in a deep network. Let's see how it works. First, we start with the values of the features coming from a particular batch of training data. These could be the values of the activations after the first layer or after the second layer, or even the raw features themselves. We calculate the mean of these features across the batch, and we calculate their variance.
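Written out, the two batch statistics described above could look like the following. The symbols are assumptions introduced here for illustration: m is the number of examples in the batch and x_i is the value of one feature for example i.

```latex
\mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i
\qquad
\sigma_B^2 = \frac{1}{m} \sum_{i=1}^{m} \left( x_i - \mu_B \right)^2
```

Each feature gets its own mean and variance, computed over the examples in the current batch.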

Then we rescale the features by subtracting the mean and dividing by the standard deviation. Notice that we add a small constant, epsilon, to avoid division by zero. Finally, we scale and shift the normalized features with two numbers, gamma and beta. These are learned by the network during training. Batch normalization is a very recent technique, and it has been shown to improve the performance of very large networks by a significant amount. In the coding session we will see that it also helps training to converge faster. In this video, we've seen that batch normalization enables higher learning rates, that it regularizes the model, preventing overfitting, and that it improves the overall accuracy of the model. Thank you for watching and see you in the next video.
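The steps described in this video can be sketched in a few lines of NumPy. This is a minimal illustration of the training-time computation only, not the code used in the course: the function name `batch_norm_forward`, the default `eps=1e-5`, and the sample data are all assumptions made for the example.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a batch of features, then scale and shift.

    x:     array of shape (batch_size, num_features)
    gamma: per-feature scale, shape (num_features,)
    beta:  per-feature shift, shape (num_features,)
    eps:   small constant to avoid division by zero (assumed default)
    """
    mean = x.mean(axis=0)                      # per-feature mean over the batch
    var = x.var(axis=0)                        # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)    # subtract mean, divide by std
    return gamma * x_hat + beta                # scale and shift

# Hypothetical batch: 32 examples, 4 features, far from zero mean / unit variance
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))

# In a real network gamma and beta are learned; here they are fixed
gamma = np.ones(4)
beta = np.zeros(4)

out = batch_norm_forward(x, gamma, beta)
```

With gamma set to one and beta set to zero, each output feature has a mean close to zero and a standard deviation close to one across the batch. During training the network would adjust gamma and beta, so the layer can learn whatever scale and offset suits it best.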

About the Author

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.