Start course
1h 2m

Move on from what you learned studying the principles of recurrent neural networks, and how they can solve problems involving sequences, with this cohesive course on Improving Performance. Learn to improve the performance of your neural networks, starting with learning curves that allow you to ask the right questions: do you need more data, or do you need a better model?

Further into the course, you will explore the fundamentals of batch normalization, dropout, and regularization.

This course also touches on data augmentation, which allows you to build new data from your existing training data, and culminates in hyperparameter optimization, a tool that helps you decide how to tune the external parameters of your network.

This course is made up of 13 lectures and three accompanying exercises. This Cloud Academy course is in collaboration with Catalit.

Learning Objectives

  • Learn how to improve the performance of your neural networks.
  • Learn the skills necessary to make executive decisions when working with neural networks.

Intended Audience


Hello and welcome to this video on dropout. In this video, we will introduce the technique called dropout and explain the effect of dropout on a fully connected network. Dropout is one of those counterintuitive techniques in machine learning, and it's also quite recent. Let's take a fully connected neural network. If this network is very large, it has a lot of parameters, and therefore the risk of overfitting is pretty high. We can reduce that risk by randomly killing some of the nodes for one training iteration and then killing some other nodes in the next iteration. More formally, this means introducing a probability p of being present for each node, and applying that probability only at training time. At test time, the node will always be present. This random disruption forces the network to learn more representative features by building redundancy into its connections.
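The mechanics described above can be sketched in a few lines of NumPy. This is an illustrative implementation of so-called "inverted" dropout (a common variant, not necessarily the exact formulation used in the course): surviving activations are rescaled by 1/p at training time so that, at test time, the nodes can simply all be present with no extra scaling. The function name and shapes here are just for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(x, p_keep, training=True):
    """Apply inverted dropout to a layer's activations x.

    p_keep is the probability that each node is retained.
    At training time, nodes are randomly zeroed and the survivors
    are scaled by 1/p_keep; at test time all nodes are present.
    """
    if not training:
        return x  # test time: every node is always present
    mask = rng.random(x.shape) < p_keep  # each node kept with probability p_keep
    return x * mask / p_keep

activations = np.ones(10)
dropped = dropout_forward(activations, p_keep=0.5)  # roughly half zeroed, rest scaled to 2.0
```

Note that a different random mask is drawn on every call, which is exactly the "kill different nodes each iteration" behavior described above.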

In other words, each node cannot completely rely on any other node in the network to correct the errors it makes, because those other nodes could randomly disappear at any point during training. Therefore, the node must learn to do its job properly. Dropout has been shown to improve the performance of a network on several datasets. Also, features learned with dropout look more specific and better defined, because each node cannot rely on other parts of the network to solve its problems. Dropout has only one hyperparameter: the probability p of retaining a node. This corresponds to the average fraction of nodes that are active in a layer during training. In this graph, we can see that for probabilities between 0.4 and 0.7 the behavior is pretty stable, with test errors smaller than when all the nodes are retained. In other words, dropping 30 to 60% of the nodes in a layer should improve the performance of that layer in a large network. In this video, we've introduced the technique of dropout and shown how it regularizes a network. We've also talked about its hyperparameter, which is the probability of retaining a node. Thank you for watching and see you in the next video.

About the Author
Learning Paths

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at University of Padua and Université de Paris VI and graduated from Singularity University summer program of 2011.