Getting Started With Deep Learning: Working With Data: Gradient Descent

Learning Rate

Developed with Catalit
Overview
Difficulty: Beginner
Duration: 1h 45m
Students: 113

Description

Learn about the importance of gradient descent and backpropagation, under the umbrella of Data and Machine Learning, from Cloud Academy.

From the internals of a neural net to solving problems with neural networks, this course covers the essentials needed to succeed in machine learning.

Learning Objectives

  • Understand the importance of gradient descent and backpropagation
  • Be able to build your own neural network by the end of the course

Prerequisites

 

Transcript

Hello, and welcome to this video on the learning rate. When we perform our update step, we update the value of w to w minus the derivative of the cost. Doing this, we move by a quantity determined by the rate of change of the cost with respect to the weight, w. This can be a problem in two ways. If the cost function is very flat, we will move very, very slowly towards the minimum; conversely, if the function is very steep, we might end up jumping beyond the minimum. The solution to both problems is to introduce a tuneable knob that allows us to decide how big a step to take: we multiply the derivative by this factor before subtracting it from w. This is called the learning rate, and it is usually indicated with the letter alpha. If we choose a small learning rate, we will move by tiny steps.
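The update step described above can be sketched in a few lines of Python. This is only an illustration, not code from the course: the quadratic cost, its minimum at w = 3, the starting point, and the names `cost`, `cost_gradient`, and `update_step` are all assumptions chosen for the example.

```python
def cost(w):
    """Hypothetical quadratic cost with its minimum at w = 3 (an assumption)."""
    return (w - 3.0) ** 2


def cost_gradient(w):
    """Derivative of the hypothetical cost with respect to the weight w."""
    return 2.0 * (w - 3.0)


def update_step(w, alpha):
    """One gradient descent update: w becomes w minus alpha times the derivative."""
    return w - alpha * cost_gradient(w)


w = 0.0        # arbitrary starting weight
alpha = 0.1    # learning rate, the tuneable knob
history = [w]  # weights visited, starting point included
for _ in range(5):
    w = update_step(w, alpha)
    history.append(w)

print(history)
```

With alpha set to 1 the update reduces to the plain "w minus the derivative" step; smaller values shrink every step by the same factor.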

If we choose a large learning rate, we will move by large steps. However, we must be very careful, because if the learning rate is too large, we will actually run away from the solution. At each new step, we move in the direction of the minimum, but since the step is too large, we overshoot and go beyond the minimum, at which point we reverse course and repeat, going further and further away. In this video, we've introduced the tuneable parameter called the learning rate, and we've learnt that it's necessary to tune it in order for our algorithm to converge optimally to the minimum cost. Thank you for watching and see you in the next video.
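To make the three regimes concrete, here is a minimal sketch that runs the same number of updates with a small, a moderate, and a too-large learning rate. The quadratic cost (w - 3)^2, the specific alpha values, and the helper name `run_gradient_descent` are assumptions picked for illustration, not values from the course.

```python
def run_gradient_descent(alpha, w_start=0.0, n_steps=20):
    """Apply n_steps gradient descent updates on the hypothetical cost (w - 3)^2."""
    w = w_start
    for _ in range(n_steps):
        gradient = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w = w - alpha * gradient    # step scaled by the learning rate
    return w


minimum = 3.0  # location of the minimum of the assumed cost

# Distance from the minimum after 20 steps for each (assumed) learning rate.
small = abs(run_gradient_descent(0.01) - minimum)     # tiny steps, slow progress
moderate = abs(run_gradient_descent(0.4) - minimum)   # converges quickly
too_large = abs(run_gradient_descent(1.1) - minimum)  # overshoots more each step

print(small, moderate, too_large)
```

For this particular cost, each update multiplies the distance to the minimum by (1 - 2 * alpha), so any alpha above 1 makes that factor larger than 1 in magnitude and the iterates bounce from side to side while moving further away.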

About the Author

Students: 745
Courses: 8
Learning paths: 3

I am a Data Science consultant and trainer. With Catalit, I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends, I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program in 2011.