Vanilla RNN

From the internals of a neural net to solving problems with neural networks, this course covers the essentials needed to succeed in machine learning.

This course moves on from cloud computing power and covers Recurrent Neural Networks. Learn how to use recurrent neural networks to train more complex models.

Understand how models are built to allow us to treat data that comes in sequences. Examples of this could include unstructured text, music, and even movies.

This course comprises 9 lectures with 2 accompanying exercises.

Learning Objectives

  • Understand how recurrent neural network models are built
  • Learn the various applications of recurrent neural networks




Hello and welcome to this video on vanilla recurrent neural networks. In this video, we will introduce the simplest form of recurrent neural network, and we'll also explain how it can be made deep. As we saw, recurrent neural networks maintain an internal state by using their own output as part of the input for the next prediction. Let's see how we could implement a simple one. The simplest recurrent neural network can be viewed as a fully connected neural network if we unroll the time axis. The output value at time t is obtained with the function h sub t equals the hyperbolic tangent of w times h sub t minus one plus u times x sub t, that is, h_t = tanh(w * h_(t-1) + u * x_t).
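The recurrence just described can be sketched in a few lines of plain Python. This is only an illustration, not the course's own code: the function name `simple_rnn`, the initial state `h0 = 0.0`, and the example weights are assumptions made for the sketch.

```python
import math


def simple_rnn(xs, w, u, h0=0.0):
    """Run a univariate vanilla RNN over the sequence xs.

    w multiplies the previous output h_(t-1); u multiplies the current input x_t.
    h0 is the initial state (assumed to be 0.0 here; the lecture does not specify it).
    """
    h = h0
    outputs = []
    for x in xs:
        # h_t = tanh(w * h_(t-1) + u * x_t)
        h = math.tanh(w * h + u * x)
        outputs.append(h)
    return outputs


# Example weights chosen arbitrarily for illustration.
outputs = simple_rnn([1.0, 0.5, -1.0], w=0.5, u=1.0)
print(outputs)
```

Each output depends on the current input and, through h, on every earlier input, which is what gives the network its internal state.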

Notice that in this univariate case only two weights are involved: the weight multiplying the current input x sub t, which we called u, and the weight multiplying the previous value of the output, h sub t minus one, which we called w. By the way, doesn't this formula remind you of the exponentially weighted moving average? It's not exactly the same, because there's a tanh and the two weights are independent, but it does look similar in that it mixes past values of the output with current values of the input. And I had told you that you would encounter that formula again and again and again. You'll meet it again, promise.
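To make the comparison with the exponentially weighted moving average concrete, here is a small sketch. It assumes the usual EWMA form y_t = (1 - alpha) * y_(t-1) + alpha * x_t; the smoothing factor `alpha` and the function names are assumptions for illustration. If we drop the tanh and tie the two weights together as w = 1 - alpha and u = alpha, the recurrence reduces to the EWMA.

```python
def ewma(xs, alpha, y0=0.0):
    """Exponentially weighted moving average with smoothing factor alpha."""
    y = y0
    out = []
    for x in xs:
        y = (1 - alpha) * y + alpha * x
        out.append(y)
    return out


def linear_recurrence(xs, w, u, h0=0.0):
    """The RNN recurrence with the tanh removed: h_t = w * h_(t-1) + u * x_t."""
    h = h0
    out = []
    for x in xs:
        h = w * h + u * x
        out.append(h)
    return out


xs = [1.0, 2.0, 3.0, 2.0]
alpha = 0.3  # arbitrary value for illustration
smoothed = ewma(xs, alpha)
recurrent = linear_recurrence(xs, w=1 - alpha, u=alpha)
print(smoothed)
print(recurrent)
```

In the recurrent network, w and u are learned independently and the sum passes through a tanh, so the network is more flexible than a fixed moving average.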

Also, notice that the weights do not depend on time. So the network is learning the best values of its two weights, which are fixed in time. We can build deep recurrent neural networks by simply stacking recurrent units on top of one another. We feed the input to a first layer, to a first unit, and then feed the output of that layer to a second layer, and so on. This makes building deeper networks very easy.

The simple recurrent neural network works well only for short-term memory. But as we will see, it suffers from a fundamental problem if we try to capture longer time dependencies.

In conclusion, in this video we learned how to unroll time to build a recurrent neural network, and we built both a shallow and a deep version of such a simple network. Thank you for watching and see you in the next video.
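The stacking idea described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `rnn_layer` and `deep_rnn`, the zero initial state, and the example weights are all hypothetical. Each layer has its own pair (w, u), fixed across time steps, and the full output sequence of one layer becomes the input sequence of the next.

```python
import math


def rnn_layer(xs, w, u, h0=0.0):
    """One univariate recurrent layer: h_t = tanh(w * h_(t-1) + u * x_t)."""
    h = h0
    out = []
    for x in xs:
        h = math.tanh(w * h + u * x)
        out.append(h)
    return out


def deep_rnn(xs, layer_weights):
    """Stack recurrent layers; layer_weights is a list of (w, u) pairs."""
    seq = xs
    for w, u in layer_weights:
        # The output sequence of this layer is the input of the next one.
        seq = rnn_layer(seq, w, u)
    return seq


# Two stacked layers with arbitrary example weights.
deep_out = deep_rnn([1.0, 0.5, -1.0], [(0.5, 1.0), (0.3, 0.8)])
print(deep_out)
```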

About the Author

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.