Recurrent Neural Networks
From the internals of a neural net to solving problems with neural networks, this course covers the essentials needed to succeed in machine learning.
This course moves on from cloud computing power and covers Recurrent Neural Networks. Learn how to use recurrent neural networks to train more complex models.
Understand how models are built to handle data that comes in sequences, such as unstructured text, music, and even movies.
This course consists of 9 lectures with 2 accompanying exercises.
- Understand how recurrent neural network models are built
- Learn the various applications of recurrent neural networks
- It is recommended to complete the Introduction to Data and Machine Learning course before starting.
About the Author
I am a Data Science consultant and trainer. With Catalit, I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends, I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.
Hello, and welcome to this video on time series. In this video, we will talk about time series and how to do machine learning with them. Time series are everywhere. Examples include the value of a stock on the stock market, music, which is a sequence of sounds, text, which is a sequence of words, and the events coming from your app. A video game, too, is a time series because it's a sequence of actions in an environment. In general, any quantity that is monitored over time and generates a sequence of values is a time series.
A time series is an ordered sequence of data points. A univariate time series is a sequence of single numbers. Examples of these are temperature values over time or the number of times per minute your app was downloaded. But a time series could also take values in a vector space. In this case, you would represent it with multiple univariate time series, one for each vector component. Examples of vector time series are the velocity of a car as a function of time, which has one component per spatial direction, or an audio file recorded in stereo, which has two channels.
Machine learning can be applied to time series to solve several problems, including forecasting, pattern recognition, and anomaly detection. In addition to these, sometimes preprocessing is required.
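As a minimal sketch of the two representations described above, a univariate series can be held as a plain list of numbers and a vector series as a list of tuples that is then separated into one univariate series per component. The values and the `split_components` helper are made up for illustration and are not part of the course material.

```python
# A univariate time series: one number per time step
# (hypothetical temperature readings).
temperature = [21.5, 22.0, 22.4, 21.9, 21.1]

# A vector time series: one (left, right) pair per time step,
# e.g. a few samples of a stereo audio signal (made-up values).
stereo = [(0.10, 0.08), (0.12, 0.11), (0.09, 0.07), (0.05, 0.06)]


def split_components(series):
    """Turn a vector time series into one univariate series per component."""
    return [list(component) for component in zip(*series)]


# Each channel is now its own univariate time series.
left, right = split_components(stereo)
```

The order of the values is preserved in every component, which matters because the position in the list stands for time.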
There are also specific techniques to remove noise from a time series. Let's look at some of the typical machine learning problems with time series.
Let's start with forecasting. Given a sequence of values, we want to predict the future values in the sequence. In a way, this problem is like a regression problem because we are predicting a continuous quantity using features derived from the time series. Most likely, though, this is a non-linear regression, because a sequence that a linear model could predict would be a fairly trivial one.
A second problem is that of identifying anomalies: given a sequence with regular behavior, identify where it deviates from that regularity. This problem can be approached from a supervised learning perspective, if we know the anomalies we are looking for, or, more interestingly, from an unsupervised learning perspective. In the unsupervised case, we would just train a model to forecast future values. Then we would compare the predicted values with the actual signal and flag as anomalies the locations where the model's prediction is very different from the actual signal.
Finally, we can also perform classification on time series and identify regions of recurring patterns. This, too, can be a supervised or an unsupervised learning problem.
In all these cases, we must take particular care because the data is ordered in time, and we need to avoid leaking future information into the features used by the model.
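The forecast-then-compare approach to anomaly detection can be sketched as follows. This is only an illustration under stated assumptions: a simple moving-average forecaster stands in for the trained model, and the `window` and `threshold` values, the function names, and the signal are all hypothetical.

```python
def rolling_forecast(series, window):
    """Forecast each point as the mean of the `window` values before it.

    Only values strictly before time t are used to predict time t,
    so no future information leaks into the forecast.
    """
    forecasts = []
    for t in range(window, len(series)):
        past = series[t - window:t]  # values before time t only
        forecasts.append(sum(past) / window)
    return forecasts


def find_anomalies(series, window, threshold):
    """Return the indices where the actual value is far from the forecast."""
    forecasts = rolling_forecast(series, window)
    anomalies = []
    for offset, predicted in enumerate(forecasts):
        t = offset + window  # index of the point being predicted
        if abs(series[t] - predicted) > threshold:
            anomalies.append(t)
    return anomalies


# A flat, made-up signal with a single spike at index 6.
signal = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0,
          25.0, 10.0, 10.0, 10.0, 10.0, 10.0]
anomalous_indices = find_anomalies(signal, window=3, threshold=6.0)
```

With these values, only the spike at index 6 is flagged. Note that the spike also distorts the next few forecasts, because it enters their averaging window, so the threshold has to be chosen with that in mind.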
Avoiding leakage of future information is particularly important for model validation. If we split the time series into training and test sets, we cannot just pick a random split from the time series values. We need to split the data in time, with all of the training data happening before the split and all of the test data happening after it. If we don't do this, we would be training the model with future information, which is wrong.
Finally, sometimes a trend or a periodic pattern is clearly distinguishable in our time series. This is particularly true of data related to human activity, where daily, weekly, monthly, and yearly periodicities are found. Think, for example, of retail sales. A dataset with hourly sales from a shop will have regular patterns during the day, with periods of higher and lower customer flow, as well as during the week, with days with more people and days with fewer people. Depending on the type of goods, we may find higher or lower sales during the weekend or during the week. In these cases, it is a good idea either to remove these periodicities beforehand or to add the relevant time indices, such as the day, the week, or the month, as input features too.
In conclusion, in this video we talked about time series and typical problems that we may encounter when doing machine learning on time series. We also explained that caution is needed when doing machine learning with time series, so as not to confuse past values and future values. Thank you for watching, and see you in the next video.
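The split in time and the calendar features discussed in this video can be sketched with the Python standard library. The helper names, the split date, and the hourly sales figures below are assumptions made up for illustration, not the course's own code.

```python
from datetime import datetime, timedelta


def chronological_split(records, split_time):
    """Split (timestamp, value) records so all training data precedes test data."""
    ordered = sorted(records, key=lambda record: record[0])
    train = [r for r in ordered if r[0] < split_time]
    test = [r for r in ordered if r[0] >= split_time]
    return train, test


def calendar_features(timestamp):
    """Return time indices that can be passed to a model as extra inputs."""
    return {
        "hour": timestamp.hour,
        "day_of_week": timestamp.weekday(),  # Monday is 0, Sunday is 6
        "month": timestamp.month,
    }


# Hypothetical hourly sales: 48 hours starting Monday 2024-01-01 00:00.
start = datetime(2024, 1, 1)
sales = [(start + timedelta(hours=h), 100 + h % 24) for h in range(48)]

# Everything before the split time is training data, the rest is test data.
train, test = chronological_split(sales, split_time=datetime(2024, 1, 2))
features = [calendar_features(ts) for ts, _ in train]
```

Because the split is made on the timestamp rather than at random, every training record is older than every test record, which is the property the validation discussion above asks for.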