Evaluation Performance: Screenflow
Machine learning is a branch of artificial intelligence that deals with learning patterns and rules from training data. In this course from Cloud Academy, you will learn all about its structure and history. Its origins date back to the middle of the last century, but in the last decade, companies have taken advantage of the resource for their products. This revolution of machine learning has been enabled by three factors.
First, memory storage has become affordable and accessible. Second, computing power has also become readily available. Third, sensors, phones, and web applications have produced a lot of data, which has contributed to training these machine learning models. This course will guide you through the basic principles, foundations, and best practices of machine learning. It is advisable to be able to understand and explain these basics before diving into deep learning and neural nets. This course is made up of 10 lectures and two accompanying exercises with solutions. This Cloud Academy course is part of the wider Data and Machine Learning learning path.
- Learn about the foundations and history of machine learning
- Learn and understand the principles of memory storage, computing power, and phone/web applications.
It is recommended to complete the Introduction to Data and Machine Learning course before taking this course.
The dataset used in exercise 2 of this course can be found at the following link: https://www.kaggle.com/liujiaqi/hr-comma-sepcsv/version/1
Hello and welcome to this video on model performance. In this video, you will learn to define a baseline model, to use a score to compare across models, and to use a train/test split to judge how well your model is generalizing. In the previous video, we trained our first supervised learning model and found that the best combination of parameters b and w corresponded to the minimum value of the cost. The question is, how do we know if this is really a good model? Let's say we were to invest our money in a new house based on the prediction of our model. Would you trust the model just because the cost on the training data was small? Would you trust the prediction for a house that was not part of your training data set?
The question we are asking is: will our model generalize well when offered new, unknown data? Let's see how we can answer that question. First of all, we need to establish a baseline, a super simple model that we are going to use as a reference. We also need to establish a score to compare our different models. Unfortunately, we cannot use the cost itself as a score because its value depends on the scale used to measure features and labels. So let's start by defining a better score. A commonly used score for regression is the R squared score. The R squared score compares the sum of the squares of the residuals in our model with the total sum of squares, that is, the sum of the squares of the residuals in the baseline model that predicts the average price all the time. R squared is one minus the ratio of these two sums. If the model is really good, the sum of the squared residuals will be very small compared to the total sum of squares, and the fraction will tend to zero. In this situation, R squared will be close to one. On the other hand, if our model is no better than the baseline, the two sums will be close in value and the fraction will be close to one, cancelling out with the one in front and making R squared close to zero.
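As a minimal sketch of the score described above, R squared can be computed as one minus the ratio between the model's sum of squared residuals and that of the average-price baseline. The function name `r_squared` and the four house prices are made up for illustration and do not come from the course material.

```python
def r_squared(y_true, y_pred):
    """R squared = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    # Squared residuals of the model being scored.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Squared residuals of the baseline that always predicts the average.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical house prices (in thousands), invented for this example.
prices = [200.0, 250.0, 300.0, 350.0]
mean_price = sum(prices) / len(prices)

perfect = r_squared(prices, prices)                             # 1.0
baseline = r_squared(prices, [mean_price] * len(prices))        # 0.0
worse = r_squared(prices, [350.0, 300.0, 250.0, 200.0])         # -3.0

print(perfect, baseline, worse)
```

Because both sums are measured in the same units, the ratio does not depend on the scale of the prices, which is what makes this score comparable across models in a way the raw cost is not.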
Also, if our model is worse than the baseline, the numerator of the fraction will be bigger than the denominator and the R squared coefficient will become negative. So to summarize: R squared close to one is a good score; as R squared drops below one, the score gets increasingly worse; and when you're at zero or below, your model is doing no better, or even worse, than the simple model of using the average price. So now that we have a score and a baseline, let's see how we can check if the model is able to generalize well. Let's go back to our data set. What if, instead of using all of it to train our model, we held out a small fraction of it, say 20%, of randomly sampled points? We could train the model on the remaining 80% and use the 20% to test how good the model is when it sees data that it has not seen before. This is called a train/test split. We split the data into two sets, the training set and the test set, and use each according to its name.
We let the parameters of our model vary to minimize the cost over the training set and then check the cost or the score over the test set. If things went well, these two should be comparable, i.e., the model should be performing as well on the test set as it did on the training set. The fraction of the split does not need to be 20%. We could use 5%, 10%, 20%, 30%, or even 50%. Keep in mind that if you use too little data for testing, you may not have a credible test. And if you use too much data for testing, you're making it very hard for your model to learn because it's exposed to only a few examples. In conclusion, in this video, we learned to compare our predictions with the baseline model and use a normalized score, the R squared score, to judge how good a model is. We also learned how to split our data into training and test sets to check how well our model is able to generalize. Now congratulations, because you've just learned the basic ingredients of a neural network. You've learned about hypothesis, you've learned about cost, and you've learned about optimization. Thank you for watching and see you in the next video.
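The train/test procedure from the video can be sketched in plain Python as follows. Everything here is an assumption made for illustration: the synthetic size/price data, the helper names `train_test_split`, `fit_line` and `r_squared`, and the use of the closed-form least-squares solution for b and w in place of whatever optimization the course uses to minimize the cost.

```python
import random


def train_test_split(points, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out a fraction of it for testing."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(points):
    """Closed-form least-squares fit of y = b + w * x for a single feature."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    w = covariance / variance
    b = mean_y - w * mean_x
    return b, w


def r_squared(points, b, w):
    """R squared of the line on the given points.

    The baseline is the average label of these same points.
    """
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (b + w * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot


# Synthetic (size, price) pairs: a linear trend plus Gaussian noise.
noise = random.Random(42)
data = [(size, 20 + 2 * size + noise.gauss(0, 10)) for size in range(50, 150)]

# Hold out 20% of randomly sampled points; train only on the other 80%.
train, test = train_test_split(data, test_fraction=0.2)
b, w = fit_line(train)

train_score = r_squared(train, b, w)
test_score = r_squared(test, b, w)
print(f"train R squared: {train_score:.3f}, test R squared: {test_score:.3f}")
```

If the two printed scores are close, the model is doing about as well on unseen points as on the ones it was trained on. A test score far below the training score would suggest the model is not generalizing well.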
I am a Data Science consultant and trainer. With Catalit, I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends, I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.