Getting Started with Deep Learning: Introduction To Machine Learning

Confusion Matrix

2h 4m

Machine learning is a branch of artificial intelligence that deals with learning patterns and rules from training data. In this course from Cloud Academy, you will learn about its structure and history. Its origins date back to the middle of the last century, but only in the last decade have companies widely put it to use in their products. This machine learning revolution has been enabled by three factors.

First, memory storage has become affordable and accessible. Second, computing power has also become readily available. Third, sensors, phones, and web applications have produced a lot of data, which has contributed to training these machine learning models. This course will guide you through the basic principles, foundations, and best practices of machine learning. It is advisable to be able to understand and explain these basics before diving into deep learning and neural nets. This course is made up of 10 lectures and two accompanying exercises with solutions. This Cloud Academy course is part of the wider Data and Machine Learning learning path.

Learning Objectives

  • Learn about the foundations and history of machine learning
  • Learn and understand the principles of memory storage, computing power, and phone/web applications.

Intended Audience

It is recommended to complete the Introduction to Data and Machine Learning course before taking this course.


The dataset used in exercise 2 of this course can be found at the following link: https://www.kaggle.com/liujiaqi/hr-comma-sepcsv/version/1


Hey guys, welcome back. In this video we will compute the confusion matrix in Python. First of all, we need to import the confusion_matrix function from scikit-learn's metrics module. Once we've done that, we can just call confusion_matrix on the actual classes and the predicted classes. This gives us 34 true negatives, 45 true positives, 16 false positives, and 5 false negatives. How do I know this? Well, I can check the documentation by hitting Shift+Tab twice, and I can see that it explains that, for example, the count of true negatives is in element zero, zero. Element zero, zero is the first one, because rows are numbered zero, one, two, et cetera, and columns are numbered zero, one, two, et cetera. Okay, but this doesn't look very pretty, so let's define a helper function that will prettify the confusion matrix. It takes the same inputs, y true and y predicted, plus an array of labels, which by default is set to False and True. What we do is take the confusion matrix, this one, and put it in a pandas DataFrame. This way we can set the labels for the rows and the columns.
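The steps described so far can be sketched roughly as follows. The video's actual data and helper are not shown in this transcript, so the label arrays below are hypothetical, built only to reproduce the counts quoted above, and the helper name `pretty_confusion_matrix` and its "Predicted ..." column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical labels (not the course data), built to reproduce the counts
# quoted in the video: 34 TN, 16 FP, 5 FN, 45 TP.
y_true = np.array([0] * 50 + [1] * 50)
y_pred = np.array([0] * 34 + [1] * 16 + [0] * 5 + [1] * 45)

# Rows are the actual classes, columns are the predicted classes,
# so element [0, 0] holds the true negatives.
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)


def pretty_confusion_matrix(y_true, y_pred, labels=("False", "True")):
    """Wrap the confusion matrix in a labeled pandas DataFrame.

    The function name and the "Predicted ..." column prefix are
    illustrative assumptions, not taken from the course notebook.
    """
    matrix = confusion_matrix(y_true, y_pred)
    predicted = ["Predicted " + str(label) for label in labels]
    return pd.DataFrame(matrix, index=list(labels), columns=predicted)


df = pretty_confusion_matrix(y_true, y_pred, labels=["Not Buy", "Buy"])
print(df)
```

Reading the labeled table, the "Not Buy" row and "Predicted Not Buy" column hold the true negatives, and the "Buy" row and "Predicted Buy" column hold the true positives.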

So, the index and the columns. Long story short, we have put the confusion matrix values into a much prettier structure, a pandas DataFrame. This is much easier to read now. I have also set the labels to be Not Buy and Buy, so now I know that these are people who didn't buy and whom we predicted wouldn't buy, so these are our true negatives. And these are the true positives, and so on. Finally, we can use the other metrics from scikit-learn, for example the precision score, the recall score, and the F1 score. All of them have the same type of signature: they take the true labels and the predicted labels, and they calculate the score. So we can print them, and we see that precision is about 74%, recall is higher, at 90%, and the F1 score, which combines the two (it is their harmonic mean), is somewhere in between. The classification report is another utility that prints out all these things at once. So we can print the classification report, and we can see that we have the precision, the recall, and the F1 score for the zero class, and the precision, recall, and F1 score for the one class. The values for the one class are the ones we just calculated. We also have the average and the total number of points: 50 in the zero class and 50 in the one class. That's it for classification. Thank you for watching, and see you in the next video.
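As a rough sketch of the scoring step, the snippet below calls the scikit-learn metrics mentioned above. The label arrays are hypothetical, constructed only so that the confusion matrix matches the counts quoted in the video (34 TN, 16 FP, 5 FN, 45 TP); they are not the course data.

```python
import numpy as np
from sklearn.metrics import (
    classification_report,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels (not the course data) matching the quoted counts.
y_true = np.array([0] * 50 + [1] * 50)
y_pred = np.array([0] * 34 + [1] * 16 + [0] * 5 + [1] * 45)

# Each metric takes the true labels first, then the predicted labels.
# By default they score the positive class (label 1).
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 45 / 61
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 45 / 50
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

print(f"Precision: {precision:.3f}")  # 0.738
print(f"Recall:    {recall:.3f}")     # 0.900
print(f"F1 score:  {f1:.3f}")         # 0.811

# One table with precision, recall, F1, and support for each class.
report = classification_report(y_true, y_pred)
print(report)
```

With these counts, precision is 45/61, recall is 45/50, and F1 is 90/111, which agrees with the roughly 74%, 90%, and in-between values mentioned in the video.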

About the Author
Francesco Mosconi

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI, and graduated from the Singularity University summer program in 2011.