Classification: Keras
2h 4m

Machine learning is a branch of artificial intelligence that deals with learning patterns and rules from training data. In this course from Cloud Academy, you will learn all about its structure and history. Its origins date back to the middle of the last century, but in the last decade, companies have taken advantage of the resource for their products. This revolution of machine learning has been enabled by three factors.

First, memory storage has become economical and accessible. Second, computing power has also become readily available. Third, sensors, phones, and web applications have produced a lot of data, which has contributed to training these machine learning models. This course will guide you through the basic principles, foundations, and best practices of machine learning. It is advisable to be able to understand and explain these basics before diving into deep learning and neural nets. This course is made up of 10 lectures and two accompanying exercises with solutions. This Cloud Academy course is part of the wider Data and Machine Learning learning path.

Learning Objectives

  • Learn about the foundations and history of machine learning
  • Learn and understand the principles of memory storage, computing power, and phone/web applications

Intended Audience

It is recommended to complete the Introduction to Data and Machine Learning course before taking this course.


The datasets and code used throughout this course can be found in the GitHub repo here.



Hey guys, welcome back. In this video I'll show you how to perform a classification with Keras. So let's load some data for our classification problem. I'll load this user_visit_duration data, which has only two columns: the time spent on the website and the purchase behavior, which is either a zero or a one. So let's plot it, and you can see that this is the time spent on the website and this is the buying behavior. These are all people who bought. This guy was particularly decisive; maybe he had seen the product already and chose very quickly. But in general, the people who decide to buy seem to spend slightly longer on the site than the people who don't buy. So, can we train a model to distinguish between these two classes, the buyers and the non-buyers? And the answer is yes, we can do that by defining a logistic regression.

So, a logistic regression is defined in Keras in exactly the same way as the previous model: we start with the Sequential model and add a dense layer with one input and one output. The only difference is that we add an activation function, the sigmoid function, as you've seen in class. So let's build this model and compile it. Notice that in this case we are using a different loss, the binary cross-entropy. We don't use the mean squared error for classification; we use the binary cross-entropy.
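The model described here can be sketched roughly as follows. This is a minimal sketch, not the notebook code from the course: it assumes Keras is imported through TensorFlow, and the optimizer ("sgd") is an assumption, since the transcript does not name one.

```python
import numpy as np
from tensorflow import keras

# Logistic regression as a one-layer network:
# one input (time on site), one output, sigmoid activation.
model = keras.Sequential()
model.add(keras.Input(shape=(1,)))
model.add(keras.layers.Dense(1, activation="sigmoid"))

model.compile(
    optimizer="sgd",              # assumption: not specified in the transcript
    loss="binary_crossentropy",   # classification loss instead of mean squared error
    metrics=["accuracy"],         # report accuracy while training
)

# One weight plus one bias
n_params = model.count_params()
```

Calling `model.summary()` on this model prints the layer table with the parameter count.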

Also notice that in this case, at compilation time, we are passing accuracy as a metric. Perfect. So let's check the model summary with model.summary(). It has two parameters, as I expected: one weight and one bias. It's one dense layer, so it looks exactly the same as the previous model, but we know that this time we've included the sigmoid function as the activation of the only layer, which constrains the output to lie between zero and one. Now we take our input features and our output values and we fit the model for 25 epochs. Okay, so let's see what happens. It was very quick because we don't have that much data, and you see the loss is going down again while the accuracy is going up, although not by much.

Okay, let's see the model that was found. So we can plot our data as a scatter plot, then define a temporary linear space between zero and four and plot the prediction of our model on it. We do this to have a smoother curve. And you can see that it's a smooth probability increasing from zero, here, to one.

Okay, so let's see how the classes are predicted. We can go from the predicted probabilities to the predicted classes by checking which predicted probabilities are greater than 50%. So we predict on the temporary axis that we generated for the plot, and these are the temporary classes for the values of time. So temp, remember, is this linear space. Okay, so we can plot this with the same code as before, and what it's gonna look like is, bam: all the points up to this one are predicted in class zero and all these other points are predicted in class one. Although, as you saw before, the points near here have a really small probability difference between being in one class or the other, whereas if you're here, you're practically certain to be in class one, so to be a buyer.
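To make the step from probabilities to classes concrete, here is a small NumPy sketch. The weight `w` and bias `b` are made-up values standing in for a trained model (in the video the probabilities come from the model's predictions), so only the thresholding logic should be taken from it.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters standing in for the trained weight and bias
w, b = 2.0, -4.0

# Temporary, evenly spaced time axis between 0 and 4 (50 points by default)
temp = np.linspace(0, 4)

# Smooth probability curve, then a hard class via the 50% threshold
probs = sigmoid(w * temp + b)
temp_class = (probs > 0.5).astype(int)

# The probability crosses 0.5 where w * x + b = 0
boundary = -b / w
```

Points to the left of `boundary` land in class zero and points to the right in class one, which gives the step-like plot described above.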
Okay, the last thing we're gonna do is check the accuracy of our model. So we're going to predict the probabilities for the actual values of our data, for x, and then predict the classes by checking which predictions are greater than 0.5. Then we import accuracy_score from sklearn.metrics and we calculate the score. It's almost 80%, which, given how much the two classes overlap, is not a bad score. We cannot possibly get 100% with this type of model, because it will always look for a step-like behavior. Okay, so we've trained a logistic regression model on a dataset with only one feature in the input and a binary label in the output. And the model is not performing that badly: it got to about 80% accuracy.

Great, so the next thing we're gonna do is a train/test split on our classification data. This is very similar to what we did before: we divide our data into train and test sets, we get the weights from the model and we reset them. Notice that we are doing this in a better way than before: we iterate over all the parameters obtained from get_weights and replace each one with an array of zeros of the same shape as the weight. Finally, we set the weights using the new parameters. Okay, so now that we've reset the weights, the accuracy score is really bad because the model is not good anymore; we've reset it. Now let's train the model for 25 epochs. Great. And notice that we've trained it on the training set only. Now we check the accuracy score for the training and the test set, using the predictions that are greater than 0.5, and notice that we are actually doing even better on the test set than on the training set. This is probably due to the fact that our dataset is pretty small, but in any case, not bad.

So in this video, we've trained a classification model using a logistic regression and we've checked the performance of the model using a train/test split. Thank you for watching and see you in the next video.
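The reset, retrain, and evaluate sequence from this video might look roughly like the sketch below. It is written under several assumptions: the data is synthetic (a stand-in for user_visit_duration), the optimizer and the 80/20 split are choices made here, and Keras is imported through TensorFlow. No accuracy values are quoted because they depend on the data and on the training run.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Synthetic stand-in data (assumption): buyers tend to stay a bit longer
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(1.5, 0.5, 50), rng.normal(2.5, 0.5, 50)]).reshape(-1, 1)
y = np.concatenate([np.zeros(50, dtype=int), np.ones(50, dtype=int)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = keras.Sequential()
model.add(keras.Input(shape=(1,)))
model.add(keras.layers.Dense(1, activation="sigmoid"))
model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])

# Reset: replace every weight array with zeros of the same shape
zeros = [np.zeros(w.shape) for w in model.get_weights()]
model.set_weights(zeros)

# With all-zero weights every input gets the same probability,
# so the model predicts a single class for everything
reset_pred = (model.predict(X, verbose=0) > 0.5).astype(int).ravel()
reset_acc = accuracy_score(y, reset_pred)

# Train on the training set only
model.fit(X_train, y_train, epochs=25, verbose=0)

# Threshold the probabilities at 0.5 and score both sets
train_pred = (model.predict(X_train, verbose=0) > 0.5).astype(int).ravel()
test_pred = (model.predict(X_test, verbose=0) > 0.5).astype(int).ravel()
train_acc = accuracy_score(y_train, train_pred)
test_acc = accuracy_score(y_test, test_pred)
```

Comparing `train_acc` with `test_acc` is the check described above. On a dataset this small the two can differ noticeably in either direction.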

About the Author

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at University of Padua and Université de Paris VI and graduated from Singularity University summer program of 2011.