Exercise 2: Solution
1h 13m

Continue the journey into data and machine learning with this course from Cloud Academy.

In previous courses, the core principles and foundations of Data and Machine Learning have been covered and best practices explained. 

This course gives an informative introduction to deep learning and introduces neural networks.

This course is made up of 12 expertly instructed lectures along with 4 exercises and their respective solutions.

Please note: the Pima Indians Diabetes dataset can be found in the GitHub repository or on the Kaggle page mentioned throughout the course.

Learning Objectives

  • Understand the core principles of deep learning
  • Be able to work with all the components of a neural network framework

Intended Audience





Hello, and welcome back. In exercise two, we get to build our model. We are asked to build a fully connected neural network model that predicts diabetes from our data, and we're given a bunch of steps to follow. The first step is gonna be to split our data into a train and test set, with a test size of 20% and a random state of 22. Then we are asked to define our Sequential model, the deep learning model, with at least one inner layer, so make it deep. You can add as many layers as you want, but at least one. And there are a bunch of choices that we have to make. What's the size of the input? We need to have as many input nodes as the number of features. How many nodes are we going to give to each layer, the inner layers and the output layer, so what's the size of the output?

What activation functions are we going to use in the inner layers, and what activation function are we going to use in the output? What loss are we going to use, and what optimizer? So, a bunch of choices that we have to make for this model. Once we've made them, and we've built the model and compiled it, we're going to train it on the training set using a validation split of 10%, test the model on the test set from the train/test split, so the 20% that we left out, and finally we check the accuracy, the confusion matrix, and the classification report. Okay, so let's check our input data first. We have 768 data points with eight features each. We split them into a training and test set, so we split both the features and the labels, with random state equal to 22 and test size equal to 20%. Next, we import the Sequential model, we are told that we're gonna use the Sequential model, the Dense layer, which is the object for the fully connected layer, and an optimizer, I'll import Adam.
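The split described above can be sketched as follows. This is a minimal illustration with randomly generated stand-in arrays, since loading the actual Pima dataset depends on where you saved it; only the shapes (768 rows, 8 features) and the split parameters match the lecture.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with the same shapes as the Pima dataset (768 x 8).
rng = np.random.default_rng(0)
X = rng.normal(size=(768, 8))      # placeholder for the 8 features
y = rng.integers(0, 2, size=768)   # placeholder for the diabetes labels

# Split features and labels together, as in the lecture.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=22)

print(X_train.shape, X_test.shape)  # (614, 8) (154, 8)
```

With 768 rows, a 20% test size leaves 614 training points and 154 test points, which is consistent with the class counts quoted later in the lecture.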

In the next chapters, we'll talk about the differences between the various optimizers, but Adam is a pretty good optimizer, so I'll choose that. And here's how I build my model. I have an input shape of eight because I have eight features, 32 nodes in the first inner layer, and I assign an activation function of rectified linear unit, relu. In the second layer, I again have 32 nodes and the same activation function, and in the last layer, I have two outputs because I'm using y categorical, which has two columns. Notice that since this is a binary classification, I could also have used a model with one output if I used only one of the two columns, provided that I put a sigmoid there and used binary cross-entropy for the loss. No harm done using two outputs and categorical cross-entropy, as long as I use the softmax activation function at the end. The result is gonna be the same.
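The architecture just described (8 inputs, two relu layers of 32 nodes, a 2-node softmax output, Adam, categorical cross-entropy) can be sketched with the Keras Sequential API roughly like this; an explicit `Input` layer is used here in place of the `input_shape` argument, which is equivalent:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam

model = Sequential([
    Input(shape=(8,)),               # 8 input features
    Dense(32, activation='relu'),    # first inner layer
    Dense(32, activation='relu'),    # second inner layer
    Dense(2, activation='softmax'),  # one output column per class
])

# Categorical cross-entropy matches the two-column one-hot labels.
model.compile(optimizer=Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

Swapping the last layer for `Dense(1, activation='sigmoid')` with `loss='binary_crossentropy'` and plain 0/1 labels would, as the lecture notes, give the same result.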

Okay, so I compile the model, and let's check it: if I do model summary, I see that I have three dense layers. These have output shapes of 32, 32, and two nodes, and this column tells me the number of parameters. The first is 32 times eight, and we can check that 32 times eight is 256, plus 32 biases, is 288, which is exactly the number of parameters in this layer. Same thing, I'll let you verify that this is the number of parameters connecting 32 nodes to 32 nodes, and finally we have 32 going into two, so we have 64 plus two, 66. The total number of parameters in our model is slightly more than 1,400. We fit the model with verbose equal to two and a validation split of 10%. Okay, let's see: our loss is going down, but not too much. Our validation accuracy is fluctuating between 0.8 and 0.7, but it does not seem to be improving very much, so we stop here and check how well our model is doing. We predict the probabilities, generate the classes for the test and the predicted set, and finally we check the accuracy, the classification report, and the confusion matrix. Okay, is the model doing well or not? To answer that question, we always need to check what our benchmark is, and I will do that by counting how many values there are in each test class.
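The parameter counts quoted above can be verified layer by layer: each dense layer has (inputs × outputs) weights plus one bias per output node.

```python
# Dense layer parameters = inputs * outputs + biases (one per output).
layer1 = 8 * 32 + 32    # 256 + 32 = 288
layer2 = 32 * 32 + 32   # 1024 + 32 = 1056
layer3 = 32 * 2 + 2     # 64 + 2 = 66
total = layer1 + layer2 + layer3

print(layer1, layer2, layer3, total)  # 288 1056 66 1410
```

The total, 1,410, matches the "slightly more than 1,400" reported by the model summary.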

The easy way to do this is to put the data into a series and use the function value_counts, and we can see that there are 100 in class zero and 54 in class one. We could've seen it from the support of the two classes as well. So basically, our benchmark is what we get if we say that nobody gets diabetes. We can divide the counts by the length of y test class to get the proportions, and we see that 64.9%, roughly 65%, is our benchmark. So, if we say that everybody is in class zero, we are correct 65% of the time. An accuracy score of 68.8% is only very mildly better than our benchmark, so the model is not that great, and we can see in the confusion matrix that it's still saying that a lot of people who actually have diabetes are diagnosed not to have diabetes. To improve this model, you can go back to the drawing board and change your model. The truth is, the classes in this dataset overlap quite a bit, so it's not guaranteed that you're gonna do any better. Thank you for watching, and see you in the next video.
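The benchmark check described above can be sketched like this. The series below is a stand-in built from the class counts quoted in the lecture (100 in class zero, 54 in class one), since the real labels come from the train/test split:

```python
import pandas as pd

# Stand-in for the test labels, using the class counts from the lecture.
y_test_class = pd.Series([0] * 100 + [1] * 54)

counts = y_test_class.value_counts()
print(counts.to_dict())              # {0: 100, 1: 54}

# Majority-class benchmark: predict "no diabetes" for everyone.
benchmark = counts.max() / len(y_test_class)
print(round(benchmark, 3))           # 0.649
```

An accuracy of 68.8% against a 64.9% majority-class baseline is the comparison the lecture uses to conclude the model is only mildly better than guessing.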

About the Author
Learning Paths

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.