Machine learning is a branch of artificial intelligence that deals with learning patterns and rules from training data. In this course from Cloud Academy, you will learn all about its structure and history. Its origins date back to the middle of the last century, but in the last decade, companies have taken advantage of it in their products. This machine learning revolution has been enabled by three factors.

First, memory storage has become inexpensive and accessible. Second, computing power has also become readily available. Third, sensors, phones, and web applications have produced a lot of data, which has contributed to training these machine learning models. This course will guide you through the basic principles, foundations, and best practices of machine learning. You should be able to understand and explain these basics before diving into deep learning and neural nets. This course is made up of 10 lectures and two accompanying exercises with solutions. This Cloud Academy course is part of the wider Data and Machine Learning learning path.

**Learning Objectives**

- Learn about the foundations and history of machine learning
- Understand how cheap memory storage, available computing power, and data from phones and web applications enabled the rise of machine learning

**Intended Audience**

It is recommended to complete the Introduction to Data and Machine Learning course before taking this course.

### Resources

The datasets and code used throughout this course can be found in the GitHub repo here.

Okay, so welcome back. Here's a way of checking several values of b, exploring the cost function at once. So I have my data points plotted here, and I create an array of different values for b: minus 100, minus 50, zero, 50, 100, and 150. Then for each of these values, I calculate y predicted for a certain value of w (in this case, I set w equal to two), and I calculate the mean squared error between y true and y predicted, and I append the mean squared error I just calculated to a list. Finally, I plot the line for this particular b.
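The sweep described above can be sketched roughly like this; the data points here are synthetic stand-ins for the notebook's height/weight data, and the specific numbers are assumptions:

```python
import numpy as np

# Synthetic stand-in for the notebook's data points (assumed, not the course data).
rng = np.random.default_rng(0)
x = np.linspace(55, 78, 50)                          # e.g. height
y_true = 7.6 * x - 350 + rng.normal(0, 15, len(x))   # roughly linear, with noise

w = 2.0                            # fixed slope, as in the lecture
bs = [-100, -50, 0, 50, 100, 150]  # the candidate values of b

mses = []
for b in bs:
    y_pred = w * x + b                      # the line for this value of b
    mse = np.mean((y_true - y_pred) ** 2)   # mean squared error cost
    mses.append(mse)
# Each entry of mses is the cost of the line with slope 2 and intercept b,
# ready to be plotted against bs.
```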

Then I can plot the corresponding costs for the different values of b, using the mean squared errors that I collected while looping over the different bs. So if I execute this cell, it will take a bit of time, but at the end, it will show us a few candidate lines and the corresponding cost for each. So here they are. Here are the different values of b, going from minus 100 to plus 150, and these are the corresponding costs for each of those values.

So as you can see, the cost decreases, and somewhere around here we have a minimum cost. Obviously, if we also start changing the slope, we will go even further down and reduce the cost. Great, so linear regression allows us to do this process automatically, and that's what we're gonna do next. Let's do it with Keras. There are a lot of packages that implement linear regression. We can do it in scikit-learn. We can even do it with SciPy, with the linalg subpackage.
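As a quick aside, the closed-form least-squares fit that packages like scikit-learn and SciPy's linalg provide can be sketched with NumPy alone; the data here is made up for illustration:

```python
import numpy as np

# Synthetic points on a known line, standing in for the height/weight data.
x = np.array([60.0, 62.0, 65.0, 70.0, 72.0])
y = 7.6 * x - 350.0            # exact line, no noise, for illustration

# Design matrix with a column of ones so the intercept b is fitted too.
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
# On this noiseless data, w recovers 7.6 and b recovers -350.
```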

I want to do it in Keras because I want you guys to start familiarizing yourselves with the Keras API. We will go through it in greater detail as we discover neural networks, but it's great if you start getting familiar with it already. So the first thing we are going to import, like in the demo we did in the first section, is the type of model. This is called Sequential, and it's called Sequential because we are going to be adding elements to this model in a sequence.

To build a linear regression, we only need the Dense type of layer. The last thing we import is a couple of optimizers. These are the things that change our values of weights and biases, looking for the minimum cost. Okay, so we define our model to be a Sequential model. Let's execute this cell first. And then we add a Dense layer to the model. Okay, so for this Dense layer, I can check the documentation and see what it says. The first parameter here is the number of units. What this means is: how many output values will this model have? Well, since it's a linear regression that takes one value, x, as input, and gives one value, y hat, as output, we only need one output value, so that's why there's a one here. And then we say the input shape, the shape of our input variable, is just one number, x. So basically, the Dense layer at its simplest, at its core, just implements a linear regression. Okay, so it has essentially the same function as the line function we've defined here. This is our input, one input, and this is our output, and what the Dense layer does is the same thing.
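A minimal sketch of the model described above, using the standalone Keras imports from the lecture era (with current TensorFlow the same classes also live under tensorflow.keras):

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# One output unit (y hat) and a single input feature (x).
model.add(Dense(1, input_shape=(1,)))

model.summary()  # one layer, two parameters: the weight w and the bias b
```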

So it's got a lot more functionality, but at its core, it's a linear function. Okay, so we add that to the model, and with the model.summary function, we can check our model. As you can see, there is only one layer. It's called dense_1. The output shape is one number, and it has two parameters. These are the weight (a single weight) and the bias. Notice that the output shape is given with this weird tuple that says (None, 1)? The reason for this is that the model can accept multiple points at once, so instead of passing a single value for x, we could ask for predictions for many values of x in one single call. This would be equivalent to what we did here when we called line of X, where X is now a list of values of x, and we obtain a list of corresponding values of y. So that's why it says None here: it's a placeholder saying give me however many values you want as input, as long as each of them is a single number, like you see here, and I will give you a single number for each of them.
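That None in (None, 1) can be demonstrated by predicting on a batch of inputs with an (untrained) copy of the same model; the specific input values are arbitrary:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(1, input_shape=(1,))])

# Three inputs in, three predictions out: the None in (None, 1) is the batch size.
x_batch = np.array([[60.0], [65.0], [70.0]])   # shape (3, 1)
y_batch = model.predict(x_batch, verbose=0)    # shape (3, 1)
```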

Okay, perfect. So an important step in Keras is to compile the model. When you compile the model, what Keras does is construct the model using the back-end software that you define. In the case of this course, we are using TensorFlow as the back end, so this model will be implemented as a TensorFlow model. Keras is a very nice high-level API for designing neural network models, and it supports multiple back ends, so you could define, for example, Theano, which is another library, and have your model be compiled in Theano.

Now, the great thing is that all you need to change for this to happen is the definition of the back end; the definition of the model here stays exactly the same, so the Keras code doesn't change. This I find really amazing and powerful. Now notice here, it said "Using TensorFlow backend" when we loaded Keras for the first time. So it's telling us that right now we have set TensorFlow to be the back end. Great, so we compiled the model. I'm not gonna go into the details of this. We will discover it later, but notice that the second parameter says which cost function we are gonna be using, and we find the mean squared error, which is what we expect. In the definition of the Keras compiler, this is called the loss. That's an equivalent name. We can call it loss, we can call it cost function. It's the same thing. It's the cost attributed to our model, to our prediction.
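A sketch of the compile step; the Adam optimizer and the 0.8 learning rate are assumptions, plausible for this kind of notebook but not confirmed by the transcript:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

model = Sequential([Dense(1, input_shape=(1,))])

# "loss" is Keras's name for the cost function; here, mean squared error.
# The optimizer and learning rate below are assumed for illustration.
model.compile(optimizer=Adam(learning_rate=0.8), loss='mean_squared_error')
```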

Okay, so we've compiled the model. Now we fit the model on our input data and our output data, so these are the two variables that draw these points: the height and the weight. Okay, and again, you will see that Keras will compile and execute some iterations, and you can monitor the loss, and you see that it's going down. This I showed you already last time. So what's happening is Keras is exploring different values for w and b, trying to find the values of w and b that make this cost the minimum, the lowest possible. I gave it a limit for this exploration. I told it, "Try 40 times and then stop." So at 40 epochs, it stopped, and now we can generate predictions with our model.
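Putting the pieces together, the fit described above might look like this end to end; the data, optimizer, and learning rate are assumptions standing in for the course notebook:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# Synthetic height/weight-like data (assumed, not the course dataset).
rng = np.random.default_rng(0)
X = rng.uniform(55, 78, size=(100, 1))
y = 7.6 * X - 350 + rng.normal(0, 10, size=(100, 1))

model = Sequential([Dense(1, input_shape=(1,))])
model.compile(optimizer=Adam(learning_rate=0.8), loss='mean_squared_error')

# 40 epochs, as in the lecture; verbose=0 silences the per-epoch log.
history = model.fit(X, y, epochs=40, verbose=0)
# history.history['loss'] holds the cost after each epoch; it should be
# much lower at the end than at the start.
```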

So we call model.predict. I'll do this in a separate cell, and we can plot the predictions. I'll just make this one visible. We can plot the predictions with a scatter plot. So yeah, not bad, look at this line. It's really quite nicely reproducing the best fit of our data, and notice that the cost here, 182, our last cost, is much, much lower than the cost we had obtained above, 27,000, for our straight line passing through zero, which is what we want. If the line is closer to the points, the cost should be lower.

Okay, awesome. So the last thing I want to show you is that we can actually extract the values of w and b by calling model.get_weights, so let's do that and see what they are. So w is an array with a single value because, in this case, the model only has one weight, and the value of the slope is 7.6. The value of b is almost negative 350, which means that where x equals zero, this line would predict a weight of about minus 350. So there is a negative bias. This is how we run a simple linear regression model in Keras. Thank you for watching, and see you in the next video.
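Extracting the parameters can be sketched like this; here the model is untrained, so the values are just the initial ones rather than the 7.6 and minus 350 reported in the lecture:

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(1, input_shape=(1,))])

w, b = model.get_weights()
# w has shape (1, 1): the single weight connecting the one input to the one output.
# b has shape (1,): the bias. After training on the height/weight data, the
# lecture reports a slope of about 7.6 and a bias of almost -350.
```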

I am a Data Science consultant and trainer. With Catalit, I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends, I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.