Multidimensional Linear Regression - Part 1
Maths for Machine Learning
To design effective machine learning, you’ll need a firm grasp of the mathematics that supports it. This course is part two of the module on maths for machine learning. It covers how to use linear regression in multiple dimensions, how to interpret data structures from the geometrical perspective of linear algebra, and how to use vector subtraction. We’ll finish the course by discussing how you can use visualized vectors to solve problems in machine learning, and how you can use matrices and multidimensional linear regression.
Part one of this module can be found here and provides an intro to the mathematics of machine learning, and then explores common functions and useful algebra for machine learning, the quadratic model, logarithms and exponents, linear regression, calculus, and notation.
If you have any feedback relating to this course, please contact us at firstname.lastname@example.org.
Now that we've seen the essential core mathematics behind most of machine learning, calculus and linear algebra, we can reformulate linear regression and go over it once more, building out the entire picture. So here we're looking at multidimensional linear regression, to give us a sense of how all of this mathematics fits together. Let's call it multidimensional L.R., linear regression.
So what is this? Before the visuals, let's do the formal setup. We've got a y, which is our target, but now we've got multiple x's. You can think of this as having x one to x n as our features, or you could collect the features into a matrix, or you could say that each individual x is a vector, which is what we will do. So rather than writing out multiple x's like that, we'll just write a single vector x, and there's going to be a function f which connects our vector x to the prediction we want to make.
Cool, so let's just choose a problem. Here we can go with the grade of a student as the target, and now we've got several features as a vector. What can these be? Maybe the hours you spend studying and your previous grade average, your GPA, grade point average.
Now, let's add a little bit more notation. When setting this problem up, what we can do in mathematics is say that y is a real number, written y ∈ ℝ, and that gives us the type of y: it tells us what y is going to be. We should probably do the same for x, but x is a vector. So how do I say it's a vector in this notation? Well, here it's two real numbers: one real number and then another real number. The way we write that is R two, x ∈ ℝ².
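As a rough illustration of the setup just described, here is a minimal Python sketch using NumPy. The variable names and the numbers are invented purely for illustration and are not taken from the lecture; the only point is that the target is a single real number while the feature vector has two entries.

```python
import numpy as np

# Hypothetical student: [hours spent studying, previous grade average].
# These values are made up for illustration only.
x = np.array([12.0, 3.4])   # feature vector, x in R^2
y = 78.0                    # target grade, y in R

print(x.shape)      # (2,): two entries, matching the "2" in R^2
print(np.ndim(y))   # 0: a single real number has no axes
```

Reading `x.shape` is the Python counterpart of reading the power on ℝ.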
As a side point, let's talk about why there's a two there. Imagine I say that a can be the number one or the number two. That means I have two options for a. Now imagine I've got a vector, call it the vector a, and in this vector I've got two entries, a one and a two. I've got two options for a one, so it could be one or two, and I've got two options for a two, so it could be one or two. So how many options are there for the vector a? Well, there are four. The vector a could be (1, 1), (2, 1), (1, 2), or (2, 2), and those are all the options there can be. So think of the set on the right-hand side as telling you how many options there are for each entry: if there are two options per entry, and I have two entries, there are four options, two squared. So that's a little side point about this notation. The long and short of it is that you can read this power as just the number of entries in a vector. When you see a single number in the power, that means a vector, and for a matrix the power would be two by two; that would be a two-by-two matrix. If you're mapping the mathematics here to, say, Python, the thing we put up in the power is the shape. Hopefully you can see the motivation from the side point, but it doesn't really matter whether you follow it or not. The key thing is that the shape is the two there, so there are two entries in our vector. Right, okay, so we've got a vector x and we've got a y. We'll visualize in a moment; so far this is the notational setup. Maybe at this point we also want to define a loss, right?
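The counting argument in this side point can be checked with a short Python sketch. It uses `itertools.product` to enumerate every two-entry vector whose entries are one or two, and then shows how the power in the notation lines up with a NumPy shape. The variable names are illustrative assumptions.

```python
from itertools import product

import numpy as np

options = (1, 2)                             # each entry may be 1 or 2
vectors = list(product(options, repeat=2))   # every two-entry vector

print(vectors)        # [(1, 1), (1, 2), (2, 1), (2, 2)]
print(len(vectors))   # 4, which is 2 ** 2

# The power in the notation corresponds to the array shape.
vector = np.zeros(2)        # an element of R^2
matrix = np.zeros((2, 2))   # an element of R^(2x2)
print(vector.shape, matrix.shape)   # (2,) (2, 2)
```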
So you could also say there's a loss that compares our predictions to our observations, which we can now phrase a little more helpfully: the loss is computed by varying the parameters, which we'll call w, while holding fixed the data set and the historical targets. Before, we had a and b. Now, rather than having a and b, we can say that f is going to be the linear model, and that's just w dot x, which is w one x one plus w two x two, and so on. Right. So let's define this loss here, why not? This loss is the mean squared error, or rather the squared error of each individual point. So what's that? Well, the weights times x gives us our prediction, that's y hat, and we take away y, and the formula is that thing squared: (w · x − y)². Okay, so this is the setup. That's the loss we're going to use, that's the model, here are the features, and here's the target. That's pretty much everything you need to say. There's a small adjustment you're going to make, which we can come back to, but we'll leave that as the setup and talk about the visualization of this thing next.
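The model and loss described here can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the lecturer's own code: the function names and the toy data are invented, and the "small adjustment" mentioned in the transcript is left out because it has not been specified yet.

```python
import numpy as np


def predict(w, x):
    """Linear model f(x) = w . x = w1*x1 + w2*x2 + ..."""
    return np.dot(x, w)


def squared_error(w, x, y):
    """Squared error for one point: (w . x - y) ** 2."""
    return (predict(w, x) - y) ** 2


def mean_squared_error(w, X, y):
    """Average squared error over all rows of X (one row per observation)."""
    return np.mean((X @ w - y) ** 2)


# Toy data, invented for illustration: two observations, two features each.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
y = np.array([5.0, 11.0])

w_good = np.array([1.0, 2.0])   # reproduces y exactly on this toy data
w_zero = np.zeros(2)            # predicts 0 for every observation

print(mean_squared_error(w_good, X, y))   # 0.0
print(mean_squared_error(w_zero, X, y))   # 73.0
```

Note that the data `X` and targets `y` stay fixed between the two calls; only the weights `w` change, which is the sense in which the loss is computed by varying the parameters.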
Linear Regression in Multiple Dimensions - Interpreting Data Structures from the Geometrical Perspective of Linear Algebra - Vector Subtraction - Using Visualized Vectors to Solve Problems in Machine Learning - Matrices - Multidimensional Linear Regression Part 2 - Multidimensional Linear Regression Part 3
Michael began programming as a young child, and after freelancing as a teenager, he joined and ran a web start-up during university. Around studying physics and after graduating, he worked as an IT contractor: first in telecoms in 2011 on a cloud digital transformation project; then variously as an interim CTO, Technical Project Manager, Technical Architect and Developer for agile start-ups and multinationals.
His academic work on Machine Learning and Quantum Computation furthered an interest he now pursues as QA's Principal Technologist for Machine Learning. Joining QA in 2015, he authors and teaches programmes on computer science, mathematics and artificial intelligence; and co-owns the data science curriculum at QA.