This course covers the concept of unsupervised learning within the context of machine learning. You'll learn the fundamentals of unsupervised learning and how it differs from supervised learning. We'll cover the topics of clustering, k-means clustering (and its limitations), and dimensionality reduction.
- In this session, I would like to talk about a form of Machine Learning that is quite different from the sorts you've seen before, called Unsupervised Learning. The contrast here is with Supervised Learning, where the dataset we look at contains both features and a target: we use the features to find patterns that allow us to predict the target, which is in the historical dataset that we have seen. So, just to be very clear here, what we have seen so far is called Supervised Learning. That is the case where the dataset we have contains both the features and the predictive target, and the goal of this form of learning is to infer, or estimate, or arrive at by some algorithm, a model, which we have called f hat, that accepts the features we are looking at and produces an estimate for the target, right? So in the classic case, take an image and predict whether it is a cat or a dog, or something like that. Now, in Unsupervised Learning, we have a dataset that does not contain a predictive target. Now, of course, a dataset is a dataset, right? So what we mean when we say a dataset does not contain a predictive target is that maybe we do have some predictive goal in the end, but it is not a goal of estimating any of the columns that we are considering. And so when we are doing Unsupervised Learning, we are somehow finding patterns in, or transforming, columns that are not going to be a prediction target now or in the future, right? So that's Unsupervised Learning. What more should we say about this? Now, in the case of Supervised Learning, the goal is very clear: it's the prediction, it's this model. In Unsupervised Learning there is no sort of overarching goal which unifies the discipline into something where you can go, well, here's how you do it in this case, here's how you do it in that case. Unsupervised Learning is more of a random collection of tools, techniques, and approaches that are mostly helpful for Supervised Learning.
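To make the contrast concrete, here is a minimal sketch in plain Python. The numbers are made up for illustration, and the straight-line fit is just one simple stand-in for "some algorithm" that produces f hat. The talk's own example is an image classifier, which is not shown here.

```python
# Hypothetical toy data: each position is one observation.
# Supervised case: a feature X comes with a known target y.
X = [1.0, 2.0, 3.0, 4.0]   # feature values
y = [2.1, 3.9, 6.2, 7.8]   # target values from the historical dataset

def fit_line(xs, ys):
    """Least-squares fit of y ~ a * x + b; returns the model f_hat."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

f_hat = fit_line(X, y)        # learn a model from features and target
prediction = f_hat(5.0)       # estimate the target for a new observation

# Unsupervised case: only features, no target column to estimate.
# All we can do is describe or transform the feature itself.
X_only = [1.0, 2.0, 3.0, 4.0]
feature_range = (min(X_only), max(X_only))
```

In the supervised half, the target column `y` drives the fit. In the unsupervised half there is nothing to predict, so the only results are statements about the feature itself.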
That is to say, they may feature in a step before Supervised Learning, or they may be involved in the preparation of a dataset. There may even be something we do after the supervised process. So Unsupervised Learning is not a discipline in that sense. There is no overarching idea or goal which brings together all the tools or techniques. In fact, Unsupervised Learning really is just a cluster, a grouping of basically unrelated techniques and approaches that are of some help in the supervised process. Let me give you some examples to make this concrete. So, for example, one unsupervised technique is what I would call Data Compression. Now, Data Compression is my simplified term. It's a very common term, but the technical term here is Dimensionality Reduction. We're gonna look at that in detail, but for now let me just say a little bit about it and how it's gonna fit into the process. So data compression, or dimensionality reduction, is often gonna be a data preparation step. That is to say, we will take a dataset in a supervised context which is far too large, far too varied, far too complex for our goal, and compress it down to a smaller size so that we can feed it into the methodology, the machinery, that was established in the supervised case. So that's going to be a preparation step, mostly. What else could we do in Unsupervised Learning? Another thing here, to make it clear how random a collection of stuff this is: let me just tell you that taking the mean of a feature is actually a form of Unsupervised Learning. So, as in many areas of Machine Learning, these labels that we give to things, unsupervised, supervised, and so on, I would almost go so far as to say they have a marketing role in selling certain ideas. But they also have a technical role: we're taking things we've seen in statistics, things we've seen in other areas, and contextualizing them, putting them into a certain way of understanding the problem.
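As a small illustration of the mean being "involved in the learning life cycle", the sketch below learns the mean of a feature and uses it to fill in a missing value before any model sees the data. Filling missing values is an assumed use case chosen for illustration, not one named in the talk, and the ages and function names are made up.

```python
def fit_mean(column):
    """'Learn' one number from a feature: the mean of its known values."""
    known = [v for v in column if v is not None]
    return sum(known) / len(known)

def fill_missing(column, value):
    """Use the learned mean to prepare the column for a later model."""
    return [value if v is None else v for v in column]

ages = [10, 12, None, 40, 38]            # made-up feature, one entry missing
mean_age = fit_mean(ages)                # no target is involved at any point
prepared = fill_missing(ages, mean_age)  # ready to hand to a supervised step
```

Nothing here predicts anything: the mean is computed from a single column and only serves to get the data into shape.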
So taking the mean is actually something that will be part of Unsupervised Learning, so long as it is somehow involved in the learning life cycle. What else could we be doing? In fact, all of descriptive statistics is really a form of Unsupervised Learning. And think about why that is: it's a statistical operation on a feature, like an X, right? And it is not predictive. There's no uncovering of correlations, there's no uncovering of models or connections. We're just summarizing a particular column, like taking the mean of it. And because that is just an operation on a column, and it is not predictive, we would call it Unsupervised Learning. What else could we do, besides data compression and taking the mean? Another major area here is Clustering. It's actually related to taking the mean. Clustering means finding patterns, in particular groups or clusters, in the features themselves. That is to say, suppose I had a column of ages: is it the case that, in my column of ages, there are certain age groups that I see? So, for example, if I go to a theme park, do I see a majority of people clustered around 10 or 11, with another cluster at 40, and another cluster at 60? I'm thinking here of the children, their parents, their grandparents. Do we in fact see fewer people from, perhaps, 25 to 35, maybe because, I don't know, they're at work or something, right? So clustering is about finding patterns, groups, clusters in, and here's the key thing, features. It's not about connecting features to predict your target. It's asking, in a particular feature I'm looking at, am I seeing a pattern in how my observations are grouped together, how they're clustered, right? So that's a good overview of the key areas of Unsupervised Learning. Unsupervised Learning is mostly a kind of preparation tool set and an exploration tool set. That's how we're gonna be using clustering, in what we've called here Exploratory Data Analysis.
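The theme park example can be sketched with a very small k-means routine on a single feature. K-means is named in the course description, but this particular implementation, the ages, and the starting centers are all assumptions made for illustration. Note that each cluster center is just the mean of its group, which is one way clustering relates to taking the mean.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small k-means sketch for a single feature."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Made-up ages of theme park visitors: children, parents, grandparents.
ages = [9, 10, 11, 12, 38, 40, 41, 43, 59, 60, 62]
centers, groups = kmeans_1d(ages, centers=[0, 30, 90])
```

With these made-up ages the routine settles on three groups, roughly the children, the parents, and the grandparents, without any target column being involved.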
So if you think about the Supervised Learning process here, what's the first step? The first step in terms of dealing with the data is going to be looking at the data, visualizing the data, asking if there are any patterns in the data itself before we start modeling it, before we start coming up with a prediction model. The key question here, of course, is: do my features come in groups? Is there a pattern to my observations, right? So once we've gone through that exploratory phase, we then go through the learning phase. But if the learning phase is dealing with datasets which are too complicated, too massive, too varied, we would then need to look at data compression before we went through the modeling. So you can see that this is a random bag of tools that are unified just by how helpful they are to the learning process. They're not necessarily of the same kind; they're not doing the same kind of thing. So what we're going to do in this section of the course is have a look at Clustering first and then Data Compression. I'll expand a little on the phrase Dimensionality Reduction, and then you'll be able to see how all this plays out in more detail.
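As a rough preview of what "data compression before modeling" can look like, the sketch below reduces two features to one by projecting each observation onto the direction in which the data varies most. This is the idea behind principal component analysis, a common dimensionality reduction technique that the talk itself does not name. The data and the function name are made up for illustration.

```python
import math

def compress_to_1d(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix (the common 1/n factor
    # does not change the direction, so it is omitted).
    sxx = sum(cx * cx for cx, _ in centered)
    syy = sum(cy * cy for _, cy in centered)
    sxy = sum(cx * cy for cx, cy in centered)
    # Angle of the first principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # One number per observation instead of two.
    return [cx * direction[0] + cy * direction[1] for cx, cy in centered]

# Made-up dataset where the two features carry the same information.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
compressed = compress_to_1d(points)
```

Here the two columns are perfectly redundant, so a single column keeps everything a later supervised model would need. Real data is rarely this tidy, and some information is usually lost in the compression.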