Practical Machine Learning
This course explores the topic of probability and statistics, including various mathematical approaches and some different interpretations of probability. The course starts off with an introduction to probability, before moving on to cover the topics of Bayesian probability, Frequentist probability, statistics, probability distribution and normal distribution.
- In the first section, we looked at probability mostly from a classical point of view. We set up a couple of seemingly toy problems, a coin flip and a dice roll, and we analyzed the probabilities of various outcomes in terms of an equi-probable set of events. And we came up with a very simple counting ratio: count the number of possible outcomes, count the number of outcomes of interest, divide the second by the first, and that gives you the somehow correct probability. The correct number. The true number. That was classical probability. But we said that there are limitations around this. Can you define equi-probable outcomes? Can you even define outcomes at all?
Let's now look at a different interpretation of probability, called the frequentist interpretation, or frequentism, and how it can give us another set of tools, another set of formulas, another set of ideas for working on probability problems. So this is frequentist probability. Here, the defining formula is a ratio again, but now it's a ratio of counts of what actually happened. Let's have a look. The probability of some event, say E this time, is now going to be given as a long-run frequency. On the top is the number of times the event has happened, and on the bottom is the number of trials, or experiments, or attempts that we have made. Let's just give this in English: the numerator is the number of times E happened, and the denominator is the number of trials that we have made. A trial here technically means an experiment, or a run, or any time that we have tried to find an outcome of something.
Now, this formula isn't quite complete. This ratio is what's known as an empirical frequency, or an empirical probability.
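Written out, the ratio described in words here might look like the following. The symbols n_E (the number of times E happened) and n (the number of trials) are labels chosen for illustration, not notation taken from the lecture, and the approximation sign is an assumption reflecting that the ratio is only an empirical estimate.

```latex
P(E) \approx \frac{n_E}{n}
  = \frac{\text{number of times } E \text{ happened}}{\text{number of trials}}
```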
It's empirical in the sense that what we have had to do is run some experiment some number of times, and the ratio is just what happened. So for example, take a coin flip: maybe I flip a coin a hundred times, and out come 47 heads. So we give the probability of getting heads as 0.47. Now, we feel intuitively that 0.47 isn't the true probability. There is just some circumstantial reason that heads have come out 47% of the time. And if we went to 1,000 flips, or a million, or some other very large number, we would get closer and closer to the true answer. So suppose that at a million flips we get 490,000 heads, for the sake of keeping the numbers simple; that's 0.49. And you can see that as we increase the number of trials, what we probably get closer and closer to is 0.5.
There is some sense in which reality is conspiring to give us a long-run answer. Every time I flip a coin, there's something about the coin which, over a very large number of trials, gives us this pattern of half heads and half tails. And the idea behind an empirical frequency is to try and get at what that true frequency is. What a frequentist would do is define the probability as the long-run ratio. The notation here is going to be the limit as the number of trials goes to infinity. And that's the complete idea: there is this ratio, the number of events that we're interested in divided by the number of trials, taken as the number of trials goes to infinity.
Now, that gives us a way of avoiding the outcome space issue. We don't have to do any outcome space stuff; we don't have to use our pure reason to come up with a set of outcomes and then do ratios, which is a very mathematical, very pure approach to determining probability.
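The long-run idea can be tried out with a small simulation. This is only a sketch under stated assumptions: the coin is simulated with Python's pseudo-random generator and assumed to be fair, and the function name `empirical_frequency`, the seed, and the trial counts are all chosen here for illustration rather than taken from the lecture.

```python
import random


def empirical_frequency(num_trials, rng):
    """Flip a simulated fair coin num_trials times; return heads / trials."""
    # Assumption: rng.random() < 0.5 stands in for "the coin landed heads".
    heads = sum(1 for _ in range(num_trials) if rng.random() < 0.5)
    return heads / num_trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for n in (100, 1_000, 1_000_000):
        # Each line shows the number of trials and the empirical frequency.
        print(n, empirical_frequency(n, rng))
```

The exact values printed depend on the seed, but the ratio for the larger trial counts should typically sit closer to 0.5 than the ratio for 100 flips. A program can only ever run a finite number of trials, so this approximates the limit rather than computing it.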
That classical approach can't be done with an election. You can't just go, well, configuration A, configuration B; there is no way of defining that. With the frequency, what you can do is just say, let's try it a few times. Keep trying it, and there'll be some underlying physical, geometrical, causal, whatever they may be, aspects of the system that deterministically lead us towards a closer and closer answer as to what the underlying probability is. With a coin, the geometry is that it has two sides. If you flip it in a fair way, then because half of the area of the coin, if you like, is on one side and half is on the other, there's a sense in which the geometry of the coin determines this ratio. And the frequentist is happy for reality, if you like, to determine the ratio through experiment.
Now, here's the problem with frequentism. Here are the limitations. We've avoided one limitation: what we got over from classical probability is having to define an outcome space. There's no outcome space here, there's just a frequency, just a ratio of things we're measuring. So we've dealt with the outcome space being biased and so on. The problem here is that we can't run an experiment an infinite number of times. And there are certain cases, an election for example, where we can only run it once. So if our probability is meant to be this long-run thing, where we repeat our experiments, and repeat them, and repeat them, and eventually converge on the truth, we can't do that with a large class of things which we apply probability to. How probable is it that it will rain tomorrow? Well, tomorrow only happens once. What does that even mean in terms of a trial, an experiment? I can't take the same conditions, identical conditions, which lead to rain, reverse them a little bit, and then replay them to see if it'll be sunny. So what is it I mean by a trial?
How can I run a trial many, many times? Many times, I can't do this. So a key limitation here is that we cannot always run the experiment many times, and we can't use this idea when we can't run the trial lots of times. So what we're going to do now is move to our final interpretation of probability, sometimes known as the Bayesian interpretation, or Bayesian statistics, however you want to call it. We're going to look at that idea, and in my view it is perhaps the most generally useful interpretation. It gives you a meaning, a way of ascribing a probability to something in every case you can imagine. It doesn't seem to me to be as limited as these other approaches. But it's a little subtle, a little counterintuitive maybe. So we have to go on to that and look at it in detail.
QA is the UK's biggest training provider of virtual and online classes in technology, project management and leadership.