
# Probability distribution


###### Practical Machine Learning

## The course is part of this learning path

**Difficulty** Beginner

**Duration** 1h 3m

**Students** 3

### Description

This course explores the topic of probability and statistics, including various mathematical approaches and some different interpretations of probability. The course starts off with an introduction to probability, before moving on to cover the topics of Bayesian probability, Frequentist probability, statistics, probability distribution and normal distribution.

### Transcript

- At the end of the last session, we began talking about the normal distribution, right? It's one example of a probability distribution, or statistical distribution, which describes how probability is spread across the possible outcomes of a random event. There are two kinds of distribution we can have: probability mass functions and probability density functions. Now, the normal distribution is in fact a density function, but let's have a look at the mass function first.

  The idea behind a probability mass function is to think about probability as a sort of mass. That is to say, we're making an analogy with a physical weight, right? When we describe a probability distribution, how probability lies around the potential outcomes we're looking at, we describe where the mass is, where the probability is. This is also known as a PMF. The most famous, most common example of a probability mass function is the Bernoulli distribution, which is spelled B-E-R-N-O-U-L-L-I. Very simple. It's a discrete distribution, a mass distribution. So there's a fixed number of options, with some of the probability in one place and some in another. Here, we've got two options: heads or tails, success or failure, yes or no, whatever binary question we're asking. Let's go for success and failure. Maybe this is a failure of a server, of a fire system, of some safety management system. So the probability of failure will hopefully be very low. Let's say that's 5%. That's where the mass is. And all the rest of the mass, the 95%, is in success. That's a Bernoulli distribution: a distribution parametrized by one number, often called p, which is the probability of one of the outcomes. Let's say the probability of success, 95%, 0.95.
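The success/failure example can be sketched in a few lines of Python. This is only an illustration: the function name `bernoulli_pmf` and the encoding of success as `1` and failure as `0` are assumptions made here, not something from the session.

```python
def bernoulli_pmf(outcome: int, p: float) -> float:
    """Probability mass at `outcome` (1 = success, 0 = failure).

    `p` is the single parameter of the distribution: the probability
    of success. The 1/0 encoding is an assumption for this sketch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if outcome == 1:
        return p
    if outcome == 0:
        return 1.0 - p
    return 0.0  # no mass sits on any other outcome


p_success = 0.95  # the value used in the example above

mass_success = bernoulli_pmf(1, p_success)
mass_failure = bernoulli_pmf(0, p_success)
total_mass = mass_success + mass_failure

print("P(success) =", mass_success)
print("P(failure) =", mass_failure)
print("total mass =", total_mass)
```

With a mass function, the value returned for each outcome is itself a probability, and the masses over all outcomes add up to one.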
And that's the only number characterizing the full layout of this distribution. Now, this is a mass function because the mass is discrete: there's a blob here and a blob there, the probability of success and the probability of failure. And we read the vertical axis here as the probability of a particular outcome.

  Now, let's talk about a probability density function. The idea behind a density function is that the mass is spread out across a continuous range of outcomes. So let me just show you. For example, with height, we've got a number going from, let's say, zero to two meters, and it isn't as if there's a discrete number of options, like 1.8 or 1.9. There's 1.8001, 1.8002, 1.8003. So at any particular height, there's a particular density of probability. We'll draw maybe a bimodal one, that is, a density with two modes, two most common points. Let's go for, oh, I don't know, the heights of women, say on average maybe 1.65, and the heights of men, say on average 1.8. Probably a little lower in both cases, but whatever. What we would see here is just two modes like that. Of course, there's roughly the same number of men as women in the world, so we should expect these two peaks to be roughly the same height.

  Now, how do we read this? Well, let's compare it to the mass one. In the mass case, the height of the line is the probability. In this case, the height of the line is the density. What that means is that in order to compute a probability, we require a range in the variable of interest to produce an area, and that area is the mass. So in the density case, probabilities are areas. They are not heights. So here, from 1.65 to, let's say, 1.70: this area here is the probability. Now let's explain why we would have a density and why we'd have a mass. What causes us to need a density rather than a mass?
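To see "probabilities are areas, not heights" in numbers, here is a minimal sketch of a bimodal height density. The session only gives the two averages (1.65 and 1.8); the choice of two normal-shaped bumps, the equal weighting, and the spread of 0.05 m are all assumptions made purely for illustration.

```python
import math

# Hypothetical parameters: only the two averages come from the session.
# The normal shape and the spread (in meters) are assumptions.
MEAN_WOMEN, MEAN_MEN, SPREAD = 1.65, 1.80, 0.05


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def height_density(x: float) -> float:
    """Bimodal density: an equal-weight mix of the two groups."""
    return 0.5 * normal_pdf(x, MEAN_WOMEN, SPREAD) + 0.5 * normal_pdf(x, MEAN_MEN, SPREAD)


def area_under_density(low: float, high: float, steps: int = 10_000) -> float:
    """Approximate the area under the density with a midpoint Riemann sum."""
    width = (high - low) / steps
    return sum(height_density(low + (i + 0.5) * width) for i in range(steps)) * width


density_at_mode = height_density(1.65)            # a height of the curve, not a probability
prob_165_to_170 = area_under_density(1.65, 1.70)  # an area, so a probability
total_area = area_under_density(0.0, 3.0)         # all of the mass

print("density at 1.65      =", density_at_mode)
print("P(1.65 <= h <= 1.70) =", prob_165_to_170)
print("total area           =", total_area)
```

Under these assumed parameters the density at a mode comes out above 1, which would be impossible for a probability. That is fine, because only the area over a range is a probability, and the total area is still one.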
What causes us to need area rather than just height? Well, the short answer is whether or not you are measuring something continuous. Do you have a yes or no? Or do you have 1.1, 1.2, 1.3? But why should a continuous outcome screw things up for us? Here's the answer, and it's a very peculiar sort of answer. If I'm looking at, say, someone's height or their age, whatever it may be, then the height could be anything from, let's say, 1.8 to 1.80000001, or 1.800002. And here's a question: how many people have a height of exactly 1.800000001? The answer is no people have that height. For every highly specific point in the range we could be considering, no people have exactly that height. So if we take the number of people with that very specific height as a probability measure, the probability will be zero. And that's true: for a continuous variable, the probability of any single exact value is zero.

  So when we're considering probabilities of continuous variables, we have to take a range. What we say is the probability of your height being, say, between 1.799 and 1.801. This range gives you the tolerance, or width, of your measurement. And then the area under the curve across that width is the probability that you fall within that range. The probability here, let's say, would be... well, I don't know, maybe 5% of people are that height. It seems unlikely, it seems less than that. But whatever, 5% is fine. And that's that. So that's probability mass functions and probability density functions. Let's now go on to talk about the most famous probability density function of all, the normal distribution.
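The point about exact values having zero probability can be checked with a small sketch that shrinks the range around 1.8. The single normal curve with mean 1.75 and standard deviation 0.10 is a hypothetical stand-in for a height distribution, chosen only so the numbers can be computed; it is not taken from the session.

```python
import math

# Hypothetical height distribution (meters); both values are assumptions.
MEAN, SD = 1.75, 0.10


def normal_cdf(x: float) -> float:
    """P(height <= x) under the assumed normal distribution."""
    return 0.5 * (1.0 + math.erf((x - MEAN) / (SD * math.sqrt(2.0))))


def interval_probability(low: float, high: float) -> float:
    """Area under the density between low and high."""
    return normal_cdf(high) - normal_cdf(low)


# A single exact value is a range of zero width, so it has zero area.
point_probability = interval_probability(1.8, 1.8)

# The tolerance used in the session: between 1.799 and 1.801.
tolerance_probability = interval_probability(1.799, 1.801)

# Shrinking the width around 1.8 shrinks the probability toward zero.
widths = [0.1, 0.01, 0.001, 0.0001]
shrinking = [interval_probability(1.8 - w / 2, 1.8 + w / 2) for w in widths]

print("P(h == 1.8 exactly)   =", point_probability)
print("P(1.799 <= h <= 1.801) =", tolerance_probability)
for w, p in zip(widths, shrinking):
    print(f"width {w}: probability {p}")
```

Under these assumed parameters the 1.799 to 1.801 range comes out well below 5%, in line with the remark that 5% seems too high for such a narrow range.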


QA is the UK's biggest training provider of virtual and online classes in technology, project management and leadership.