## Hypotheses and Errors

Let’s start by identifying what we mean by the term **hypothesis**:

- A proposed explanation for a phenomenon.
- A proposition, claim, idea.
- A statement of what the researcher(s) predict will be the outcome of a study.

A scientific hypothesis must be testable; it has nothing to do with beliefs or morals.

In science, a hypothesis is a prediction or explanation that is tested by an experiment. Experiments may disprove a scientific hypothesis but can never entirely prove one.

Let’s check these examples:

- Sales volumes depend on product prices.
- Exam results depend on the number of study hours.
- A new drug leads to different (better) medical measures of the trial group compared to the control group.

What would your own hypothesis example be that can be added to the list?

Our hypotheses relate to causal effects (X influences Y) or differences (A is different to B). They are **alternative hypotheses** (𝑯_𝑨 or 𝑯_𝟏).

Every alternative hypothesis has a corresponding **null hypothesis** (𝑯_𝟎), which denies any causal effect or difference and attributes anything observed in the data to random sampling error. For example:

- Sales volumes don’t depend on product prices.
- Exam results don’t depend on the number of study hours.
- The trial and control group in a trial of a new drug don’t show any difference.

… and if a dependency or a difference is found in the sample, it is purely a coincidence: a result of random sampling error.
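To see what random sampling error looks like in practice, here is a minimal Python sketch. It draws two groups from exactly the same distribution, so the null hypothesis is true by construction, and prints the difference between their sample means. The function name, the group size and the distribution parameters (mean 100, standard deviation 15) are illustrative assumptions, not part of the course material.

```python
import random
import statistics


def mean_difference_under_null(n_per_group, seed):
    """Draw two groups from the SAME distribution and return the
    difference between their sample means."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    # Both groups share the same true mean (100) and spread (15),
    # so any difference between them is random sampling error.
    group_a = [rng.gauss(100, 15) for _ in range(n_per_group)]
    group_b = [rng.gauss(100, 15) for _ in range(n_per_group)]
    return statistics.mean(group_a) - statistics.mean(group_b)


# Repeat the "study" five times with different random samples.
differences = [mean_difference_under_null(20, seed) for seed in range(5)]
for seed, diff in enumerate(differences):
    print(f"sample {seed}: difference in means = {diff:+.2f}")
```

Even though there is no real difference between the groups, the sample means are rarely identical. A statistical test has to decide whether an observed difference is larger than this kind of noise.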

*| Image: alternative and null hypothesis |*

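The alternative and null hypotheses are mutually exclusive, so rejecting the null hypothesis means accepting the alternative. Failing to reject the null hypothesis, however, does not prove it is true; it only means the evidence against it was not strong enough.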

The default position in a statistical test is that the null hypothesis is correct. The aim is to reject it.

To see how this works, let’s suppose we take part in a trial in a court of law.

Let **H0** be the hypothesis that the defendant is innocent and **HA** the hypothesis that the defendant is guilty.

The trial starts with the presumption of innocence.

Then, the prosecutor presents the evidence. The prosecutor has to convince the jury beyond reasonable doubt that the defendant is not innocent.

How likely is it that this evidence could have arisen by chance if the null hypothesis were true?

How unlikely is unlikely?

In reality, the defendant is either innocent or guilty. What if the jury has not ruled correctly?

If the defendant is innocent but the jury rules they are guilty, we have:

- A miscarriage of justice.
- A Type 1 error (false positive).

If the defendant is guilty but the jury rules they are innocent, we have:

- A guilty person who goes unpunished.
- A Type 2 error (false negative).

In reality, H0 is either true or false. Our conclusion, to reject H0 or not, either matches reality or it does not.

If H0 is true and we do not reject it, our conclusion is correct.

If H0 is false and we reject it, our conclusion is correct.

If H0 is true and we reject it, this is a Type 1 error, also known as a “**false positive**”. This is the error of accepting the alternative hypothesis when the results can be attributed to chance. We are observing a relationship or difference when actually there is none.

If H0 is false and we do not reject it, this is a Type 2 error, also known as a “**false negative**”. We are failing to observe a relationship or difference when in reality there is one.
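The four combinations of reality and conclusion can be written down as a small lookup. The sketch below is only an illustration; the function name `classify_outcome` and its two boolean parameters are hypothetical names chosen for this example.

```python
def classify_outcome(h0_is_true, h0_rejected):
    """Name the outcome for one combination of reality and conclusion."""
    if h0_is_true and h0_rejected:
        # We claim an effect that does not exist.
        return "Type 1 error (false positive)"
    if not h0_is_true and not h0_rejected:
        # We miss an effect that does exist.
        return "Type 2 error (false negative)"
    # In the two remaining cases the conclusion matches reality.
    return "correct conclusion"


# Print all four combinations.
for h0_is_true in (True, False):
    for h0_rejected in (True, False):
        outcome = classify_outcome(h0_is_true, h0_rejected)
        print(f"H0 true: {h0_is_true!s:5}  H0 rejected: {h0_rejected!s:5}  ->  {outcome}")
```

In the courtroom example, `h0_is_true=True, h0_rejected=True` corresponds to an innocent defendant being found guilty.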

*| Image: Error types |*

**Statistical significance**

We work with samples and there is always the possibility that the results we have obtained have occurred by chance.

Assuming that the null hypothesis is true, we can calculate the probability of obtaining results at least as extreme as the ones we observed purely through random sampling error.

This probability is called the **p-value**. If it is small, the results are statistically significant.

If the p-value is small enough, we can reject the null hypothesis. We decide in advance what counts as small enough; this threshold is called the significance level. Usually, it is set to **0.05**.
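One way to make the p-value concrete is a permutation test, sketched below in Python. This is just one of several possible methods and is not necessarily the one used later in the course. If the null hypothesis is true, the group labels are interchangeable, so we shuffle them many times and count how often the shuffled difference in means is at least as large as the observed one. The measurements, the function name and the number of permutations are made-up assumptions for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means
    by repeatedly shuffling the group labels."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign observations to groups at random
        shuffled_a, shuffled_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(shuffled_a) / len(shuffled_a)
                   - sum(shuffled_b) / len(shuffled_b))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical measurements for a trial group and a control group.
trial = [12.1, 11.8, 13.0, 12.6, 12.9, 13.4, 12.2, 12.8]
control = [10.2, 10.9, 11.1, 10.5, 11.4, 10.8, 11.0, 10.6]

ALPHA = 0.05  # the conventional threshold mentioned in the text
p_value = permutation_p_value(trial, control)
reject_null = p_value < ALPHA
print(f"p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Because every value in the hypothetical trial group is higher than every value in the control group, almost no random relabelling produces a difference as large as the observed one, so the estimated p-value falls well below 0.05.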

**p-value < 0.05**

The results are statistically significant.

If the null hypothesis were true, there would be less than a 5% probability of obtaining results at least this extreme by chance. Therefore, we reject the null hypothesis and accept the alternative hypothesis.

**p-value ≥ 0.05**

The results are not statistically significant.

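They do not provide enough evidence against the null hypothesis. Therefore, we fail to reject the null hypothesis. This does not prove that the null hypothesis is true; it only means that the results could plausibly be due to random sampling error.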
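The 0.05 threshold is directly connected to the Type 1 error described earlier. The Python sketch below simulates many experiments in which the null hypothesis is true by construction and counts how often a test still rejects it. It uses a simple two-sided z-test with a known standard deviation, chosen here only because it needs nothing beyond the standard library; the function names, sample size and number of experiments are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist, mean

ALPHA = 0.05  # the conventional threshold from the text


def z_test_p_value(sample, null_mean, sigma):
    """Two-sided p-value for the sample mean, assuming the
    population standard deviation (sigma) is known."""
    z = (mean(sample) - null_mean) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_experiments=2000, sample_size=30, seed=1):
    """Share of experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # The true mean really is 0, so H0 (mean = 0) holds.
        sample = [rng.gauss(0, 1) for _ in range(sample_size)]
        if z_test_p_value(sample, null_mean=0, sigma=1) < ALPHA:
            rejections += 1  # a Type 1 error
    return rejections / n_experiments


rate = false_positive_rate()
print(f"H0 rejected in {rate:.1%} of experiments where H0 was true")
```

The rejection rate should land close to 5%. In other words, choosing 0.05 as the threshold means accepting that roughly one in twenty tests of a true null hypothesis will produce a false positive.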

Next, we will learn about data distributions, most importantly, the Normal distribution.
