Hypothesis Testing

A Level: AQA, Edexcel, OCR

Hypothesis Testing Revision

A hypothesis test uses sample data to assess a hypothesis (a belief about the distribution of a random variable) and to decide whether there is enough evidence to reject it. Hypothesis testing comes with a substantial amount of terminology.

Terminology for Hypothesis Testing

  • Hypothesis tests are based on two hypotheses. The null hypothesis, H_{0}, is a statement about the value of a population parameter (a parameter of the distribution of a random variable) which our data will tell us whether or not to reject. The alternate hypothesis, H_{1}, is what we believe the parameter is if we reject the null hypothesis.

 

  • In general, the null hypothesis is something that we want to show is false. This is because a hypothesis test cannot show that a hypothesis is true; it can only provide evidence that one is false. We can still learn what we want to know by finding evidence against the null hypothesis. To this end, the null hypothesis is usually that a parameter takes a specific value, while the alternate hypothesis is usually that the parameter differs from that value (is greater than, less than, or not equal to it), rather than specifying another value.

 

  • A hypothesis test is the means by which we generate a test statistic that directs us to either reject or not reject the null hypothesis.

 

  • The test statistic is a “summary” of the collected data, and should have a sampling distribution specified by the null hypothesis.

One or Two Tailed Tests

Hypothesis tests can be one tailed or two tailed. This depends on H_{1}.

  • In a one tailed test, H_{1} takes the form p>x or p<x, where H_{0} is that p=x.
  • In a two tailed test, H_{1} takes the form p\neq x, where H_{0} is that p=x.

Significant Data

We reject the null hypothesis when the data we observe is unlikely to have occurred if it were true.

Specifically, we state a significance level \alpha before we perform the hypothesis test. Assuming H_{0} is true, if the probability of obtaining data at least as extreme as the data we observed is less than \alpha in a one tailed test, or less than \dfrac{\alpha}{2} in a two tailed test, then we reject H_{0}.

The way we test the probability of getting the data is by looking at the sampling distribution of the test statistic, which is set by the null hypothesis.

You will usually be told what significance level to use. Common significance levels include 5\% (\alpha=0.05) and 1\% (\alpha=0.01).

Critical Region

The critical region is the set of values of the test statistic that would cause H_{0} to be rejected. The first value inside the critical region is called the critical value. If the test statistic is at least as extreme as the critical value, then we reject H_{0}.

A one tailed test has a single critical region, containing the highest or lowest values. A two tailed test has two critical regions, one containing high values and one containing low values.

You can test whether your data is significant by finding the critical region and seeing if the test statistic falls within it.
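As a rough illustration of finding a critical region, the sketch below searches for the critical value of an upper tailed binomial test. The setup (X\sim B(10,p) with H_{0}: p=0.4, H_{1}: p>0.4 and a 5\% significance level) and the helper names `upper_tail` and `upper_critical_value` are assumptions chosen for this example only.

```python
from math import comb

def upper_tail(n, p, k):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def upper_critical_value(n, p, alpha):
    """Smallest k with P(X >= k) < alpha, or None if no such k exists."""
    for k in range(n + 1):
        if upper_tail(n, p, k) < alpha:
            return k
    return None

# Illustrative values (assumed): n = 10, p = 0.4 under H0, alpha = 0.05.
critical_value = upper_critical_value(10, 0.4, 0.05)
print(critical_value)  # 8, so the critical region is X >= 8
```

Under these assumptions, an observed value of 8 or more would fall in the critical region, while 7 or fewer would not.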

Actual Significance Level and p-value

The p-value is the probability, assuming H_{0} is true, of obtaining a result at least as extreme as the one we observed. If the p-value is less than \alpha (or \dfrac{\alpha}{2} for a two tailed test), we reject H_{0}; otherwise, we do not reject H_{0}.

 

The actual significance level is the probability of the test statistic falling in the critical region if H_{0} is true. For continuous data, this is the same as the significance level. However, for discrete data it may be smaller, because only certain tail probabilities can actually occur.

Example 1: Binomial Hypothesis Test

X\sim B(10,p) and we observe x=6

Test, at the 5\% significance level, if p is larger than 0.4

[4 marks]

H_{0}: p=0.4

H_{1}: p>0.4

\alpha=0.05

\mathbb{P}(X\geq 6)=0.1662>0.05

So, if H_{0} is true, the probability of a result at least as extreme as the one observed is greater than the significance level.

Hence, do not reject H_{0}. There is not significant evidence to suggest p>0.4.

(Note: More on binomial hypothesis testing can be found in the section Binomial Distribution Hypothesis Tests)
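The tail probability in Example 1 can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative choices.

```python
from math import comb

n, p0, observed, alpha = 10, 0.4, 6, 0.05

# P(X >= 6) under H0: sum the binomial probabilities from 6 up to n.
p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
              for k in range(observed, n + 1))

# One tailed test, so compare directly with alpha.
reject_h0 = p_value < alpha
print(round(p_value, 4), reject_h0)  # 0.1662 False
```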

Example 2: Normal Hypothesis Test

The delay to the train Nicola takes to work is normally distributed. Observations over a number of years show this delay has a mean of five minutes and a standard deviation of two minutes. Nicola believes the mean delay has changed. If she is delayed by ten minutes on Friday, is there evidence at the 5\% significance level that she is right?

[6 marks]

X\sim N(\mu,2^{2})

H_{0}: \mu=5

H_{1}: \mu\neq 5

Significance level \alpha=0.05

Two tail test so we are looking for a probability less than \dfrac{0.05}{2}=0.025

Observed data: x=10

\mathbb{P}(X\geq 10)=0.0062<0.025

Reject H_{0}. There is sufficient evidence to suggest that the average delay has changed.

(Note: More on normal hypothesis testing can be found in the section Normal Distribution Hypothesis Tests)
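The calculation in Example 2 can be sketched with `statistics.NormalDist` from the Python standard library, which takes the standard deviation (here 2), not the variance. The variable names are illustrative.

```python
from statistics import NormalDist

# Distribution of the delay under H0: mean 5, standard deviation 2.
delay = NormalDist(mu=5, sigma=2)
alpha = 0.05
observed = 10

# Probability of a delay of at least 10 minutes if H0 is true.
upper_tail = 1 - delay.cdf(observed)

# Two tailed test, so compare the tail probability with alpha / 2.
reject_h0 = upper_tail < alpha / 2
print(round(upper_tail, 4), reject_h0)  # 0.0062 True
```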

Hypothesis Testing Example Questions

Question 1: Jane has a die that she believes is biased towards rolling 6. Give hypotheses that could be used to test this.

[2 marks]

H_{0}:\mathbb{P}(X=6)=\dfrac{1}{6}

 

H_{1}:\mathbb{P}(X=6)>\dfrac{1}{6}

Question 2: Is the following a one tail or two tail hypothesis test?

H_{0}: p=0.5

H_{1}: p\neq 0.5

[1 mark]

The alternate hypothesis is p\neq 0.5 so this is a two tail test.

Question 3: The distance from the bullseye at which Phil's darts land is normally distributed. Phil believes his throws have a mean distance from the bullseye of 1 cm, with a standard deviation of 0.4 cm. He throws a dart and it lands 1.5 cm from the centre. Test, at the 5\% significance level, if he is as good as he says.

[6 marks]

X\sim N(\mu,0.4^{2})

H_{0}: \mu=1

H_{1}: \mu>1

Significance level \alpha=5\%

Observed data: x=1.5

\mathbb{P}(X\geq 1.5)=0.1056>0.05

Do not reject H_{0}. Insufficient evidence to suggest Phil is not as good at darts as he says.

Question 4: Consider the hypothesis test on X\sim B(20,p).

H_{0}: p=0.75

H_{1}: p<0.75

Take the significance level to be 0.05

i) What is the critical region for this test?

ii) What is the actual significance level of this test?

[4 marks]

i) The critical region is the region for x for which we reject H_{0}.

Critical region is such that \mathbb{P}(X\leq x)<0.05

In this case, \mathbb{P}(X\leq 12)=0.1018>0.05 while \mathbb{P}(X\leq 11)=0.0409<0.05

So the critical region must be the values 11 and under.

 

ii) The actual significance level is the probability of landing in the critical region if H_{0} is true, so it is \mathbb{P}(X\leq 11)=0.0409
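Both parts of Question 4 can be checked with a short search over the lower tail. This is a sketch under the question's stated values; the helper name `lower_tail` is a hypothetical choice for this example.

```python
from math import comb

def lower_tail(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p0, alpha = 20, 0.75, 0.05

# Largest k whose lower tail probability is still below alpha.
critical_value = max(k for k in range(n + 1) if lower_tail(n, p0, k) < alpha)

# The actual significance level is the probability of the critical region.
actual_significance = lower_tail(n, p0, critical_value)
print(critical_value, round(actual_significance, 4))  # 11 0.0409
```

Because X is discrete, the actual significance level (about 0.0409) sits below the nominal 0.05: extending the region to include 12 would push the probability above 0.05.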

Question 5: Marsha notices that her neighbourhood seems to contain far more blue cars than would be normal. She finds a statistic online that says nationally, around 4\% of cars are blue. She then observes 50 cars near her house and 5 of them are blue.

Construct a hypothesis test for this at the 5\% level and find whether or not Marsha is right that her neighbourhood contains more blue cars.

[6 marks]

X\sim B(50,p), where X is the number of blue cars among the 50 observed

H_{0}: p=0.04

H_{1}: p>0.04

Significance level \alpha=5\%

Observed data: x=5

\mathbb{P}(X\geq 5)=0.0490<0.05

Reject H_{0}. There is sufficient evidence to suggest that there are more blue cars in Marsha’s neighbourhood.
