# Selecting a Distribution

## Selecting a Distribution Revision

In practice, a question might not tell you which **distribution** to use. Instead, you might be given some **data** and asked which **distribution** fits it best. To handle this, we **establish the conditions** for each **distribution**; when faced with a question, you then pick the **distribution** whose conditions are satisfied.

**Conditions for a Binomial Distribution**

- The data is **discrete**.
- There is a **fixed number of trials** (n) which end in **only two outcomes**: success or failure.
- The success probability (p) is **constant**, and **every trial is independent** of each other.

If these conditions are met, use a **binomial distribution**.

**Conditions for a Normal Distribution**

- The data is **continuous**.
- The data is **symmetrically distributed** (or at least close to symmetrically distributed) with a **peak at the mean**.
- The data **tails off** as you move away from the mean. Specifically, almost all of the data falls **within three standard deviations** of the mean.

If these conditions are met, use a **normal distribution**.
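The "within three standard deviations" condition can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of a normal distribution lying within three
# standard deviations of the mean: P(-3 < Z < 3).
within_three_sd = standard_normal.cdf(3) - standard_normal.cdf(-3)

print(f"{within_three_sd:.4f}")  # 0.9973
```

Roughly 99.7% of normally distributed data lies in this range, which is why data with almost nothing recorded beyond three standard deviations is a reasonable candidate for a normal model.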

**Example 1: In Practice**

The price of a first class plane ticket from London to Paris varies according to demand. The price is **symmetric about the mean** of £500, it has a standard deviation of £33, and a price lower than £400 or higher than £600 is yet to be recorded. Suggest what **distribution** to use then use it to calculate the probability of being able to fly first class to Paris for under £450.

**[3 marks]**

The data is continuous and symmetric about its mean. Three standard deviations is £99, so three standard deviations either side of the mean gives the range £401 to £599. No price lower than £400 or higher than £600 has been recorded, so virtually all of the data lies within this range. Hence, we can use a normal distribution.

X\sim N(500,33^{2})

\mathbb{P}(X<450)=0.0649
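If a calculator with distribution functions is not to hand, the same probability can be reproduced in code. This is a sketch only, assuming Python 3.8+ and the standard-library `statistics.NormalDist`, which takes the standard deviation (33) rather than the variance.

```python
from statistics import NormalDist

# Model from Example 1: mean 500, standard deviation 33.
price = NormalDist(mu=500, sigma=33)

# P(X < 450) is the cumulative probability up to 450.
p_under_450 = price.cdf(450)

print(round(p_under_450, 4))  # 0.0649
```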

**Example 2: In Practice**

St. David’s Sixth Form College have a pass rate of 92%. They have 30 students taking exams this year, and are hoping at least 28 of them pass. What is the probability of this happening?

**[3 marks]**

The data is discrete, there is a fixed number of trials (30 students), each student either passes or fails, the pass probability is the same (0.92) for every student, and we are not told anything that suggests the students' results are not independent. Hence, we can use a binomial distribution.

X\sim B(30,0.92)

\mathbb{P}(X\geq 28)=0.5654
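This value can be reproduced by summing the binomial probabilities for 28, 29 and 30 passes. The sketch below assumes Python 3.8+ (for `math.comb`); `binomial_pmf` is a helper written for this illustration, not a library function.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): choose k successes out of n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 30, 0.92  # 30 students, pass probability 0.92

# P(X >= 28) = P(X = 28) + P(X = 29) + P(X = 30)
p_at_least_28 = sum(binomial_pmf(k, n, p) for k in range(28, n + 1))

print(round(p_at_least_28, 4))  # 0.5654
```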

## Selecting a Distribution Example Questions

**Question 1:** Which distribution would be appropriate for modelling the following random variables?

a) The speed of cars driving past York Minster.

b) The number of times 19 comes up on a roulette wheel in 100 spins.

c) The number of spades in 10 cards selected at random without replacement from a full deck.

**[3 marks]**

a) Normal

Speed is continuous, and most drivers are likely to travel at speeds close to the average, with fewer and fewer at speeds further from it. This fits the conditions for a normal distribution.

b) Binomial

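The data is discrete, there is a fixed number of trials (100 spins), each spin either lands on 19 or does not, the probability of landing on 19 is the same on every spin, and each spin is independent.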

c) Neither

The data is discrete so we can rule out normal. But because the cards are chosen without replacement, the probability of selecting a spade changes each time, so we can rule out binomial too.
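To see why the success probability is not constant here, compare the chance of a spade on the first and second cards. This is a small sketch assuming a standard 52-card deck with 13 spades; the variable names are illustrative.

```python
from fractions import Fraction

SPADES, DECK = 13, 52  # standard deck, assumed for illustration

# First card: 13 spades among 52 cards.
p_first = Fraction(SPADES, DECK)

# Second card, given the first card WAS a spade: 12 spades left in 51 cards.
p_second_after_spade = Fraction(SPADES - 1, DECK - 1)

# Second card, given the first card was NOT a spade: 13 spades left in 51 cards.
p_second_after_other = Fraction(SPADES, DECK - 1)

print(p_first, p_second_after_spade, p_second_after_other)  # 1/4 4/17 13/51
```

The probability for the second card depends on what the first card was, so the trials are neither independent nor of constant probability, and the binomial conditions fail.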

**Question 2:** Jack monitors the time it takes for his laptop to boot up each morning. He finds a mean of 20 seconds and a standard deviation of 2 seconds, with times less than 14 or greater than 26 seconds very rare. Suggest an appropriate distribution, then find the probability that Jack’s computer will take 16 seconds or less to boot up tomorrow.

**[2 marks]**

The data is continuous and tails off from the mean, with almost all of it within three standard deviations of the mean (20 ± 6 seconds, i.e. 14 to 26 seconds). Hence, we should use a normal distribution.

X\sim N(20,2^{2})

\mathbb{P}(X\leq 16)=0.0228
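One way to check this answer is to standardise first: 16 seconds is two standard deviations below the mean, so the probability equals that of a standard normal variable being at most -2. The sketch below assumes Python 3.8+ and the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 20, 2  # boot time in seconds: mean 20, standard deviation 2

# Standardise: how many standard deviations is 16 from the mean?
z = (16 - mean) / sd  # -2.0

# P(X <= 16) = P(Z <= -2) for the standard normal Z ~ N(0, 1).
p_boot_16_or_less = NormalDist().cdf(z)

print(z, round(p_boot_16_or_less, 4))  # -2.0 0.0228
```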

**Question 3:** When potatoes are harvested, they have to reach a certain standard to be selected for sale. Molly has found that, in previous years, a potato has a probability of 0.7 of being good enough for sale, and that each potato is independent. This year she has harvested 50 potatoes. What is the probability that she can sell at least 40 of them?

**[2 marks]**

The data is discrete; there is a fixed number of trials (50 potatoes); there are two outcomes, with a fixed success probability; and each trial is independent. Therefore we should use a binomial distribution.

X\sim B(50,0.7)

\mathbb{P}(X\geq 40)=0.0789
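Cumulative binomial tables and calculators usually give P(X ≤ x), so "at least 40" is often found as a complement: P(X ≥ 40) = 1 − P(X ≤ 39). A minimal sketch of that approach, assuming Python 3.8+ (for `math.comb`) and illustrative variable names:

```python
from math import comb

n, p = 50, 0.7  # 50 potatoes, each good enough for sale with probability 0.7

# Cumulative probability P(X <= 39): sum P(X = k) for k = 0, 1, ..., 39.
p_at_most_39 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, 40))

# Complement rule: P(X >= 40) = 1 - P(X <= 39).
p_at_least_40 = 1 - p_at_most_39

print(round(p_at_least_40, 4))  # 0.0789
```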