Normal Approximations to the Binomial Distribution
In some cases, a binomial distribution can be approximated by a normal distribution. This can be useful as binomial distributions with large n can be difficult to work with.
The approximation is:
\color{red}X\sim B(n,p)\color{grey}\approx\color{blue}Y\sim N(np,np(1-p))
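As a minimal sketch of how the parameters of the approximating normal distribution are built, the Python below uses only the standard library. The helper name `normal_approx` and the values n=100, p=0.5 are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx(n, p):
    """Return Y ~ N(np, np(1-p)), the normal approximation to X ~ B(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    # NormalDist is parameterised by the standard deviation, not the variance,
    # so take the square root here.
    return NormalDist(mu=mean, sigma=sqrt(variance))


# Illustrative values only (assumed for this sketch).
Y = normal_approx(100, 0.5)
print(Y.mean, Y.variance)  # mean 50, variance 25
```

Note that the second parameter in N(np, np(1-p)) is the variance, so any calculation that standardises Y must divide by its square root.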
Continuity Correction
An obvious problem with this approximation is that the binomial distribution is discrete while the normal distribution is continuous. This means that the binomial distribution takes individual values with non-zero probabilities, but the normal distribution only gives non-zero probabilities to ranges of values, i.e.
- For discrete distributions, such as binomial, we can work out \color{red}\mathbb{P}(X=0),\mathbb{P}(X=1) and so on.
- For continuous distributions, such as normal, \color{blue}\mathbb{P}(Y=0)=\mathbb{P}(Y=1)=0, and we can only work out probabilities of ranges of values.
This means that, for our approximation, we need continuity correction, which works like this:
\color{red}\mathbb{P}(X=a)\color{grey}\approx\color{blue}\mathbb{P}(a-0.5<Y<a+0.5)
For example, \color{red}\mathbb{P}(X=1)\color{grey}\approx\color{blue}\mathbb{P}(0.5<Y<1.5)\color{grey},\color{red}\mathbb{P}(X=2)\color{grey}\approx\color{blue}\mathbb{P}(1.5<Y<2.5) and so on.
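To see how close the continuity-corrected value is to the true binomial probability, here is a small sketch. The function names and the values n=100, p=0.5, a=50 are assumptions chosen for illustration, not taken from the examples on this page.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(n, p, a):
    """Exact binomial probability P(X = a) for X ~ B(n, p)."""
    return comb(n, a) * p**a * (1 - p) ** (n - a)


def approx_point_prob(n, p, a):
    """Continuity-corrected normal estimate: P(a - 0.5 < Y < a + 0.5)."""
    Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return Y.cdf(a + 0.5) - Y.cdf(a - 0.5)


# Illustrative values only (assumed for this sketch).
n, p, a = 100, 0.5, 50
exact = binom_pmf(n, p, a)
approx = approx_point_prob(n, p, a)
print(round(exact, 4), round(approx, 4))
```

For these values the two probabilities should agree to about three decimal places.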
This table shows continuity correction in practice:
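The standard corrections for each type of probability, which the worked examples on this page all follow, can be summarised in LaTeX as below. The layout is only a sketch, with a standing for any whole number.

```latex
\begin{array}{c|c}
\text{Binomial } X & \text{Normal } Y \\ \hline
\mathbb{P}(X=a) & \mathbb{P}(a-0.5<Y<a+0.5) \\
\mathbb{P}(X\leq a) & \mathbb{P}(Y<a+0.5) \\
\mathbb{P}(X<a) & \mathbb{P}(Y<a-0.5) \\
\mathbb{P}(X\geq a) & \mathbb{P}(Y>a-0.5) \\
\mathbb{P}(X>a) & \mathbb{P}(Y>a+0.5)
\end{array}
```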
Conditions for the Approximation
The approximation is only valid under certain conditions, so you must check that they hold before you use it. You can use the approximation when:
p\approx 0.5 and n is large
OR
both np>5 and n(1-p)>5
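The second condition is easy to check mechanically, as in this sketch. The function name `can_use_normal_approx` is an assumption, and the check deliberately covers only the np and n(1-p) rule: "n is large" and "p is close to 0.5" are judgement calls with no numerical threshold given here, although in those cases both products are typically well above 5 anyway.

```python
def can_use_normal_approx(n, p):
    """Rule-of-thumb check: both np > 5 and n(1-p) > 5."""
    return n * p > 5 and n * (1 - p) > 5


print(can_use_normal_approx(20, 0.7))  # True: np = 14 and n(1-p) = 6
print(can_use_normal_approx(10, 0.1))  # False: np = 1
```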
Example 1: When n is Large
X\sim B(250,0.55). Find \mathbb{P}(X\leq 130).
[2 marks]
n=250 which is large, and p=0.55 which is close to 0.5, so we can use the approximation.
Y\sim N(250\times 0.55,250\times 0.55\times (1-0.55))
Y\sim N(137.5,61.875)
\mathbb{P}(X\leq 130)\approx\mathbb{P}(Y<130.5) (continuity correction)
\mathbb{P}(X\leq 130)\approx 0.1868
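If you want to check the final step of Example 1 numerically rather than with tables, a sketch like this works with Python's standard library. The variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

n, p = 250, 0.55
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Continuity correction: P(X <= 130) is approximated by P(Y < 130.5).
z = (130.5 - Y.mean) / Y.stdev  # standardised value of 130.5
prob = Y.cdf(130.5)             # same probability as NormalDist().cdf(z)

print(round(z, 4), round(prob, 4))
```

The standardised value z is what you would look up in a standard normal table.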
Example 2: When np and n(1-p) are greater than 5
X\sim B(20,0.7). Find \mathbb{P}(X<15).
[2 marks]
n=20 and p=0.7
20\times 0.7=14>5 and 20\times (1-0.7)=6>5, so we can use the approximation.
Y\sim N(20\times 0.7,20\times 0.7\times (1-0.7))
Y\sim N(14,4.2)
\mathbb{P}(X<15)\approx\mathbb{P}(Y<14.5) (continuity correction)
\mathbb{P}(X<15)\approx 0.5964
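With n as small as 20 it is practical to compare the approximation against the exact binomial sum. This sketch does that for Example 2; the variable names are assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.7

# Normal approximation with continuity correction: P(X < 15) ~ P(Y < 14.5).
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = Y.cdf(14.5)

# Exact binomial probability: P(X < 15) = P(X = 0) + ... + P(X = 14).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(15))

print(round(approx, 4))  # 0.5964
print(round(exact, 4))
```

The exact value comes out at about 0.5836, so with n this small the approximation is only good to about one decimal place even though the conditions are met.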
Normal Approximations to the Binomial Distribution Example Questions
Question 1: For which of these binomial distributions could we use a normal approximation?
i) X\sim B(135,0.5)
ii) X\sim B(6,0.3)
iii) X\sim B(21,0.25)
iv) X\sim B(2000,0.001)
[5 marks]
i) n is large and p=0.5, so we can use the approximation.
ii) n is not large and np=1.8<5, so neither condition is met and we cannot use the approximation.
iii) n is not large, but np=5.25>5 and n(1-p)=15.75>5, so we can use the approximation.
iv) n is large but p is not close to 0.5 so we must check np and n(1-p). np=2000\times 0.001=2<5. Hence, we cannot use the approximation.
Question 2: X\sim B(150,0.4). Find:
i) \mathbb{P}(X<60)
ii) \mathbb{P}(X\leq 66)
iii) \mathbb{P}(40\leq X\leq 75)
[4 marks]
n=150,p=0.4
n is large and p is close to 0.5 so we can use the approximation.
Y\sim N(np,np(1-p))
Y\sim N(150\times 0.4,150\times 0.4\times (1-0.4))
Y\sim N(60,36)
i) \mathbb{P}(X<60)\approx\mathbb{P}(Y<59.5)=0.4668
ii) \mathbb{P}(X\leq 66)\approx\mathbb{P}(Y<66.5)=0.8607
iii) \mathbb{P}(40\leq X\leq 75)\approx\mathbb{P}(39.5<Y<75.5)=0.9948
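The three parts of Question 2 use the same normal distribution with different continuity corrections, which this sketch evaluates. The variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

n, p = 150, 0.4
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))  # N(60, 36), so sd = 6

part_i = Y.cdf(59.5)                  # P(X < 60)         -> P(Y < 59.5)
part_ii = Y.cdf(66.5)                 # P(X <= 66)        -> P(Y < 66.5)
part_iii = Y.cdf(75.5) - Y.cdf(39.5)  # P(40 <= X <= 75)  -> P(39.5 < Y < 75.5)

for label, value in [("i", part_i), ("ii", part_ii), ("iii", part_iii)]:
    print(label, round(value, 4))
```

Answers read from printed tables may differ in the fourth decimal place, because the z-value is usually rounded before the lookup.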
Question 3: Every day, the probability that John buys a chocolate bar is \dfrac{12}{25}. What is the probability that he buys more than 200 chocolate bars in a (non-leap) year?
[3 marks]
We can model this with X\sim B\left(365,\dfrac{12}{25}\right).
n=365 which is large and p=\dfrac{12}{25}=0.48 which is close to 0.5, so we can use the approximation.
Y\sim N(np,np(1-p))
Y\sim N(365\times 0.48,365\times 0.48\times 0.52)
Y\sim N(175.2,91.104)
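\mathbb{P}(X>200)\approx\mathbb{P}(Y>200.5) (continuity correction)
\mathbb{P}(Y>200.5)=\mathbb{P}\left(Z>\dfrac{200.5-175.2}{\sqrt{91.104}}\right)=\mathbb{P}(Z>2.6507)
\mathbb{P}(X>200)\approx 0.0040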
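An upper-tail probability like the one in Question 3 can be checked as in this sketch. The variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

n, p = 365, 12 / 25
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Continuity correction: P(X > 200) is approximated by P(Y > 200.5).
# Standardise by dividing by the standard deviation, not the variance.
z = (200.5 - Y.mean) / Y.stdev
prob = 1 - Y.cdf(200.5)

print(round(z, 4), round(prob, 4))
```

A common slip in this kind of question is to divide by the variance 91.104 instead of its square root, which gives a much larger and incorrect tail probability.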
Question 4: (Harder) Find the largest value of x such that \mathbb{P}(X<x)<0.1, where X\sim B(1000,0.6).
(Hint: you will need to use the standard normal distribution.)
[4 marks]
n=1000 which is large and p=0.6 which is close to 0.5 so we can use the normal approximation.
Y\sim N(np,np(1-p))
Y\sim N(1000\times 0.6,1000\times 0.6\times 0.4)
Y\sim N(600,240)
\mathbb{P}(X<x)<0.1
Apply the continuity correction and solve the boundary case:
\mathbb{P}(Y<x-0.5)=0.1
Convert to standard normal Z:
\mathbb{P}\left(Z<\dfrac{x-0.5-600}{\sqrt{240}}\right)=0.1
Use percentage points table:
\dfrac{x-0.5-600}{\sqrt{240}}=-1.2816
x-0.5-600=-1.2816\sqrt{240}
x-600.5=-1.2816\sqrt{240}
x=600.5-1.2816\sqrt{240}
x\approx 580.65
x must be below this boundary, so the largest whole number x such that \mathbb{P}(X<x)<0.1 is 580.
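Working backwards from a probability, as in Question 4, is an inverse normal calculation. This sketch uses `NormalDist.inv_cdf` from Python's standard library in place of the percentage points table; the variable names are assumptions.

```python
from math import floor, sqrt
from statistics import NormalDist

n, p = 1000, 0.6
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Boundary case P(Y < x - 0.5) = 0.1, so x - 0.5 is the 10th percentile of Y.
x_boundary = Y.inv_cdf(0.1) + 0.5

# P(X < x) < 0.1 needs x below the boundary. The boundary is not a whole
# number here, so rounding down gives the largest valid x.
largest_x = floor(x_boundary)

print(round(x_boundary, 2), largest_x)  # 580.65 580
```

This answer is the one given by the normal approximation; it is not checked against the exact binomial distribution here.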