# Cumulative Frequency and Boxplots

## Cumulative Frequency and Boxplots Revision

A **cumulative frequency** graph can be used to estimate the interquartile range.

A **boxplot** is a graph that shows the median, quartiles, highest/lowest values and outliers.

**Cumulative Frequency Graphs**

Suppose we have a table showing **classes** and **frequencies**, like the ones used for histograms. To find the **cumulative frequency**, add a column containing the running total of the frequencies. To create a **cumulative frequency graph**, plot each cumulative frequency (y value) against the top of its class (x value), then join the points with straight lines.
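The running total described above can be sketched in a few lines of Python. The class boundaries and frequencies here are hypothetical, chosen only for illustration, and the variable names are not from any particular library.

```python
from itertools import accumulate

# Hypothetical grouped data: (bottom of class, top of class, frequency).
classes = [(0, 10, 4), (10, 20, 8), (20, 30, 6), (30, 40, 2)]

tops = [top for _, top, _ in classes]
frequencies = [freq for _, _, freq in classes]

# The cumulative frequency column is the running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Points to plot: (top of class, cumulative frequency).
points = list(zip(tops, cumulative))
print(points)
```

The last cumulative frequency always equals the total frequency, which is a quick check that the column has been added up correctly.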

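To estimate the **interquartile range** from such a graph, draw horizontal lines from the y axis at $\dfrac{n}{4}$ and $\dfrac{3n}{4}$, where $n$ is the total frequency, and read off the corresponding x values. These are the first and third quartiles, and the interquartile range is the difference between them.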
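Because the points are joined with straight lines, reading a value off the graph amounts to linear interpolation between two plotted points. The sketch below illustrates this with hypothetical data. It assumes the graph starts at the bottom of the first class with a cumulative frequency of 0, and `read_off` is an illustrative helper name.

```python
def read_off(points, target):
    """Return the x value where the straight-line graph reaches `target`.

    `points` is a list of (x, cumulative frequency) pairs in increasing
    order. Assumption: the first pair is the bottom of the first class
    with a cumulative frequency of 0.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 > y0 and y0 <= target <= y1:
            # Linear interpolation along the segment (x0, y0) to (x1, y1).
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target lies outside the range of the graph")


# Hypothetical graph points, for illustration only.
points = [(0, 0), (10, 4), (20, 12), (30, 18), (40, 20)]
n = points[-1][1]  # total frequency is the last cumulative frequency

q1 = read_off(points, n / 4)      # line drawn at n/4
q3 = read_off(points, 3 * n / 4)  # line drawn at 3n/4
iqr = q3 - q1
print(q1, q3, iqr)
```

Values read by eye from a hand-drawn graph will only match this calculation approximately, which is why the result is called an estimate.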

**Boxplots**

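A **boxplot** draws a box from the first quartile to the third quartile, with a line inside the box at the median. Whiskers extend from the box to the lowest and highest non-outlier values, and any outliers are marked as individual points.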

**Example 1: Cumulative Frequency Graph**

Draw a **cumulative frequency graph** from this data and estimate the **interquartile range**.

**[4 marks]**

**Step 1:** Add a high point column (the top of each class) and a cumulative frequency column to the table.

**Step 2:** Plot cumulative frequency against high point.

**Step 3:** The total frequency is 100, so draw lines at cumulative frequencies of 25 (first quartile) and 75 (third quartile) and read off the x values.

**Step 4:** The estimate for the interquartile range is $26.5-10=16.5$.

**Example 2: Boxplots**

Create a **boxplot** for the following data: 1, 9, 11, 12, 15, 16, 18. A point is considered to be an outlier if it lies more than one interquartile range below the first quartile or above the third quartile.

**[6 marks]**

There are 7 data points.

$\dfrac{7}{4}=1.75$, which rounds up to 2, so the first quartile is the second data point, which is 9.

$\dfrac{3\times 7}{4}=5.25$, which rounds up to 6, so the third quartile is the sixth data point, which is 16.

The interquartile range is therefore $16-9=7$.

So points less than $9-7=2$ or greater than $16+7=23$ are outliers. There is one outlier in our data set, the point 1.

The lowest non-outlier value is 9, and the highest non-outlier value is 18.

Finally, the median is 12.
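The working above can be checked with a short Python sketch. The `quartile` helper is hypothetical and encodes the position rule used in this guide: if $\dfrac{kn}{4}$ is not a whole number, round up and take that data point; if it is a whole number, take the midpoint of that point and the next one. Other textbooks and software may use different quartile conventions, so this is only one possible reading.

```python
def quartile(sorted_data, k):
    """k-th quartile (k = 1, 2 or 3) using the position rule k * n / 4."""
    n = len(sorted_data)
    whole, remainder = divmod(k * n, 4)
    if remainder:
        # Position is not a whole number: round up to the next data point.
        # The (whole + 1)-th point has 0-based index `whole`.
        return sorted_data[whole]
    # Position is a whole number: midpoint of that point and the next one.
    return (sorted_data[whole - 1] + sorted_data[whole]) / 2


data = [1, 9, 11, 12, 15, 16, 18]

q1, median, q3 = (quartile(data, k) for k in (1, 2, 3))
iqr = q3 - q1

# Outlier rule from the question: more than one IQR beyond a quartile.
low_fence, high_fence = q1 - iqr, q3 + iqr
outliers = [x for x in data if x < low_fence or x > high_fence]
inside = [x for x in data if low_fence <= x <= high_fence]

# The whiskers end at the lowest and highest non-outlier values.
whisker_low, whisker_high = min(inside), max(inside)
print(q1, median, q3, outliers, whisker_low, whisker_high)
```

Here the lower whisker ends at 9, which is also the first quartile, so the lower whisker has no length and the outlier at 1 is plotted on its own.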

## Cumulative Frequency and Boxplots Example Questions

**Question 1:** Create a cumulative frequency graph from the following table:

**[3 marks]**

Add high point and cumulative frequency columns.

Plot cumulative frequency against high point.

**Question 2:** Make a boxplot from the following data:

5, 11, 17, 21, 22, 24, 28, 31, 32, 36, 40, 41, 44, 45, 46, 48, 51, 54, 58, 68

**[5 marks]**

There are 20 data points.

$\dfrac{20}{4}=5$ so the first quartile is the midpoint between the 5th and 6th points, which is $\dfrac{22+24}{2}=23$.

$\dfrac{3\times 20}{4}=15$ so the third quartile is the midpoint between the 15th and 16th points, which is $\dfrac{46+48}{2}=47$.

$\dfrac{20}{2}=10$ so the second quartile (median) is the midpoint between the 10th and 11th points, which is $\dfrac{36+40}{2}=38$.

The lowest and highest values are 5 and 68.

This gives a box plot with the box drawn from 23 to 47, the median line at 38, and whiskers extending to 5 and 68.
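A box plot is fully determined by these five values, so it can be roughed out in plain text. The sketch below is one possible way to do that. `ascii_boxplot` is a hypothetical helper, and the axis running from 0 to 70 is an assumption made for illustration.

```python
def ascii_boxplot(low, q1, median, q3, high, axis_min, axis_max, width):
    """Return a one-line text box plot.

    '-' marks the whiskers, '#' marks the box and '|' marks the median.
    """
    def column(value):
        # Scale a data value to a character position from 0 to width - 1.
        return round((value - axis_min) / (axis_max - axis_min) * (width - 1))

    line = [" "] * width
    for c in range(column(low), column(high) + 1):
        line[c] = "-"  # whiskers run from the lowest to the highest value
    for c in range(column(q1), column(q3) + 1):
        line[c] = "#"  # the box runs from the first to the third quartile
    line[column(median)] = "|"
    return "".join(line)


# Five values from the worked solution; one character per unit on the axis.
plot = ascii_boxplot(low=5, q1=23, median=38, q3=47, high=68,
                     axis_min=0, axis_max=70, width=71)
print(plot)
```

With one character per unit, the median line sits closer to the third quartile than to the first, which matches the values 23, 38 and 47.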

**Question 3:** Create a box plot by estimating the quartiles from a cumulative frequency graph for the following data:

**[8 marks]**

Add high point and cumulative frequency onto the table.

Plot cumulative frequency against high point on the graph.

Use the graph to determine:

First quartile = 1.3

Median = 2.6

Third quartile = 3.3

Create the box plot.