# Statistical Sampling

## Statistical Sampling Revision

**Statistical sampling** describes the methods by which a **sample** is selected for study in a **statistical investigation**.

**Terms in Statistical Sampling**

There are several terms used in statistical sampling that you need to know:

- **Population:** The group of people or things you wish to find out about.
- **Finite population:** A **population** where you can count how many members there are.
- **Infinite population:** A **population** where it is impossible to determine how many members there are (note this may not be infinite in theory, but just impossible to count, such as the number of stars in the universe).
- **Survey:** The method by which you find out about the population (such as observing behaviour or asking questions).
- **Census:** A survey involving all of the **population**.
- **Sample:** A small section of the **population** that have been chosen for a survey.
- **Sampling unit:** Individual member of the **population**.
- **Sampling frame:** A full list of the sampling units in the **population** – the **sample** will be selected from this list.

**Representative Sample**

Since we use a **sample** to make inferences about the whole **population**, the sample must **represent** the **population**. If a **sample** is not **representative**, it is called **biased**, and the errors that led to it are known as **sampling bias**.

There are three things to remember to avoid **sampling bias**:

- Sample from the right **population**, with no members of the **population** excluded from the **sampling frame**.
- Select the **sample** at **random**.
- Make sure **all members of the sample respond**, otherwise the absence of some chosen members could induce **bias**.
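
The second point, selecting at random, can be sketched in Python. This is only a minimal illustration, not a prescribed method: the function name `simple_random_sample` and the numbered school roll are assumptions made up for the example. It uses the standard library's `random.sample`, which picks distinct items so that every sampling unit in the frame has the same chance of being chosen.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct sampling units, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return rng.sample(list(sampling_frame), sample_size)


# Hypothetical sampling frame: every student on the school roll, numbered 1 to 500.
frame = range(1, 501)

# Select 20 students at random from the full frame.
sample = simple_random_sample(frame, 20, seed=1)
print(sorted(sample))
```

Because the selection is made from the complete sampling frame, no member of the population is excluded before the random choice is made.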

More information about creating a **representative sample** can be found on the page **Sampling Techniques**.

**Census or Sample**

How do we know when to use a **census** and when to use a **sample**? Here are the **advantages and disadvantages of both**:

**Advantages of a census**

- The information is accurate since it comes from the whole population.
- No sampling bias, provided every member of the population really is surveyed.

**Disadvantages of a census**

- Can involve a lot of time and effort.
- Potentially expensive.
- Difficult to ensure everyone is surveyed – any missed members risk introducing bias.
- If the tested items are used up or damaged by the test, then a census is impractical.

**Advantages of sampling**

- Faster, cheaper and easier than a census.
- The only viable option if testing items uses up or damages them.

**Disadvantages of sampling**

- Each possible sample will give different results, so there is some variation between samples.
- Sampling bias can easily be introduced.
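
The variation between samples can be seen in a short simulation. This is a rough sketch using made-up data: the population of 10000 values, the sample size of 50 and all variable names are assumptions chosen only for illustration. It compares the mean from a census with the means from five separate random samples.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up population: 10000 measurements centred on 50.
population = [rng.gauss(50, 10) for _ in range(10_000)]

# A census uses every member of the population.
census_mean = statistics.mean(population)

# Five separate random samples of 50 members each.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(5)]

print("Census mean:", round(census_mean, 2))
for i, mean in enumerate(sample_means, start=1):
    print(f"Sample {i} mean:", round(mean, 2))
```

Each sample gives a slightly different mean, all close to but not exactly equal to the census value.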

**Example 1: Sweet Shop**

A sweet shop wants to test the quality of a delivery of 10000 sweets. A sweet must be eaten to test its quality.

a) Suggest why a **census** would not be appropriate.

b) What are the **sampling units**?

**[2 marks]**

a) A census would involve eating all of the sweets – in which case the shop would not have any to sell.

b) The sampling units are the individual sweets.

**Example 2: Representative Sample**

Mr. Hobbes wants to put together a randomly chosen student council to help give the student body a voice in decisions at his school. He selects the council at random from the students who are attending today. Why is this not a representative sample?

**[1 mark]**

Not all of the population is included in the sampling frame, because some students could be absent today.

## Statistical Sampling Example Questions

**Question 1:** Would a census or a sample be appropriate in the following circumstances?

a) Finding out the probability of a biased die landing on 6.

b) A baker performing a quality check of their batch of loaves.

c) Finding the average mark in a mock exam of a class of 24 people.

**[3 marks]**

a) Census – there is no difficulty in recording every roll of the die that we make.

b) Sample – A quality check either with implements or consumption could damage the loaf, and the baker has none to sell if all of them become damaged.

c) Census – 24 is a small enough population that it is feasible to work with everyone’s marks.

**Question 2:** Dr. James asks a sample of his patients how much fruit and vegetables they consume in an average week, in order to find out the healthy eating habits of his patients overall.

i) What is the population for the survey?

ii) What sampling frame could Dr. James use?

iii) Suggest one advantage and one disadvantage of using a census over a sample for this survey.

**[4 marks]**

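i) All of Dr. James's patients.

ii) A list of all of his patients, each identified by a unique number (either assigned for the survey or an existing identifier, such as NHS number). The sample can then be chosen by selecting numbers at random.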

iii) An advantage of a census would be that the data is more accurate. A disadvantage would be that it will take more time and effort.

**Question 3:** What are the three things that must be done to avoid sampling bias? If a sample is taken by asking volunteers to fill in a survey, what problem could occur?

**[5 marks]**

- Select from the right population, with no members of said population excluded.
- Select the sample at random.
- Make sure all sample members respond.

Using only volunteers violates the second point, because the sample is then self-selected rather than random. This could induce bias, because the people who are likely to volunteer for a survey may have other traits in common that are not shared across the whole population.
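
The effect of relying on volunteers can be illustrated with a small simulation. This is a sketch under a stated assumption, not real data: it invents a population of 5000 people and assumes that the more fruit and vegetables someone eats, the more likely they are to volunteer. All names and numbers are hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Made-up population: weekly portions of fruit and vegetables for 5000 people.
population = [rng.uniform(0, 40) for _ in range(5000)]

# Assumption for illustration only: the chance of volunteering rises
# in proportion to how many portions a person eats.
volunteers = [x for x in population if rng.random() < x / 40]

# A random sample of 200 people, for comparison.
random_sample = rng.sample(population, 200)

population_mean = statistics.mean(population)
volunteer_mean = statistics.mean(volunteers)
random_mean = statistics.mean(random_sample)

print("Population mean:", round(population_mean, 1))
print("Volunteer mean: ", round(volunteer_mean, 1))
print("Random mean:    ", round(random_mean, 1))
```

Under this assumption the volunteers overstate the population mean, while the random sample stays close to it.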