# Types of Data

Level 1-3 · GCSE, KS3 · AQA, Edexcel, OCR, WJEC

We can classify data in two ways. First, we can sort data into types according to the form it takes. Second, we can sort it according to how it was collected.


## Class 1: Qualitative

Qualitative (or categorical) data is anything that isn’t a number, such as words. We usually obtain qualitative data by conducting a survey. Examples include:

• Football team names
• Favourite takeaways

## Class 2: Quantitative

Quantitative data is numerical. There are two types of quantitative data, continuous and discrete.

Continuous data can take any numerical value. Examples of continuous data include:

• Height
• Weight
• Running race times

Discrete data can only take certain exact numerical values (typically whole numbers). Examples of discrete data include:

• Number of children
• Shoe size

## Primary Data

Primary data is data that you collect first-hand.

Examples:

• Surveying members of the public
• Measuring the heights of your classmates

Advantages:

• You can ensure that the data is relevant
• You can ensure your data sample is reliable

Disadvantages:

• Collecting data can be very time-consuming
• Paying for resources (e.g. printing questionnaires) and/or for participation in the survey means that it can be expensive

## Secondary Data

Secondary data is data that has already been collected by someone else.

Examples:

• The results of a questionnaire that have been posted on the internet
• Statistics published in a newspaper

Advantages:

• It takes much less time than collecting data yourself
• Secondary data is either free, or at least much cheaper than collecting the data yourself

Disadvantages:

• The data available might not be suitable for your purposes
• You don’t know how the data was collected, meaning you can’t be sure that it is representative or fair

Note: The two classifications are independent, so they can be combined. For example, data can be continuous and primary, continuous and secondary, categorical and primary, categorical and secondary, and so on.
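
One way to picture how the two classifications combine is the short Python sketch below. The names `Form`, `Source` and `DataDescription` are made up for this illustration and are not part of any syllabus or library. The sketch pairs each form of data with each way of collecting it.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Form(Enum):
    """What kind of values the data takes."""
    CATEGORICAL = "categorical"
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"


class Source(Enum):
    """Who collected the data."""
    PRIMARY = "primary"
    SECONDARY = "secondary"


@dataclass(frozen=True)
class DataDescription:
    form: Form
    source: Source


# Every form can be paired with every source.
all_combinations = [DataDescription(f, s) for f, s in product(Form, Source)]

# Heights of your classmates that you measure yourself.
class_heights = DataDescription(Form.CONTINUOUS, Source.PRIMARY)

for description in all_combinations:
    print(description.form.value, "and", description.source.value)
```

With three forms and two sources, the loop prints six combinations, and `class_heights` is one of them.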


## Example 1: Types of Data

Janet wants to learn some information about Jason. She learns three pieces of new information about him. For each one, state whether the data is discrete, continuous, or categorical.

[3 marks]

a) His hair colour.

His hair colour might be brown, blonde, ginger, but in any case, it’s not a number. Therefore, this is categorical data.

b) How many siblings he has.

The number of siblings Jason has can only be a whole number. Therefore, this is discrete data.

c) How fast he can solve a Rubik’s cube.

How fast he can solve the puzzle is given as a time so the result could be any value. Therefore, it is continuous data.

Note: Time is continuous, even though your stopwatch may only measure to the nearest hundredth of a second, for example.


## Example 2: Types of Data

Chidi wants to gather some data on people’s favourite food. He decides to use a survey he found online that was conducted $10$ years earlier, in which $200$ people were asked if their favourite food was Italian, Chinese, Indian, or Thai.

[3 marks]

a) State which two of the following words describe the data Chidi is using:

primary, secondary, categorical, discrete, continuous.

The data is about people’s favourite type of food. Since this is not numerical, it must be categorical. Secondly, he is using data that was collected by someone else, so this is secondary data.

b) State one advantage and one disadvantage of Chidi choosing to use this type of data.

• One advantage of Chidi using this data is that he saves a lot of time compared to collecting it himself
• One disadvantage is that Chidi doesn’t know how the data was collected, so it might not be representative

Additionally, in this case the data is $10$ years old and people’s preferences might have changed a lot in that time. Furthermore, the survey he found only gave people $4$ choices – it is missing many other options.


## Types of Data Example Questions

Question 1: State whether the data for the following is categorical, discrete or continuous:

a)  The heights of $12$ dogs.

b)  The lengths of $15$ snakes.

c)  The eye colours of students in a class.

d)  The number of goals scored by members of the school’s football team.

[4 marks]


a)  Since the heights of dogs can be of any value (including numbers that are not whole numbers), this data is continuous.

b)   Since the lengths of snakes can be of any value (including numbers that are not whole numbers), this data is continuous.

c)  Since the data collected will be in the form of words (blue, green, brown etc.), this data is categorical.

d)  Since goals can only be counted in whole numbers, this data is discrete.
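
The reasoning in these answers can be sketched as a rough rule of thumb in Python. This is only an illustration: the function name `classify` and the sample values are made up, and the rule assumes that counts are stored as whole numbers and measurements as decimals.

```python
from numbers import Real


def classify(values):
    """Rough rule of thumb: words -> categorical, whole numbers -> discrete,
    other numbers -> continuous. An assumption for illustration only."""
    if not values:
        raise ValueError("need at least one value")
    if all(isinstance(v, str) for v in values):
        return "categorical"
    if all(isinstance(v, int) for v in values):
        return "discrete"
    if all(isinstance(v, Real) for v in values):
        return "continuous"
    raise ValueError("mixed or unsupported values")


# Made-up sample values for the situations in Question 1.
dog_heights_cm = [31.5, 40.2, 55.0]
eye_colours = ["blue", "green", "brown"]
goals_scored = [0, 2, 1, 5]

print(classify(dog_heights_cm))
print(classify(eye_colours))
print(classify(goals_scored))
```

The rule is not perfect. A shoe size such as 5.5 is stored as a decimal, so this sketch would label shoe sizes as continuous even though they are discrete. Deciding the type properly means thinking about which values are possible, not just how they are written down.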


Question 2: Eleanor is measuring the length of the hair of everyone in her class. State:

a)   Whether this data is primary or secondary.

b)   Whether this data is categorical, discrete, or continuous.

[2 marks]


a) Since Eleanor is measuring the data herself, the data is primary.

b) She is measuring hair length, which can have any value, including values that are not whole numbers, so the data is continuous.


Question 3: Tahani says:

“People’s shoe sizes are based on the length of their feet, and since length is continuous, shoe size must also be continuous.” Explain why Tahani is wrong.

[1 mark]


Tahani is wrong because although a shoe size is based on foot length, the length of a person’s foot can be of any value, whereas shoe sizes have limited values ($5$, $5$ and a half, $6$, $6$ and a half etc.).
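
A small Python sketch can show the difference. It assumes a hypothetical sizing rule, `to_half_size`, that snaps a measurement to the nearest half. Real shoe sizing systems use their own conversions, and the measurements here are made up on a scale that lines up with the sizes.

```python
def to_half_size(measurement: float) -> float:
    """Snap a measurement to the nearest half (hypothetical sizing rule)."""
    return round(measurement * 2) / 2


# Made-up measurements: these can take any value.
foot_measurements = [5.1, 5.2, 5.4, 5.6, 5.8]

# The sizes can only land on wholes and halves.
sizes = [to_half_size(m) for m in foot_measurements]
print(sizes)  # [5.0, 5.0, 5.5, 5.5, 6.0]
```

Five different measurements collapse onto just three sizes, which is why the measurement is continuous but the shoe size is discrete.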


Question 4: Michael wants to collect information from families in his town about the number of children they have. He chooses to question people directly to obtain this data.

a)  State whether his data will be primary or secondary.

b)  Give two advantages of choosing to use this type of data.

[3 marks]


a)  Since Michael is collecting the data himself, it is primary data.

b) By collecting the data himself, he can ensure that the numbers are all accurately recorded.

A second advantage is that he can make efforts to ensure his sample is representative (he can ask people of different genders, races, ages, etc.). If he were using secondary data, he would have no control over who was asked.

Note: other correct advantages are acceptable.


Question 5:  Steve wants to obtain data from his $30$ classmates about the performance of the striker, Harry Kane, in a recent match. Half of them are allowed to choose from the following six options:

• “The worst performance I have ever witnessed from any player ever!”
• “A below average performance.”
• “Not his fault the team lost.”
• “I wish he could play like that every week!”
• “No player in the world could have performed better than that!”

The other half of the class are asked to give him a rating out of $10$.

a) Is the data that Steve obtains from the first half of the class categorical, discrete quantitative or continuous quantitative?

b) Is the data that Steve obtains from the second half of the class qualitative, discrete quantitative or continuous quantitative?

c) State two disadvantages of collecting the data qualitatively in this example.

[4 marks]


a)  Since the data that Steve collects from the first half of the class is worded data, this is categorical data.

b) Since the data that Steve collects from the second half of the class is a number, this is quantitative data. Since the data can only take certain values (whole numbers between $1$ and $10$), the data is discrete quantitative data.

c) The first disadvantage of collecting data in this way is that it is harder to analyse. It is much easier to analyse numerical data than worded data.

The second disadvantage is that there are only $6$ options for the worded responses, whereas there are ten options for numbered responses between $1$ and $10$.