IB DP Maths AA HL Study Notes

4.1.1 Measures of Central Tendency

Measures of central tendency are fundamental in statistics, providing a snapshot of the main characteristics of data. They offer a single value that describes how a group of data clusters around a central value. The three primary measures of central tendency are the mean, median, and mode. Understanding these measures is essential for performing more complex statistical analyses, such as regression analysis.

Mean

The mean, often referred to as the average, is the sum of all the numbers in a data set divided by the number of numbers in that set.
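
The definition above can be written compactly as a formula. The symbols are notation introduced here for illustration: x_1, ..., x_n are the n values in the data set and \bar{x} is their mean.

```latex
\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i
```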

Properties:

  • Influence of Every Value: The mean is affected by every value in the data set, including outliers. This means that extreme values can skew the mean.
  • Change with Data: Because every value enters the calculation, changing, adding, or removing a value will generally change the mean.
  • Usefulness: The mean is a useful measure when the data are roughly symmetric and there are no significant outliers. It plays a critical role when dealing with continuous random variables.

Applications:

  • Exams: Calculating the average score in exams.
  • Economics: Determining the average income in a particular region.
  • Meteorology: Finding the average temperature over a week.

Example Question: If the maths scores for a class of five students were 85, 89, 90, 92, and 95, what is the mean score?

Solution: Mean = (85 + 89 + 90 + 92 + 95) / 5 = 451 / 5 = 90.2
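
The same calculation can be checked with a short Python sketch. The function name `mean` is chosen here for illustration and is not part of the original notes; Python's standard `statistics.mean` would give the same result.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Scores from the example question above.
scores = [85, 89, 90, 92, 95]
print(mean(scores))  # 451 / 5 = 90.2
```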

Median

The median is the middle number in a data set when the numbers are arranged in order. If there's an even number of observations, the median is the average of the two middle numbers.
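
The rule above can be stated in terms of positions in the ordered list. As notation assumed here for illustration, x_{(k)} denotes the k-th smallest value and n the number of observations.

```latex
\text{median} =
\begin{cases}
x_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd},\\[4pt]
\dfrac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2} & \text{if } n \text{ is even}.
\end{cases}
```

For n = 9 this picks the 5th value; for n = 10 it averages the 5th and 6th values.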

Properties:

  • Resistant to Outliers: Unlike the mean, the median is largely unaffected by outliers or extreme values, because it depends only on the middle of the ordered data.
  • Data Division: It divides the data set into two equal halves.
  • Usefulness: The median is a useful measure when there are outliers in the data or when the data is skewed. It is particularly helpful for understanding distributions that are not well represented by the mean alone.

Applications:

  • Real Estate: Determining the middle value in a set of house prices.
  • Demographics: Finding the median age in a population survey.

Example Question: Find the median of the following set of numbers: 3, 7, 8, 5, 12, 14, 21, 13, 18.

Solution: Arrange the numbers in ascending order: 3, 5, 7, 8, 12, 13, 14, 18, 21. The median is the fifth number, which is 12.
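
The sort-then-pick-the-middle procedure used in this solution can be sketched in Python. The function name `median` is chosen here for illustration; the sketch also handles the even case by averaging the two middle values, as described earlier.

```python
def median(values):
    """Return the middle value of the sorted data (average of the two middle values if the count is even)."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Data from the example question above (nine values, so the 5th is the median).
print(median([3, 7, 8, 5, 12, 14, 21, 13, 18]))  # 12
```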

Mode

The mode is the number that appears most frequently in a data set. A data set may have one mode, more than one mode, or no mode at all.

Properties:

  • Multiple Modes: A data set can have multiple modes if several numbers appear with the same highest frequency.
  • Absence of Mode: If no number in the data set is repeated, then there is no mode.
  • Usefulness: The mode is a useful measure when we want to know the most common item or value. It is especially relevant in discrete distributions, such as the binomial distribution.

Applications:

  • Retail: Determining the most common shoe size sold in a store.
  • Medicine: Finding the most frequently occurring blood type in a hospital.

Example Question: Find the mode of the following set of numbers: 4, 7, 8, 8, 9, 9, 9, 10, 10.

Solution: The number 9 appears three times, which is more frequent than any other number, so the mode is 9.
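
Counting frequencies by hand becomes error-prone for larger data sets, so a minimal Python sketch may help. The function name `modes` is hypothetical, and the sketch follows the convention used in these notes that a data set with no repeated value has no mode.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No value is repeated, so by the convention above there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


# Data from the example question above: 9 appears three times.
print(modes([4, 7, 8, 8, 9, 9, 9, 10, 10]))  # [9]
```

Returning a list rather than a single number lets the same function report one mode, several modes, or none.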

For a deeper understanding of how measures of central tendency relate to variability within data sets, you may find it useful to explore topics like expected value and variance.

FAQ

The mode is distinct from the mean and median in that it represents the most frequently occurring value in a data set. Unlike the mean and median, which are based on the actual values and their positions, the mode is purely about frequency. A data set can have more than one mode (bimodal or multimodal) if multiple values appear with the same highest frequency. Additionally, it's possible for a data set to have no mode if no value is repeated. The mode is particularly useful in situations where we want to identify the most common category or value in a data set.

Yes, a data set can have more than one mode. When a data set has two modes, it is referred to as "bimodal". If it has more than two modes, it is termed "multimodal". This occurrence indicates that there are multiple values in the data set that have the same highest frequency of occurrence. For instance, in a data set of numbers like [3, 4, 5, 5, 6, 6, 7], both 5 and 6 are modes, making the data set bimodal. Multimodal data sets can provide insights into multiple popular or common categories within the data.

The median is often more appropriate than the mean in situations where the data set has outliers or is skewed. Since the median is the middle value when the data is arranged in order, it is largely unaffected by extreme values, unlike the mean. For instance, in a scenario where we're looking at incomes in a community and there are a few extremely high earners, the mean income might be significantly higher than what most people earn. In such cases, the median gives a better picture of the typical income. Similarly, in distributions that are not symmetrical, the median can offer a clearer picture of the central tendency.
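
The income scenario can be illustrated with a small Python sketch using the standard `statistics` module. The income figures are hypothetical values invented for illustration, not data from these notes.

```python
from statistics import mean, median

# Hypothetical annual incomes in thousands; illustrative values only.
typical = [30, 32, 35, 38, 40]
with_high_earner = [30, 32, 35, 38, 400]  # the largest income replaced by an extreme value

print(mean(typical), median(typical))                    # mean 35, median 35
print(mean(with_high_earner), median(with_high_earner))  # mean 107, median 35
```

In this sketch a single extreme income roughly triples the mean, while the median does not move.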

The mean, while a commonly used measure of central tendency, can sometimes be misleading, especially when the data set contains outliers. Outliers are extreme values that are much larger or smaller than the other values in the data set. Since the mean is calculated by summing all the values and dividing by the number of values, even a single outlier can shift the mean significantly, making it unrepresentative of the centre of the data. In such cases, the median, which is far less sensitive to outliers, might be a more appropriate measure of central tendency.

Each measure of central tendency provides a different perspective on the data set, and relying on just one might give an incomplete or even misleading picture. The mean gives an arithmetic average, but can be skewed by outliers. The median provides a middle value, offering a clearer picture in skewed distributions or those with outliers. The mode indicates the most frequently occurring value, highlighting the most common category or value. By considering all three measures, one can get a fuller picture of where the data are centred and whether the distribution is skewed. Describing the spread of the data requires separate measures, such as the variance. Used together, the three measures support a more accurate and nuanced interpretation of the data.

Practice Questions

A group of students took a maths test, and their scores were as follows: 56, 58, 60, 62, 64, 66, 68, 70, 72, and 74. Calculate the mean and median of their scores.

To find the mean, we sum up all the scores and divide by the number of scores. Mean = (56 + 58 + 60 + 62 + 64 + 66 + 68 + 70 + 72 + 74) / 10 = 650 / 10 = 65.

To find the median, since there are 10 scores (an even number), we take the average of the 5th and 6th scores. Median = (64 + 66) / 2 = 130 / 2 = 65.

Thus, the mean and median scores are both 65.

In a small town, the ages of 7 residents are: 21, 23, 25, 29, 29, 30, and 32. Determine the mode of their ages.

The mode is the number that appears most frequently in a data set. From the given ages, the number 29 appears twice, which is more frequent than any other age. Therefore, the mode of their ages is 29.
