CIE IGCSE Maths Study Notes

9.5.1 Scatter Diagrams and Correlation

Understanding the relationship between two variables is a fundamental part of statistical analysis. Scatter diagrams (or scatter plots) are visual representations of how two variables are related, allowing us to spot patterns and, where appropriate, predict trends. This section covers drawing scatter diagrams and identifying the type of correlation between two variables, for example checking whether height and weight are positively correlated.

Figure: Scatter Plots and Correlation Examples

Introduction to Scatter Diagrams

Scatter diagrams plot pairs of numerical data, with one variable on each axis, to look for a relationship between them. When plotting data points on this type of graph, a pattern may emerge, indicating a correlation between the variables.

  • Positive Correlation: As one variable increases, the other also increases.
  • Negative Correlation: As one variable increases, the other decreases.
  • No Correlation: There is no apparent relationship between the variables.

Figure: Types of Correlation

Plotting a Scatter Diagram

Step 1: Collect Data

Gather data on two variables you wish to compare. For example, you might collect data on the heights and weights of a group of people.

Step 2: Draw the Axes

Draw two perpendicular lines to represent the axes. Label the horizontal axis for one variable (e.g., height) and the vertical axis for the other (e.g., weight).

Step 3: Plot the Data Points

For each pair of values, plot a point on the graph corresponding to the height and weight of an individual.

Step 4: Observe the Pattern

Look at the plotted points to determine if there is a pattern suggesting a correlation.
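The four steps above can be sketched in code. The example below is a minimal, standard-library-only sketch that draws a rough text scatter diagram. The function name `text_scatter`, the grid size, and the five height and weight pairs are all illustrative assumptions, not data taken from these notes.

```python
def text_scatter(points, width=21, height=9):
    """Draw (x, y) points as '*' on a text grid.

    x increases left to right and y increases bottom to top.
    Assumes at least two distinct x values and two distinct y values.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each value to a column / row index on the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 of the grid is the top
    return "\n".join("".join(line) for line in grid)


# Step 1: illustrative (height in m, weight in kg) pairs -- assumed values.
data = [(1.60, 55), (1.65, 60), (1.70, 65), (1.75, 72), (1.80, 80)]

# Steps 2-3: height on the horizontal axis, weight on the vertical axis.
print(text_scatter(data))
```

For step 4, the stars in the printed grid rise from the bottom left to the top right, which is the pattern to look for when the correlation is positive.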

Understanding Correlation

  • Strong Correlation: The points lie close to a straight line.
  • Weak Correlation: The points are more spread out from a straight line.
  • No Correlation: The points are very scattered, with no discernible pattern.

Example: Height and Weight Correlation

Let's consider a set of data representing the heights (in metres) and weights (in kilograms) of a group of individuals:

Table: Values for Height and Weight

Plotting the Data

By plotting these points on a scatter diagram, we can visually inspect the relationship between height and weight.

Figure: Height vs. Weight Correlation

Observing Correlation

The plotted points show an upward trend, indicating a positive correlation between height and weight: as height increases, weight tends to increase as well.

Correlation Coefficient

The correlation coefficient, typically denoted $r$, quantifies the degree of linear correlation between two variables. It ranges from $-1$ to $1$.

  • $r = 1$: Perfect positive correlation
  • $r = -1$: Perfect negative correlation
  • $r = 0$: No linear correlation
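As a rough illustration, the interpretation of $r$ can be written as a small function. This is only a sketch: the function name `describe_correlation` is made up for this example, and the 0.7 cutoff between "strong" and "weak" is an assumed rule of thumb, not a value given in these notes.

```python
def describe_correlation(r, strong_cutoff=0.7):
    """Describe a correlation coefficient in words.

    strong_cutoff is an assumed rule of thumb, not an official threshold.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    if abs(r) == 1:
        return f"perfect {direction} correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    return f"{strength} {direction} correlation"


print(describe_correlation(-1))    # perfect negative correlation
print(describe_correlation(-0.3))  # weak negative correlation
```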

Calculating the Correlation Coefficient ($r$)

The correlation coefficient, $r$, helps quantify the strength and direction of a linear relationship between two variables.

Formula for $r$:

$$r = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$

Where:

  • $n$ is the number of pairs,
  • $x$ and $y$ are the individual scores,
  • $\sum$ denotes the sum over all pairs.
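The formula translates directly into code. The function below is a minimal sketch (the name `pearson_r` is chosen for this example) that follows the formula term by term. It assumes both lists have the same length and that neither variable is constant, since a constant variable would make the denominator zero.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r, computed from the five sums in the formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Toy data lying exactly on a straight line (not from these notes).
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0
```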

Calculation:

1. List all $x$ (height) and $y$ (weight) pairs.

2. Calculate $\sum x$, $\sum y$, $\sum xy$, $\sum x^2$, and $\sum y^2$.
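These two steps can be carried out mechanically. The sketch below assumes five height and weight pairs. These values are hypothetical: they are not copied from the table in these notes and were chosen only because they reproduce the totals quoted in the worked example.

```python
# Step 1: assumed (height in m, weight in kg) pairs.
# Hypothetical values, not the original table.
pairs = [(1.60, 55), (1.65, 60), (1.70, 65), (1.75, 72), (1.80, 80)]

# Step 2: the five sums needed by the formula.
n = len(pairs)
sum_x = sum(x for x, _ in pairs)
sum_y = sum(y for _, y in pairs)
sum_xy = sum(x * y for x, y in pairs)
sum_x2 = sum(x * x for x, _ in pairs)
sum_y2 = sum(y * y for _, y in pairs)

# Round the decimal sums to tidy up floating-point noise before printing.
print("n =", n)
print("sum x =", round(sum_x, 3))
print("sum y =", sum_y)
print("sum xy =", round(sum_xy, 3))
print("sum x^2 =", round(sum_x2, 3))
print("sum y^2 =", sum_y2)
```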

Carrying out these calculations for our dataset of $n = 5$ pairs, we find:

  • $\sum x = 8.5$ (total height)
  • $\sum y = 332$ (total weight)
  • $\sum xy = 567.5$ (sum of the products of heights and weights)
  • $\sum x^2 = 14.475$ (sum of squared heights)
  • $\sum y^2 = 22434$ (sum of squared weights)

Using these values in our correlation coefficient formula gives us:

$$r = \frac{5(567.5) - (8.5)(332)}{\sqrt{[5(14.475) - (8.5)^2][5(22434) - (332)^2]}} \approx 0.994$$

This result, $r \approx 0.994$, indicates a very strong positive correlation between height and weight in our dataset. A correlation coefficient close to 1 signifies that as height increases, weight also increases in a linear manner.
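The arithmetic can be checked step by step. The sketch below plugs the sums quoted in the worked example into the formula and keeps the intermediate values visible. The variable names are chosen for this illustration only.

```python
from math import sqrt

# Sums quoted in the worked example (n = 5 pairs).
n = 5
sum_x, sum_y = 8.5, 332
sum_xy, sum_x2, sum_y2 = 567.5, 14.475, 22434

numerator = n * sum_xy - sum_x * sum_y  # 2837.5 - 2822 = 15.5
x_term = n * sum_x2 - sum_x ** 2        # 72.375 - 72.25 = 0.125
y_term = n * sum_y2 - sum_y ** 2        # 112170 - 110224 = 1946

r = numerator / sqrt(x_term * y_term)   # 15.5 / sqrt(243.25)
print(round(r, 3))  # 0.994
```

Writing out the three intermediate values separately makes it easier to spot a slip in a hand calculation, since each one can be compared against a calculator result before taking the square root.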
