Nimo

Interpret scatter graphs and correlation

Statistics

Flashcards

What is extrapolation and its limitation?

Extrapolation estimates values beyond the range of the observed data. Its limitation is that the existing pattern may not continue outside that range, so the estimates can be unreliable.

Key concepts

What you'll likely be quizzed about

Definition of scatter graphs and bivariate data

A scatter graph (scatterplot) plots bivariate data: two linked numerical variables recorded for the same items or individuals. Each plotted point uses one coordinate from each variable, typically written as (x, y). Bivariate data analysis focuses on the relationship between the two variables. Clear axis labels and consistent scales are essential for accurate reading and comparison.

Axes, variables and plotting points

The horizontal axis (x-axis) typically shows the independent or explanatory variable; the vertical axis (y-axis) shows the dependent or response variable. Each pair of measurements becomes one point on the grid. Accurate plotting requires correct units and equal intervals. Mislabelled axes or uneven scales can distort the apparent relationship between variables.

Direction and form of correlation

Direction describes whether y tends to increase or decrease as x increases. The correlation is positive when the points trend upwards from left to right and negative when they trend downwards. There is no correlation when the points show no clear trend. Form describes whether the relationship approximates a straight line (linear) or a curve (non-linear). A linear form can be summarised with a straight line; a non-linear form requires a different model or description.
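Direction can also be checked numerically. The sketch below is one possible approach, not part of the original notes: it classifies direction from the sign of the sum of products of deviations from the means. The function name `direction` and the data values are made up for illustration. With real data the sum is rarely exactly zero, so a value close to zero should still be read as "no clear trend".

```python
def direction(xs, ys):
    """Classify direction from the sign of the sum of products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Positive when above-average x values tend to pair with above-average y values.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxy > 0:
        return "positive"
    if sxy < 0:
        return "negative"
    return "none"


# Made-up data for illustration only.
hours_revised = [1, 2, 3, 4, 5]
test_score = [52, 55, 61, 64, 70]
rising = direction(hours_revised, test_score)

falling = direction([1, 2, 3], [9, 7, 4])
flat = direction([1, 2, 3], [5, 5, 5])
```

This check only describes direction. It says nothing about form, so a curved pattern still needs to be spotted on the graph itself.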

Strength of correlation and visual assessment

Strength describes how closely the points cluster around an imagined line or curve. A strong correlation shows points close to a clear line; a weak correlation shows widely scattered points. A moderate correlation falls between these extremes. Outliers can weaken or exaggerate the perceived strength and so mislead the assessment. Clustering in subgroups can hide or exaggerate a relationship; careful inspection is necessary before drawing conclusions.
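One common way to put a number on strength is Pearson's correlation coefficient r, which these notes do not cover. It is used here as an assumption. It ranges from -1 to 1 and measures closeness to a straight line only, not to a curve. The cut-offs in `describe_strength` are illustrative, since there is no single agreed convention, and the data values are made up. The example shows how one outlier can sharply reduce r.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: closeness of the points to a straight line, from -1 to 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Label strength using illustrative cut-offs (an assumption, not a standard)."""
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.4:
        return "moderate"
    return "weak"


# Made-up data: every point lies exactly on a straight line.
xs = [1, 2, 3, 4, 5, 6]
ys_clean = [2, 4, 6, 8, 10, 12]
r_clean = pearson_r(xs, ys_clean)

# Same data with the last point replaced by an outlier.
ys_outlier = [2, 4, 6, 8, 10, 0]
r_outlier = pearson_r(xs, ys_outlier)

print(describe_strength(r_clean), describe_strength(r_outlier))
```

Here a single unusual point moves the label from "strong" to "weak", which is why outliers should be investigated before the strength is reported.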

Line of best fit and estimation

A line of best fit (trend line) summarises a linear relationship with a straight line that follows the overall trend, leaving the points spread roughly evenly on either side. The line enables interpolation: estimating y for x values inside the data range. Extrapolation uses the line to estimate outside the observed range and becomes less reliable the further the estimate lies from the data. The method of fitting (by eye or by calculation) affects accuracy: a line drawn by eye depends on judgement, whereas a least-squares calculation gives a reproducible line that minimises the sum of squared vertical distances from the points to the line, at the cost of more work.
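The calculation behind a least-squares line can be sketched in a few lines. This is a minimal illustration, not a prescribed method. The function names `least_squares_line` and `estimate` and the data values are made up. The `estimate` helper also reports whether a prediction is an interpolation or an extrapolation.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the mean point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate(x, xs, slope, intercept):
    """Estimate y at x and say whether x lies inside the observed range."""
    y = slope * x + intercept
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return y, kind


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(xs, ys)  # about 0.6 and 2.2

inside = estimate(3, xs, slope, intercept)    # x = 3 is within 1..5
outside = estimate(10, xs, slope, intercept)  # x = 10 is beyond the data
print(inside, outside)
```

The estimate at x = 10 is produced by the same formula as the one at x = 3, but nothing in the data supports the assumption that the trend continues that far.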

Correlation versus causation and limitations

Correlation indicates association but not cause. Two variables can correlate because one causes the other, because the causation runs the other way, because both are caused by a third (lurking) variable, or because of coincidence. Conclusions about causation require controlled experiments, temporal evidence, or additional justification. Observational scatter graphs alone cannot establish causal links; recognising confounding factors and outliers reduces the risk of misinterpretation.
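A small made-up example shows how a lurking variable can produce a strong correlation between two variables that do not cause each other. All figures below are invented for illustration, and Pearson's r is used as an assumed measure of linear association. Ice cream sales and sunscreen sales both rise with temperature, so they correlate strongly with each other.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: closeness of the points to a straight line, from -1 to 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented daily figures; temperature is the lurking variable.
temperature = [15, 18, 21, 24, 27, 30]
ice_cream_sales = [20, 26, 33, 37, 45, 50]
sunscreen_sales = [5, 8, 9, 13, 14, 18]

r_ice_sun = pearson_r(ice_cream_sales, sunscreen_sales)
r_temp_ice = pearson_r(temperature, ice_cream_sales)
r_temp_sun = pearson_r(temperature, sunscreen_sales)

print(round(r_ice_sun, 2), round(r_temp_ice, 2), round(r_temp_sun, 2))
```

All three coefficients are high, and the numbers alone cannot say which explanation is correct. Knowing that hot weather drives both kinds of sale is outside information that the scatter graph does not contain.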

Key notes

Important points to keep in mind

Label axes clearly and include units before interpreting a scatter graph.

Determine direction (positive/negative/none), strength (strong/moderate/weak) and form (linear/non-linear).

Use a line of best fit for linear trends; treat extrapolation with caution.

Outliers can distort the apparent relationship and should be investigated.

Correlation describes association but does not prove one variable causes the other.

Consider lurking variables or reverse causation before concluding causality.

Subgroups in the data can mask or create misleading overall trends.

Interpolation inside the data range is more reliable than extrapolation.

Visual judgement is useful; calculated fits (e.g., least squares) provide more precise lines.

Always check axis scales for distortion before comparing scatter graphs.
