Nimo

Histograms and Cumulative Frequency for Grouped Data

Statistics

Flashcards

Test your knowledge with interactive flashcards

How do you check a histogram plotted with frequency density?

Confirm that the sum of bar areas equals the total frequency.

Key concepts

What you'll likely be quizzed about

Grouped data and class intervals

Grouped data places raw values into class intervals and records the frequency in each interval. The class interval gives range boundaries and the frequency gives the count of observations inside that range. Limiting factors include loss of exact values and dependence of summaries on chosen interval endpoints and widths, so estimates from grouped data are approximate.
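As a minimal sketch of the grouping step, the snippet below counts raw values into class intervals. The helper name `group_into_classes`, the sample heights, and the convention that each class includes its lower boundary but excludes its upper boundary are assumptions made for illustration; other conventions are equally valid as long as they are applied consistently.

```python
from bisect import bisect_right

def group_into_classes(values, boundaries):
    """Count values in each class [boundaries[i], boundaries[i+1]).

    Assumed convention: lower boundary included, upper boundary excluded.
    Values outside the overall range are ignored.
    """
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        if boundaries[0] <= v < boundaries[-1]:
            # bisect_right finds the first boundary greater than v,
            # so the class index is one less than that position.
            counts[bisect_right(boundaries, v) - 1] += 1
    return counts

# Hypothetical raw data (heights in cm) and class boundaries.
heights = [151, 158, 160, 163, 167, 169, 172, 175, 181, 188]
boundaries = [150, 160, 170, 180, 190]

frequencies = group_into_classes(heights, boundaries)
print(frequencies)  # [2, 4, 2, 2]
```

Once the values are counted, only the class totals remain, which is why summaries computed from them are approximate.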

Histogram definition and purpose

A histogram displays grouped continuous data using adjacent bars; each bar spans a class interval on the horizontal axis, and its height is the frequency (when class widths are equal) or the frequency density (when they are not). The area of each bar is proportional to the number (or proportion) of observations in that interval. Histograms show distribution shape, skewness and modes, and guide decisions about central tendency and spread.

Histograms with equal class widths

When class widths are equal, bar heights can represent frequency directly because equal widths make area proportional to height. Cause: equal widths produce equal horizontal scaling; effect: bar heights relate directly to frequencies. Construction steps include marking class boundaries on the horizontal axis, drawing adjacent bars for each class, and setting bar heights equal to class frequency.

Histograms with unequal class widths and frequency density

Unequal class widths cause bar areas to misrepresent frequency if heights equal frequency. Cause: varying widths change area even when heights are constant; effect: areas no longer match frequencies. Frequency density fixes this by defining density = frequency ÷ class width. Bar heights equal frequency density so that area = (frequency density × width) = frequency. Limiting factor: units on the vertical axis become 'frequency per unit width', which requires clear labelling.
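The calculation density = frequency ÷ class width can be sketched as follows. The function name `frequency_densities` and the sample table are hypothetical, chosen only to show that multiplying each density by its width recovers the frequency.

```python
def frequency_densities(boundaries, frequencies):
    """Return frequency density (frequency / class width) for each class."""
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    return [f / w for f, w in zip(frequencies, widths)]

# Hypothetical grouped table with unequal class widths: 10, 10, 20, 30.
boundaries = [0, 10, 20, 40, 70]
frequencies = [5, 12, 16, 6]

densities = frequency_densities(boundaries, frequencies)
print(densities)  # [0.5, 1.2, 0.8, 0.2]
```

Note that the widest class (40 to 70) holds 6 observations but gets the shortest bar, because those observations are spread over 30 units.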

Constructing histograms step-by-step

Choose class boundaries and record frequencies for each interval. If widths vary, calculate frequency density for each class. Draw a horizontal axis marked with the class boundaries and adjacent vertical bars spanning those boundaries. Label the vertical axis with frequency or frequency density as appropriate. When heights are frequency densities, check that the sum of bar areas equals the total frequency to confirm correct scaling.
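The final scaling check can be sketched in a few lines. The helper names `total_bar_area` and `scaling_is_consistent` and the sample numbers are assumptions for illustration; the check applies to bars whose heights are frequency densities.

```python
import math

def total_bar_area(boundaries, heights):
    """Sum of height x width over all bars."""
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    return sum(h * w for h, w in zip(heights, widths))

def scaling_is_consistent(boundaries, heights, total_frequency):
    """True when the total bar area matches the total frequency."""
    return math.isclose(total_bar_area(boundaries, heights), total_frequency)

# Hypothetical table: unequal widths 10, 10, 20, 30 and total frequency 39.
boundaries = [0, 10, 20, 40, 70]
frequencies = [5, 12, 16, 6]
densities = [0.5, 1.2, 0.8, 0.2]

# Heights drawn as frequency densities pass the check.
print(scaling_is_consistent(boundaries, densities, sum(frequencies)))

# Heights drawn as raw frequencies fail it, because wide classes are inflated.
print(scaling_is_consistent(boundaries, frequencies, sum(frequencies)))
```

The second call fails because plotting raw frequencies against unequal widths gives the 40 to 70 class an area of 180 instead of 6.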

Cumulative frequency graphs (ogives)

A cumulative frequency graph plots cumulative totals against class boundaries to show how frequencies accumulate across classes. Cause: cumulative plotting sums frequencies progressively; effect: the resulting curve allows estimation of medians, quartiles and percentiles by reading values at specified cumulative counts. Construction plots each cumulative frequency at the upper boundary of its class, starts from zero at the lower boundary of the first class, and joins the points with a smooth curve or with straight line segments. Limiting factor: reading between plotted points is interpolation, and straight segments assume a uniform distribution inside each class, so estimates are approximate.
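A minimal sketch of the points to plot, assuming a "less than" ogive in which each running total sits at the upper boundary of its class and the graph starts at zero on the first lower boundary. The function name `ogive_points` and the sample table are hypothetical.

```python
from itertools import accumulate

def ogive_points(boundaries, frequencies):
    """Pair each class boundary with the cumulative frequency below it.

    The first point is (lowest boundary, 0); each later point is
    (upper class boundary, running total up to and including that class).
    """
    cumulative = [0] + list(accumulate(frequencies))
    return list(zip(boundaries, cumulative))

# Hypothetical grouped table.
boundaries = [0, 10, 20, 40, 70]
frequencies = [5, 12, 16, 6]

points = ogive_points(boundaries, frequencies)
print(points)  # [(0, 0), (10, 5), (20, 17), (40, 33), (70, 39)]
```

The last cumulative value equals the total frequency, which is a quick check that no class was skipped.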

Reading median and quartiles from an ogive

The median corresponds to the value at which cumulative frequency equals half the total frequency. Quartiles correspond to cumulative frequencies equal to 25% and 75% of the total. Cause: cumulative frequency ranks data from smallest to largest; effect: horizontal lines at those cumulative counts meet the ogive and project to the horizontal axis to give estimated values. The accuracy depends on class widths and the assumption of uniform distribution within classes.
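Reading a value off an ogive drawn with straight line segments is equivalent to linear interpolation inside the class that contains the target cumulative count. The sketch below assumes that straight-line model; the function name `value_at_cumulative` and the sample table (total frequency 40) are hypothetical.

```python
def value_at_cumulative(boundaries, frequencies, target):
    """Estimate the data value at a given cumulative frequency.

    Assumes observations are spread uniformly inside each class,
    i.e. the ogive is joined with straight line segments.
    """
    running = 0  # cumulative frequency at the lower boundary of the class
    for lo, hi, f in zip(boundaries, boundaries[1:], frequencies):
        if f > 0 and running + f >= target:
            # Fraction of the way through this class, scaled to its width.
            return lo + (target - running) / f * (hi - lo)
        running += f
    raise ValueError("target exceeds the total frequency")

# Hypothetical grouped table with total frequency n = 40.
boundaries = [0, 10, 20, 40, 70]
frequencies = [5, 12, 16, 7]
n = sum(frequencies)

median = value_at_cumulative(boundaries, frequencies, n / 2)
lower_quartile = value_at_cumulative(boundaries, frequencies, n / 4)
upper_quartile = value_at_cumulative(boundaries, frequencies, 3 * n / 4)
print(lower_quartile, median, upper_quartile)
```

Here the median target of 20 falls in the 20 to 40 class, 3 counts past the 17 accumulated below it, giving 20 + (3 ÷ 16) × 20 = 23.75. A smooth hand-drawn curve would give slightly different readings.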

Appropriate uses and limitations

Histograms suit continuous data grouped into intervals and reveal distribution shape; cumulative frequency graphs suit estimation of medians, quartiles and percentiles. Cause: histograms show local frequency density while ogives show accumulated totals; effect: chosen diagram depends on whether the purpose is distribution shape or percentile estimation. Limitations include loss of detail due to grouping, sensitivity to the choice of class boundaries and widths, and approximation errors from assuming uniform distribution within classes.

Key notes

Important points to keep in mind

Grouped data gives approximate summaries; exact values are not recoverable from aggregated classes.

Use frequency density when class widths differ so bar area equals frequency.

When class widths are equal, bar heights may equal frequency and the vertical axis can be labelled frequency.

Cumulative frequency graphs use cumulative totals and allow median and quartile estimation by interpolation.

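For ogive construction, plot each cumulative frequency at the upper class boundary (a "less than" ogive); lower boundaries apply only to a "more than" ogive that accumulates from the top class downwards. Label which type is used.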

Check histogram scaling by verifying that total bar area equals total frequency when heights are frequency densities.

Interpolation on cumulative frequency assumes uniform distribution inside each class, producing approximate results.

Clearly label axes: include units, class boundaries, and indicate 'frequency density' when used.

Choice of class width and starting point affects histogram shape and estimated statistics.

Histograms represent continuous grouped data; bar charts represent categorical data and should not be used for continuous distributions.