Histograms and Cumulative Frequency for Grouped Data
Key concepts
What you'll likely be quizzed about
Grouped data and class intervals
Grouped data places raw values into class intervals and records the frequency in each interval. The class interval gives range boundaries and the frequency gives the count of observations inside that range. Limiting factors include loss of exact values and dependence of summaries on chosen interval endpoints and widths, so estimates from grouped data are approximate.
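The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `group_into_classes`, the convention that each class includes its lower boundary and excludes its upper boundary, and the raw values are all assumptions made for the example.

```python
from bisect import bisect_right

def group_into_classes(values, boundaries):
    """Count values in each class [boundaries[i], boundaries[i+1]).

    Assumed convention: lower boundary included, upper boundary excluded.
    Values outside the overall range are ignored.
    """
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        if boundaries[0] <= v < boundaries[-1]:
            # bisect_right locates the class whose lower boundary is <= v
            counts[bisect_right(boundaries, v) - 1] += 1
    return counts

# Hypothetical raw data, for illustration only
raw = [3, 7, 12, 15, 18, 21, 22, 29, 34, 38]
boundaries = [0, 10, 20, 30, 40]
frequencies = group_into_classes(raw, boundaries)
print(frequencies)  # [2, 3, 3, 2]
```

Once the values are grouped, only the counts remain: the table no longer shows whether the two values in the first class were 3 and 7 or 0 and 9, which is the loss of exact values described above.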
Histogram definition and purpose
A histogram displays grouped continuous data using adjacent bars; each bar spans a class interval on the horizontal axis, and its height is the frequency (when all class widths are equal) or the frequency density (when widths differ). With that choice of height, the area of each bar is proportional to the number (or proportion) of observations in that interval. Histograms show distribution shape, skewness and modes, and guide decisions about central tendency and spread.
Histograms with equal class widths
When class widths are equal, bar heights can represent frequency directly. Cause: every bar has the same width, so each bar's area is the same constant multiple of its height; effect: comparing heights is equivalent to comparing frequencies. Construction steps include marking class boundaries on the horizontal axis, drawing adjacent bars for each class, and setting bar heights equal to class frequency.
Histograms with unequal class widths and frequency density
Unequal class widths cause bar areas to misrepresent frequency if heights equal frequency. Cause: varying widths change area even when heights are constant; effect: areas no longer match frequencies. Frequency density fixes this by defining density = frequency ÷ class width. Bar heights equal frequency density so that area = (frequency density × width) = frequency. Limiting factor: units on the vertical axis become 'frequency per unit width', which requires clear labelling.
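A short worked calculation makes the definition concrete. The sketch below assumes a hypothetical grouped table (boundaries 0, 10, 20, 40, 80 with frequencies 4, 12, 16, 8); the function name `frequency_densities` is also just an illustrative choice.

```python
def frequency_densities(boundaries, frequencies):
    """Return frequency density (frequency / class width) for each class."""
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    return [f / w for f, w in zip(frequencies, widths)]

# Hypothetical grouped data with unequal class widths
boundaries = [0, 10, 20, 40, 80]   # class widths: 10, 10, 20, 40
frequencies = [4, 12, 16, 8]

densities = frequency_densities(boundaries, frequencies)
print(densities)  # [0.4, 1.2, 0.8, 0.2]
```

The 20 to 40 class has the largest frequency (16) but not the tallest bar: its density of 0.8 is lower than the 1.2 of the 10 to 20 class, because its observations are spread over twice the width.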
Constructing histograms step-by-step
Choose class boundaries and record frequencies for each interval. If widths vary, calculate frequency density for each class. Draw a horizontal axis with class boundaries and adjacent vertical bars spanning those boundaries. Label the vertical axis with frequency or frequency density as appropriate. Check that the sum of bar areas equals total frequency to confirm correct scaling.
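The final check in the steps above can be automated. The following sketch builds each bar as a (left edge, width, height) tuple and confirms that the areas add up to the total frequency; the data and the helper names `histogram_bars` and `total_area` are assumptions for illustration, and no plotting library is used.

```python
import math

def histogram_bars(boundaries, frequencies):
    """Return one (left_edge, width, height) tuple per class.

    Height is frequency density, so the same code works for
    equal and unequal class widths.
    """
    bars = []
    for lo, hi, f in zip(boundaries, boundaries[1:], frequencies):
        width = hi - lo
        bars.append((lo, width, f / width))
    return bars

def total_area(bars):
    """Sum of width * height over all bars."""
    return sum(width * height for _, width, height in bars)

# Hypothetical grouped data
boundaries = [0, 10, 20, 40, 80]
frequencies = [4, 12, 16, 8]

bars = histogram_bars(boundaries, frequencies)
# The scaling is correct if the total bar area equals the total frequency
assert math.isclose(total_area(bars), sum(frequencies))
```

If bar heights were mistakenly set to raw frequency with these unequal widths, the total area would be 4×10 + 12×10 + 16×20 + 8×40 = 800 rather than 40, and the check would fail.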
Cumulative frequency graphs (ogives)
A cumulative frequency graph plots cumulative totals against class boundaries to show how frequencies accumulate across classes. Cause: cumulative plotting sums frequencies progressively; effect: the resulting curve allows estimation of medians, quartiles and percentiles by reading values at specified cumulative counts. For the usual "less than" totals, construction plots each cumulative frequency at the upper class boundary of its class, starts from zero at the lower boundary of the first class, and joins the points with a smooth curve or with straight line segments. Limiting factor: reading values between plotted points with straight segments assumes a uniform distribution inside each class, so estimates are approximate.
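The points to plot can be computed as running totals. This is a minimal sketch assuming "less than" cumulative frequencies plotted at upper class boundaries, with a starting point of zero at the lowest boundary; the data and the name `ogive_points` are hypothetical.

```python
from itertools import accumulate

def ogive_points(boundaries, frequencies):
    """Return (boundary, cumulative frequency) pairs for a 'less than' ogive.

    The first point is (lowest boundary, 0); each later point pairs an
    upper class boundary with the running total up to that boundary.
    """
    cumulative = [0] + list(accumulate(frequencies))
    return list(zip(boundaries, cumulative))

# Hypothetical grouped data
boundaries = [0, 10, 20, 40, 80]
frequencies = [4, 12, 16, 8]

points = ogive_points(boundaries, frequencies)
print(points)  # [(0, 0), (10, 4), (20, 16), (40, 32), (80, 40)]
```

The last cumulative value always equals the total frequency (40 here), which gives a quick check on the arithmetic before plotting.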
Reading median and quartiles from an ogive
The median corresponds to the value at which cumulative frequency equals half the total frequency. Quartiles correspond to cumulative frequencies equal to 25% and 75% of the total. Cause: cumulative frequency ranks data from smallest to largest; effect: horizontal lines at those cumulative counts meet the ogive and project to the horizontal axis to give estimated values. The accuracy depends on class widths and the assumption of uniform distribution within classes.
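Reading a value off a straight-line ogive is equivalent to linear interpolation between the two plotted points on either side of the target cumulative count. The sketch below shows that calculation under stated assumptions: the ogive points are hypothetical, the points are joined by straight segments (a smooth hand-drawn curve may give slightly different readings), and `estimate_value` is an illustrative name.

```python
def estimate_value(points, target):
    """Estimate the data value at a given cumulative frequency.

    points: (boundary, cumulative frequency) pairs in increasing order.
    Uses linear interpolation, i.e. assumes values are spread
    uniformly within each class.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the cumulative frequency range")

# Hypothetical ogive points: total frequency is 40
points = [(0, 0), (10, 4), (20, 16), (40, 32), (80, 40)]
total = points[-1][1]

lower_quartile = estimate_value(points, 0.25 * total)  # cumulative count 10
median = estimate_value(points, 0.50 * total)          # cumulative count 20
upper_quartile = estimate_value(points, 0.75 * total)  # cumulative count 30

print(lower_quartile, median, upper_quartile)  # 15.0 25.0 37.5
```

For the median, the target count of 20 falls in the 20 to 40 class, which starts at cumulative frequency 16 and ends at 32, so the estimate is 20 + (20 − 16) ÷ (32 − 16) × 20 = 25. The interquartile range estimate is 37.5 − 15 = 22.5.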
Appropriate uses and limitations
Histograms suit continuous data grouped into intervals and reveal distribution shape; cumulative frequency graphs suit estimation of medians, quartiles and percentiles. Cause: histograms show local frequency density while ogives show accumulated totals; effect: chosen diagram depends on whether the purpose is distribution shape or percentile estimation. Limitations include loss of detail due to grouping, sensitivity to the choice of class boundaries and widths, and approximation errors from assuming uniform distribution within classes.
Key notes
Important points to keep in mind