Mean, Median & the Shape of Data
When people hear “average,” most think of a single number. But statisticians have several ways to describe the middle of a dataset, and those measures don’t always agree. Let’s see why that matters.
Part 1: The Mean — Add Them Up, Divide
The mean (arithmetic average) is probably what you learned first: add up all the values, then divide by how many values there are.
Think of it as the balance point of the data. If you placed each data point as a weight on a number line, the mean is where the line would balance.
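One way to check the balance-point idea is to measure how far each point sits from the mean and add those distances up, keeping the signs. The sketch below uses five made-up values (they are not from the lesson's sliders). The distances to the left of the mean cancel the distances to the right.

```python
from statistics import mean

# Five made-up data points, chosen only for illustration.
points = [2, 4, 5, 7, 12]
center = mean(points)  # (2 + 4 + 5 + 7 + 12) / 5 = 6

# Signed distance of each point from the mean:
# negative means left of the mean, positive means right of it.
deviations = [x - center for x in points]
total = sum(deviations)  # -4 - 2 - 1 + 1 + 6 = 0

print("mean:", center)
print("deviations:", deviations)
print("sum of deviations:", total)
```

The sum is zero for any dataset, which is what "balancing" means here.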
Drag the sliders below to place five data points and watch the mean move:
Each bump is a data point, and the pink spike marks the mean.
Try this: Keep four points clustered near 5, then drag Point 5 all the way to 20. Watch how much the mean shifts! One extreme value can pull the mean far from where most of the data lives. This is why the mean is called sensitive to outliers.
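If you don't have the sliders handy, the same experiment can be run in a few lines. This is a minimal sketch with made-up values: four points near 5, and a fifth point that is either 5 or 20.

```python
def mean(values):
    """Add the values up, then divide by how many there are."""
    return sum(values) / len(values)

# Made-up slider positions: four points near 5, plus Point 5.
clustered = [4, 5, 5, 6, 5]      # Point 5 sits with the others
with_outlier = [4, 5, 5, 6, 20]  # Point 5 dragged out to 20

print(mean(clustered))     # 25 / 5 = 5.0
print(mean(with_outlier))  # 40 / 5 = 8.0
```

With the outlier, the mean lands at 8, which is higher than four of the five points.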
Part 2: The Median — The Middle Value
The median is the value right in the middle when you sort the data from smallest to largest. Half the data falls below it, half above.
For five points, the median is simply the 3rd value after sorting.
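Here is a small sketch of that procedure: sort, then pick the middle. The lesson only covers an odd number of points. For an even count, the sketch assumes the common convention of averaging the two middle values.

```python
def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Five made-up points: sorted they are 2, 4, 5, 7, 9, so the 3rd value is 5.
print(median([9, 2, 7, 4, 5]))
```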
Mean vs. Median: When data is symmetric (the left and right sides mirror each other), the mean and median are close together. When data is skewed (pulled to one side by extreme values), they separate. The median resists outliers: it barely moves even if one value is extreme. That’s why figures like household income are often reported as a median instead of a mean.
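To see the two measures separate, the sketch below keeps four made-up points fixed and pushes the fifth further and further out, recording the mean and median each time. It uses the `mean` and `median` functions from Python's standard `statistics` module.

```python
from statistics import mean, median

base = [4, 5, 5, 6]  # four made-up points clustered near 5

results = []
for last_point in (5, 20, 200):
    data = base + [last_point]
    results.append((last_point, mean(data), median(data)))

for last_point, avg, mid in results:
    print(f"fifth point = {last_point:>3}  mean = {avg:>4}  median = {mid}")
# The mean climbs from 5 to 8 to 44; the median is 5 every time.
```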
Part 3: Symmetric vs. Skewed Distributions
Not all datasets are shaped the same. The shape of the data tells a story.
Normal (Symmetric) Distribution
A bell curve is the classic symmetric shape. The mean and median sit at the center together:
Drag the center slider to shift the whole bell left or right. Drag the spread slider to make it wider (more spread out) or narrower (more concentrated).
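The two sliders correspond to the two numbers that define a bell curve: its center and its spread. As a rough stand-in for the widget, the sketch below draws random samples with `random.gauss` and reports the mean, median, and standard deviation. The helper name `sample_bell`, the sample size, and the seed are all illustrative choices, not part of the lesson.

```python
import random
from statistics import mean, median, stdev

def sample_bell(center, spread, n=10_000, seed=0):
    """Draw n values from a bell curve with the given center and spread."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.gauss(center, spread) for _ in range(n)]

narrow = sample_bell(center=0, spread=1)
shifted_wide = sample_bell(center=3, spread=2)

for name, data in (("narrow", narrow), ("shifted_wide", shifted_wide)):
    print(name,
          "mean:", round(mean(data), 2),
          "median:", round(median(data), 2),
          "std dev:", round(stdev(data), 2))
```

In both samples the mean and median should land very close to each other, near the chosen center, while the standard deviation tracks the chosen spread.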
Right-Skewed Distribution
In a right-skewed distribution, a long tail stretches to the right. This happens when a few very large values sit far above the rest of the data:
Think of incomes: most people earn moderate amounts, but a few billionaires pull the tail way out to the right. In right-skewed data, the mean is greater than the median because the mean gets dragged toward the tail.
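A quick simulation can illustrate the gap. The sketch below uses random draws from an exponential distribution (`random.expovariate`) as a stand-in for right-skewed data. These are not real incomes; the distribution, sample size, and seed are assumptions chosen only because they produce a long right tail.

```python
import random
from statistics import mean, median

rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated right-skewed data: many small values, a few large ones.
values = [rng.expovariate(1.0) for _ in range(10_000)]

avg = mean(values)
mid = median(values)

print(f"mean = {avg:.2f}, median = {mid:.2f}")
print("mean > median:", avg > mid)
```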
Left-Skewed Distribution
A left-skewed distribution has a long tail to the left:
Think of exam scores where most students do well but a few score very low. Here the mean is less than the median.
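The exam example can be worked by hand with a tiny class. The eight scores below are invented for illustration: six students do well and two score very low.

```python
from statistics import mean, median

# Invented exam scores: mostly high, with two very low results.
scores = [92, 88, 95, 90, 85, 91, 30, 45]

avg = mean(scores)    # 616 / 8 = 77
mid = median(scores)  # middle pair is 88 and 90, so the median is 89

print("mean:", avg)
print("median:", mid)
print("mean < median:", avg < mid)
```

The two low scores drag the mean down to 77, even though six of the eight students scored 85 or higher.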
Rule of thumb:
- Symmetric: Mean ≈ Median
- Right-skewed: Mean > Median (tail pulls mean right)
- Left-skewed: Mean < Median (tail pulls mean left)
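The rule of thumb can be turned into a rough checker. The function below is a sketch, not a standard test: the name `describe_shape` and the `tolerance` cutoff are hypothetical, and the gap between mean and median is scaled by the standard deviation so the cutoff does not depend on the units of the data. Keep in mind that this is a heuristic and can mislead on unusual datasets.

```python
from statistics import mean, median, pstdev

def describe_shape(values, tolerance=0.1):
    """Guess the shape from the gap between mean and median.

    tolerance is a hypothetical cutoff: a gap smaller than
    `tolerance` standard deviations counts as "symmetric".
    """
    spread = pstdev(values)
    if spread == 0:
        # All values identical: mean and median coincide.
        return "symmetric"
    gap = (mean(values) - median(values)) / spread
    if gap > tolerance:
        return "right-skewed"   # mean pulled right of the median
    if gap < -tolerance:
        return "left-skewed"    # mean pulled left of the median
    return "symmetric"

# Made-up datasets for illustration.
print(describe_shape([1, 2, 3, 4, 5]))
print(describe_shape([4, 5, 5, 6, 20]))
print(describe_shape([92, 88, 95, 90, 85, 91, 30, 45]))
```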
Part 4: Spread — How Far Apart Is the Data?
Two datasets can have the same mean but very different spreads. Compare these two bell curves:
Both curves are centered at zero, but the wider curve represents data with more variability. The standard deviation measures this spread: a bigger standard deviation means the data is more spread out around the mean.
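To make "spread out around the mean" concrete, here is a sketch that computes a standard deviation step by step for two made-up datasets that share a mean of zero. The lesson does not say which version of the formula it uses; this sketch assumes the population version, which divides by the number of values (the sample version divides by one less).

```python
import math

def std_dev(values):
    """Population standard deviation (assumed version: divide by n)."""
    center = sum(values) / len(values)
    # Square each distance from the mean so left and right don't cancel.
    squared = [(x - center) ** 2 for x in values]
    return math.sqrt(sum(squared) / len(values))

# Two made-up datasets, both with mean 0.
tight = [-1, -1, 0, 1, 1]
wide = [-6, -3, 0, 3, 6]

for name, data in (("tight", tight), ("wide", wide)):
    print(name,
          "mean:", sum(data) / len(data),
          "std dev:", round(std_dev(data), 2))
```

Both means are zero, but the second dataset's standard deviation is several times larger because its points sit further from the center.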
Challenge: Imagine two classes both scored a mean of 75 on a test. Class A has a standard deviation of 5, and Class B has a standard deviation of 15. Which class had more consistent scores? (Use the sliders above to visualize the answer!)
Wrapping Up
| Concept | What It Tells You |
|---|---|
| Mean | The balance point — sensitive to outliers |
| Median | The middle value — resistant to outliers |
| Symmetric | Mean and median are close |
| Skewed | Mean gets pulled toward the tail |
| Standard deviation | How spread out the data is |
Understanding these ideas is the foundation of statistics. Whenever you see a dataset, ask yourself: Where is the center? How spread out is it? Is it symmetric or skewed? Those three questions tell you most of the story.