Statistics

Mean, Median & the Shape of Data

When people hear “average,” most think of a single number. But statisticians have several ways to describe the middle of a dataset, and those measures don’t always agree. Let’s see why that matters.

Part 1: The Mean — Add Them Up, Divide

The mean (arithmetic average) is probably what you learned first:

\text{mean} = \frac{\text{sum of all values}}{\text{number of values}}

Think of it as the balance point of the data. If you placed each data point as a weight on a number line, the mean is where the line would balance.

Drag the sliders below to place five data points and watch the mean move:

[Interactive: five sliders, Point 1 to Point 5, each ranging from 0 to 20. Starting values: 3, 5, 7, 8, 9.]

\text{mean} = \frac{3 + 5 + 7 + 8 + 9}{5} = 6.4

Each bump is a data point. The pink spike marks the mean. Drag the sliders and watch the points and mean move together on the number line.
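The calculation above can be sketched in a few lines of Python. This is a minimal illustration using the five starting values (3, 5, 7, 8, 9); the variable names are chosen here for the example. It also checks the balance-point idea: distances from the mean, counted as negative on the left and positive on the right, cancel out.

```python
# Five example data points (the starting slider values).
points = [3, 5, 7, 8, 9]

# Mean: add the values up, then divide by how many there are.
mean = sum(points) / len(points)
print(mean)  # 6.4

# Balance-point check: deviations from the mean sum to zero
# (up to tiny floating-point rounding error).
deviations = [x - mean for x in points]
balance = sum(deviations)
print(abs(balance) < 1e-9)  # True
```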

Try This

Keep four points clustered near 5, then drag Point 5 all the way to 20. Watch how much the mean shifts! One extreme value can pull the mean far from where most of the data lives. This is why the mean is called sensitive to outliers.
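The same experiment can be run without the sliders. The sketch below assumes four points near 5 (the exact values 4, 5, 5, 6 are made up for illustration) and changes only the fifth point.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

# Hypothetical data: four points clustered near 5, plus a fifth point.
clustered = [4, 5, 5, 6, 5]
with_outlier = [4, 5, 5, 6, 20]  # same data, fifth point dragged to 20

print(mean(clustered))     # 5.0
print(mean(with_outlier))  # 8.0
```

Only one of the five values changed, yet the mean moved from 5 to 8, which is higher than four of the five points.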

Part 2: The Median — The Middle Value

The median is the value right in the middle when you sort the data from smallest to largest. Half the data falls below it, half above.

For five points, the median is simply the 3rd value after sorting.
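As a rough sketch, finding the median of five points takes two steps: sort, then pick the value at the middle position. The helper name `median_of_odd` is hypothetical; Python's standard `statistics.median` is shown alongside it for comparison.

```python
import statistics

def median_of_odd(values):
    """Median of an odd-length list: the middle value after sorting."""
    ordered = sorted(values)
    middle = len(ordered) // 2   # index 2 for five values (the 3rd value)
    return ordered[middle]

points = [8, 3, 9, 5, 7]          # unsorted on purpose
print(sorted(points))             # [3, 5, 7, 8, 9]
print(median_of_odd(points))      # 7
print(statistics.median(points))  # 7
```

With an even number of points there is no single middle value; the usual convention, which `statistics.median` follows, is to average the two middle values.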

Connection

Mean vs. Median: When data is symmetric (the left and right sides mirror each other), the mean and median are close together. When data is skewed (pulled to one side by extreme values), they separate. The median resists outliers: it barely moves, if at all, when one value becomes extreme. That’s why things like household income are often reported as median instead of mean.
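To make the income example concrete, here is a small sketch with invented numbers: five moderate household incomes and one very large one. The figures are hypothetical and chosen only to show the effect.

```python
import statistics

# Hypothetical annual household incomes; the last one is an extreme value.
incomes = [30_000, 35_000, 40_000, 45_000, 50_000, 1_000_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean:   {mean_income:,.0f}")    # mean:   200,000
print(f"median: {median_income:,.0f}")  # median: 42,500
```

The mean of 200,000 is larger than five of the six incomes, while the median of 42,500 still describes a typical household in this made-up group.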

Part 3: Symmetric vs. Skewed Distributions

Not all datasets are shaped the same. The shape of the data tells a story.

Normal (Symmetric) Distribution

A bell curve is the classic symmetric shape. The mean and median sit at the center together:

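[Interactive: bell curve with two sliders. Center (mean): starting value 0, range -5 to 5. Spread (std dev): starting value 1.5, range 0.5 to 4.]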

Drag the center slider to shift the whole bell left or right. Drag the spread slider to make it wider (more spread out) or narrower (more concentrated).

Right-Skewed Distribution

In a right-skewed distribution, a long tail stretches to the right. This happens when a few very large values sit far above where most of the data lies.

Think of incomes: most people earn moderate amounts, but a few billionaires pull the tail way out to the right. In right-skewed data, the mean is typically greater than the median because the mean gets dragged toward the tail.

Left-Skewed Distribution

A left-skewed distribution has a long tail to the left.

Think of exam scores where most students do well but a few score very low. Here the mean is typically less than the median.

Try This

Rule of thumb:

Symmetric: Mean ≈ Median
Right-skewed: Mean > Median (tail pulls mean right)
Left-skewed: Mean < Median (tail pulls mean left)
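The rule of thumb can be checked on simulated data. The sketch below is an illustration under assumptions not taken from the lesson: it uses a normal distribution for the symmetric case and an exponential distribution (and its mirror image) for the skewed cases, with a fixed random seed so the run is repeatable.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Three simulated samples (the distributions are assumptions for illustration).
symmetric = [rng.gauss(0, 1) for _ in range(n)]          # bell-shaped
right_skewed = [rng.expovariate(1.0) for _ in range(n)]  # long tail to the right
left_skewed = [-rng.expovariate(1.0) for _ in range(n)]  # mirror image: tail to the left

def mean_minus_median(sample):
    """Positive when the mean sits to the right of the median."""
    return statistics.mean(sample) - statistics.median(sample)

for name, sample in [("symmetric", symmetric),
                     ("right-skewed", right_skewed),
                     ("left-skewed", left_skewed)]:
    print(f"{name:>12}: mean - median = {mean_minus_median(sample):+.3f}")
```

The difference should come out near zero for the symmetric sample, clearly positive for the right-skewed one, and clearly negative for the left-skewed one.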

Part 4: Spread — How Far Apart Is the Data?

Two datasets can have the same mean but very different spreads. Compare these two bell curves:

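[Interactive: two bell curves with one spread slider each. Spread of Curve A: starting value 0.8, range 0.3 to 3. Spread of Curve B: starting value 2, range 0.3 to 3.]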

Both curves are centered at zero, but the wider curve represents data with more variability. The standard deviation measures this spread: the larger it is, the farther the data typically lies from the mean.
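Here is one way to compute a standard deviation by hand. The lesson does not say which formula it uses, so this sketch assumes the population version (divide by the number of values), which matches Python's `statistics.pstdev`; `statistics.stdev` divides by one less than the number of values and gives a slightly larger result. The two datasets and the helper name `population_std_dev` are made up for the example.

```python
import math
import statistics

# Two hypothetical datasets, both centered at zero.
narrow = [-2, -1, 0, 1, 2]
wide = [-6, -3, 0, 3, 6]

def population_std_dev(values):
    """Square root of the average squared distance from the mean."""
    m = sum(values) / len(values)
    squared_distances = [(x - m) ** 2 for x in values]
    return math.sqrt(sum(squared_distances) / len(values))

for name, data in [("narrow", narrow), ("wide", wide)]:
    print(name, round(population_std_dev(data), 3))
# narrow 1.414
# wide 4.243
```

Both datasets have a mean of zero, but the second has three times the standard deviation because every point sits three times as far from the center.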

Challenge

Imagine two classes both scored a mean of 75 on a test. Class A has a standard deviation of 5, and Class B has a standard deviation of 15. Which class had more consistent scores? (Use the sliders above to visualize the answer!)

Wrapping Up

Concept	What It Tells You
Mean	The balance point — sensitive to outliers
Median	The middle value — resistant to outliers
Symmetric	Mean and median are close
Skewed	Mean gets pulled toward the tail
Standard deviation	How spread out the data is

Understanding these ideas is the foundation of statistics. Whenever you see a dataset, ask yourself: Where is the center? How spread out is it? Is it symmetric or skewed? Those three questions tell you most of the story.
