Intro to Probability
What are the chances? That question comes up all the time — from predicting weather to figuring out your odds of winning a game. Probability gives us a way to measure how likely something is, and when we graph it, beautiful shapes emerge. Let’s explore two of the most important ones.
Probability as Area
Here is a key idea that connects probability to graphs: for a continuous quantity, the probability that the outcome falls in a given range is the area under the curve over that range. The total area under any probability distribution curve always equals 1 (meaning a 100% chance that something happens).
Think of it this way: if a ball lands at a random spot in the region under a distribution curve, the area under any section tells you the chance the ball lands in that section. More area = more likely.
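To make the area idea concrete, here is a minimal sketch in Python. It assumes a made-up density, f(x) = 2x on the interval from 0 to 1, chosen only because its areas are easy to check by hand. The helper names `density` and `area_under` are illustrative, not part of any library.

```python
def density(x):
    """A made-up density on [0, 1]: f(x) = 2x, a straight ramp (an assumption for illustration)."""
    return 2 * x if 0 <= x <= 1 else 0.0

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total = area_under(density, 0, 1)         # area under the whole curve
left_half = area_under(density, 0, 0.5)   # chance of landing in the left half
right_half = area_under(density, 0.5, 1)  # chance of landing in the right half

print(round(total, 6), round(left_half, 6), round(right_half, 6))
```

The three numbers should come out very close to 1, 0.25, and 0.75: the right half of the ramp holds three times as much area as the left half, so landing there is three times as likely.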
The Normal Distribution (The Bell Curve)
The normal distribution is the most famous shape in all of statistics. It shows up everywhere — test scores, heights, measurement errors, and more.
It is defined by two numbers:
- Mean (mu) — the center of the bell, where the peak sits
- Standard deviation (sigma) — how spread out the data is
Use the sliders to reshape the bell curve:
Experiment with these:
- Slide mu left and right — the whole bell slides with it. The mean is the center!
- Increase sigma — the bell gets wider and shorter. The data is more “spread out.”
- Decrease sigma toward 0.3 — the bell gets tall and narrow. The data clusters tightly around the mean.
- Notice: no matter what you do, the total area under the curve stays at 1.
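If you do not have the sliders handy, the same observations can be checked numerically. The sketch below assumes the standard formula for the normal curve and uses illustrative helper names (`normal_pdf`, `total_area`). It prints the peak height and the total area for a few hand-picked values of mu and sigma.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def total_area(mu, sigma, steps=20_000):
    """Midpoint-rule area over mu +/- 10 sigma, which covers essentially the whole curve."""
    a, b = mu - 10 * sigma, mu + 10 * sigma
    width = (b - a) / steps
    return sum(normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps)) * width

# Example settings picked for illustration; they are not taken from the page's widget.
for mu, sigma in [(0, 1), (2, 1), (0, 2), (0, 0.3)]:
    peak = normal_pdf(mu, mu, sigma)
    print(f"mu={mu}, sigma={sigma}: peak height={peak:.3f}, area={total_area(mu, sigma):.4f}")
```

Moving mu leaves the peak height unchanged, a larger sigma lowers the peak, a smaller sigma raises it, and the area stays at about 1 in every case.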
The 68-95-99.7 Rule
For any normal distribution:
- About 68% of values fall within 1 standard deviation of the mean
- About 95% fall within 2 standard deviations
- About 99.7% fall within 3 standard deviations
This is why the bell curve is so useful — once you know the mean and standard deviation, you can predict where almost all the data will land.
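These percentages can be reproduced with a few lines of Python. The sketch below relies on a standard identity: the chance that a normal value lies within k standard deviations of its mean equals erf(k / sqrt(2)). The function name `within_k_sigma` is illustrative.

```python
import math

def within_k_sigma(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within_k_sigma(k):.4f}")
```

The results, roughly 0.6827, 0.9545, and 0.9973, show that the numbers in the rule are rounded, and they do not depend on the particular mean or standard deviation.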
Comparing Bell Curves
Here are three normal distributions with different standard deviations, all centered at zero. Watch how sigma controls the shape:
The smaller the standard deviation, the taller and narrower the peak. A small sigma means the data is very consistent. A large sigma means lots of variation.
Imagine three classes taking the same test. The class with sigma = 0.5 had very similar scores (everyone studied about the same). The class with sigma = 2.0 had scores all over the place — some aced it, some didn’t. Same average, very different spreads.
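The classroom picture can be simulated. This is a rough sketch under stated assumptions: scores are drawn from a normal distribution centered at 0, the "window" of 1 unit around the mean is an arbitrary choice, and the function name `share_near_mean` is made up for this example.

```python
import random

def share_near_mean(sigma, mean=0.0, window=1.0, trials=20_000, seed=0):
    """Simulate scores and return the share that lands within `window` of the mean."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    hits = sum(abs(rng.gauss(mean, sigma) - mean) <= window for _ in range(trials))
    return hits / trials

for sigma in (0.5, 2.0):
    print(f"sigma={sigma}: {share_near_mean(sigma):.1%} of scores within 1 unit of the mean")
```

With sigma = 0.5 the large majority of simulated scores should sit close to the average, while with sigma = 2.0 well under half do, even though both classes have the same mean.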
The Binomial Distribution
The binomial distribution answers a different question: if you repeat an experiment n times, and each trial independently succeeds with probability p, what is the chance of getting exactly k successes?
Think of flipping a coin n times — how many heads will you get?
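For the coin example, the exact answer comes from the standard binomial formula: the number of ways to choose which k of the n trials succeed, times p^k, times (1-p)^(n-k). Here is a minimal sketch. The function name `binomial_pmf` and the choice of 10 flips are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Chance of exactly k successes in n independent trials, each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5  # ten flips of a fair coin (example values)
for k in range(n + 1):
    print(f"{k} heads: {binomial_pmf(k, n, p):.4f}")
```

Exactly 5 heads in 10 fair flips has probability 252/1024, a little under 25%, and the probabilities for all possible k add up to 1.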
We can approximate the binomial distribution with a smooth curve. Adjust the number of trials (n) and the probability of success (p):
Experiment with these:
- Set p = 0.5 (fair coin) and increase n: the curve shifts right and gets wider. Its center sits at np, and more flips mean more spread in the number of heads.
- Keep n = 20 and slide p from 0.1 to 0.9 — watch the peak shift! When p is small, most outcomes cluster near zero. When p is large, they cluster near n.
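- Set p = 0.5 and n = 1: the curve is very wide compared with the range of possible outcomes (0 to 1). With just one trial, anything can happen. Now slide n up to 40: the curve gets wider in absolute terms but narrower relative to the range 0 to n, so the fraction of successes becomes more predictable!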
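The effect of n can be summarized with two numbers: the mean np and the standard deviation sqrt(np(1-p)) of the binomial distribution. The sketch below tabulates them for a fair coin, along with the spread relative to n. The helper name `binomial_mean_sd` and the list of n values are illustrative.

```python
import math

def binomial_mean_sd(n, p):
    """Mean and standard deviation of the number of successes in n trials."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

for n in (1, 10, 20, 40):
    mean, sd = binomial_mean_sd(n, 0.5)
    print(f"n={n}: mean={mean:.1f}, sd={sd:.2f}, sd/n={sd / n:.3f}")
```

The standard deviation itself grows with n, but sd/n shrinks. In other words, the count of heads spreads out while the fraction of heads settles closer to p.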
When Does Binomial Look Normal?
As n gets large, the binomial distribution starts to look like a normal distribution! This is a consequence of the Central Limit Theorem, one of the most powerful ideas in all of statistics.
A common rule of thumb is that the approximation is reasonable when both np and n(1-p) are at least 5.
This is actually what we plotted above — a normal approximation of the binomial distribution, with mean = np and standard deviation = sqrt(np(1-p)). Go back and set n = 30, p = 0.5 — it looks almost perfectly bell-shaped!
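To see how close the approximation is, one can compare the exact binomial probabilities with the heights of the normal curve that has mean np and standard deviation sqrt(np(1-p)). This is a rough sketch: the helper names are illustrative, and the second setting (p = 0.05) is an example picked because np falls below 5 there.

```python
import math

def binomial_pmf(k, n, p):
    """Exact chance of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def max_gap(n, p):
    """Largest difference between the exact probability and the normal curve over k = 0..n."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma)) for k in range(n + 1))

print(f"n=30, p=0.5:  largest gap = {max_gap(30, 0.5):.4f}")   # np = 15, rule of thumb satisfied
print(f"n=30, p=0.05: largest gap = {max_gap(30, 0.05):.4f}")  # np = 1.5, rule of thumb violated
```

For the fair coin the gap should be tiny, while for p = 0.05 it is noticeably larger, which matches the guidance about np and n(1-p).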
Side by Side: Shifting the Mean vs. Changing the Spread
Let’s put it all together. Here are two normal curves — one you control the mean of, and the other you control the spread of:
Challenge: Can you make the two curves overlap perfectly? Think about what values of the blue mean and red standard deviation would make them identical. Hint: the red curve is centered at x = 2 and the blue has sigma = 1.
Key Takeaways
- Probability = area under the curve. The total area is always 1.
- The normal distribution is defined by its mean (center) and standard deviation (spread).
- The binomial distribution counts successes in repeated trials, controlled by n (trials) and p (probability).
- As n grows, the binomial distribution approaches a normal distribution, a consequence of the Central Limit Theorem.
- Changing the mean shifts the curve left or right. Changing the standard deviation makes it wider or narrower.