Data & Probability

Intro to Probability

What are the chances? That question comes up all the time — from predicting weather to figuring out your odds of winning a game. Probability gives us a way to measure how likely something is, and when we graph it, beautiful shapes emerge. Let’s explore two of the most important ones.

Probability as Area

Here is a key idea that connects probability to graphs: the probability that an outcome falls in some range is the area under the curve over that range. The total area under any probability curve always equals 1 (meaning a 100% chance that something happens).

Connection

Think of it this way: if you drop a ball onto a distribution curve, the area under any section tells you the chance the ball lands in that region. More area = more likely.
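As a rough numerical check of the area idea, here is a small Python sketch. It uses a made-up triangular curve (not one from this lesson), chosen only because its area is easy to verify by hand. The names triangle_density and area_under are illustrative.

```python
def triangle_density(x):
    """Hypothetical curve: a triangle peaking at x = 0, zero outside [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def area_under(f, a, b, steps=1000):
    """Approximate the area under f between a and b (midpoint rule)."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total = area_under(triangle_density, -1.0, 1.0)   # the whole curve
section = area_under(triangle_density, 0.0, 0.5)  # one region of it
print(f"total area: {total:.3f}")                       # total area: 1.000
print(f"chance of landing in [0, 0.5]: {section:.3f}")  # 0.375
```

The whole curve has area 1, and the region from 0 to 0.5 holds 37.5% of it, so a ball dropped on this curve would land there about 37.5% of the time.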

The Normal Distribution (The Bell Curve)

The normal distribution is the most famous shape in all of statistics. It shows up everywhere — test scores, heights, measurement errors, and more.

It is defined by two numbers:

Mean (mu) — the center of the bell, where the peak sits
Standard deviation (sigma) — how spread out the data is

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}
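For readers who want to compute the curve themselves, the formula can be sketched in Python roughly as follows. The function name normal_pdf and its default arguments are choices made for this example.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for a given mean and standard deviation."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

peak = normal_pdf(0.0)                   # height at the mean when mu = 0, sigma = 1
shifted_peak = normal_pdf(2.0, mu=2.0)   # moving mu moves the peak, same height
print(f"{peak:.4f} {shifted_peak:.4f}")  # 0.3989 0.3989
```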

Use the sliders to reshape the bell curve:

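[Interactive plot: a bell curve with two sliders. Mean (mu): range -5 to 5, starting at 0. Std Dev (sigma): range 0.3 to 3, starting at 1.]

\mu = 0, \quad \sigma = 1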

Try This

Experiment with these:

Slide mu left and right — the whole bell slides with it. The mean is the center!
Increase sigma — the bell gets wider and shorter. The data is more “spread out.”
Decrease sigma toward 0.3 — the bell gets tall and narrow. The data clusters tightly around the mean.
Notice: no matter what you do, the total area under the curve stays at 1.

The 68-95-99.7 Rule

For any normal distribution:

About 68% of values fall within 1 standard deviation of the mean
About 95% fall within 2 standard deviations
About 99.7% fall within 3 standard deviations

This is why the bell curve is so useful — once you know the mean and standard deviation, you can predict where almost all the data will land.
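The three percentages can be checked with the error function in Python's standard library. This is a minimal sketch: share_within is a name invented here, and it relies on the standard identity that the share of a normal distribution within k standard deviations of the mean is erf(k / sqrt(2)).

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, f"{share_within(k):.2%}")
# 1 68.27%
# 2 95.45%
# 3 99.73%
```

The result does not depend on the mean or the standard deviation, which is why the rule holds for any normal distribution.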

Comparing Bell Curves

Here are three normal distributions with different standard deviations, all centered at zero. Watch how sigma controls the shape:

The smaller the standard deviation, the taller and narrower the peak. A small sigma means the data is very consistent. A large sigma means lots of variation.
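To put numbers on "taller and narrower", the height of a normal curve at its mean is 1 / (sigma * sqrt(2 * pi)). The sketch below evaluates that for three spreads. The values 0.5 and 2.0 are mentioned in this lesson, while 1.0 is an assumed middle value, and peak_height is an illustrative name.

```python
import math

def peak_height(sigma):
    """Height of a normal curve at its mean: 1 / (sigma * sqrt(2 * pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

# 0.5 and 2.0 appear in the lesson; 1.0 is an assumed middle value.
for sigma in (0.5, 1.0, 2.0):
    print(f"sigma = {sigma}: peak height = {peak_height(sigma):.4f}")
# sigma = 0.5: peak height = 0.7979
# sigma = 1.0: peak height = 0.3989
# sigma = 2.0: peak height = 0.1995
```

Doubling sigma halves the peak, which is what keeps the total area at 1 as the curve widens.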

Connection

Imagine three classes taking the same test. The class with sigma = 0.5 had very similar scores (everyone studied about the same). The class with sigma = 2.0 had scores all over the place — some aced it, some didn’t. Same average, very different spreads.

The Binomial Distribution

The binomial distribution answers a different question: if you repeat an experiment n times, and each trial has a p probability of success, what is the chance of getting exactly k successes?

Think of flipping a coin n times — how many heads will you get?

P(k) = \binom{n}{k} \, p^k \, (1-p)^{n-k}
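As a quick illustration, the formula can be written in Python with math.comb (available from Python 3.8). The function name binomial_pmf and the coin-flip numbers are chosen for this example.

```python
import math

def binomial_pmf(k, n, p):
    """Chance of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

# Chance of exactly 5 heads in 10 flips of a fair coin.
five_heads = binomial_pmf(5, n=10, p=0.5)
# Adding up every possible number of heads (0 through 10) should give 1.
all_outcomes = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(f"{five_heads:.4f}")  # 0.2461
```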

We can approximate the binomial distribution with a smooth curve. Adjust the number of trials (n) and the probability of success (p):

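[Interactive plot: a smooth curve approximating the binomial distribution, with two sliders. Trials (n): range 1 to 40, starting at 10. Probability (p): range 0.01 to 0.99, starting at 0.5.]

n = 10, \quad p = 0.5, \quad \text{mean} = np = 10 \cdot 0.5 = 5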

Try This

Experiment with these:

Set p = 0.5 (fair coin) and increase n: the curve gets wider and its center moves right, to n/2. More flips = more spread in the number of heads.
Keep n = 20 and slide p from 0.1 to 0.9 — watch the peak shift! When p is small, most outcomes cluster near zero. When p is large, they cluster near n.
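Set p = 0.5 and n = 1: the curve is wide compared with the range of possible outcomes (0 or 1). With just one trial, anything can happen. Now slide n up to 40: the curve gets wider in absolute terms, but narrower relative to n, so the fraction of successes becomes more predictable!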
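To see why more trials mean more predictability, compare the spread with the number of trials. The sketch below uses the standard deviation sqrt(np(1-p)) of a binomial distribution. The helper name binomial_spread is invented for this example.

```python
import math

def binomial_spread(n, p):
    """Return (standard deviation, standard deviation as a fraction of n)."""
    sd = math.sqrt(n * p * (1.0 - p))
    return sd, sd / n

for n in (1, 10, 40):
    sd, relative = binomial_spread(n, 0.5)
    print(f"n = {n:2d}: sd = {sd:.2f}, sd / n = {relative:.3f}")
# n =  1: sd = 0.50, sd / n = 0.500
# n = 10: sd = 1.58, sd / n = 0.158
# n = 40: sd = 3.16, sd / n = 0.079
```

The standard deviation grows with n, but only like the square root of n, so as a share of the possible range it keeps shrinking.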

When Does Binomial Look Normal?

As n gets large, the binomial distribution starts to look like a normal distribution! This is a special case of the Central Limit Theorem, one of the most powerful ideas in all of statistics.

A common rule of thumb: the approximation works well when both np and n(1-p) are at least 5.

Connection

This is actually what we plotted above — a normal approximation of the binomial distribution, with mean = np and standard deviation = sqrt(np(1-p)). Go back and set n = 30, p = 0.5 — it looks almost perfectly bell-shaped!
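One way to check how close the two shapes are is to compare the exact binomial probabilities with the normal curve at every possible count. The following is a minimal sketch under the mean and standard deviation given above. All function names are illustrative, and the gap it reports is only a rough measure of fit.

```python
import math

def binomial_pmf(k, n, p):
    """Exact chance of k successes in n trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def normal_approx(k, n, p):
    """Normal curve with mean np and standard deviation sqrt(np(1-p))."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    scale = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-((k - mu) ** 2) / (2.0 * sigma ** 2)) / scale

def rule_of_thumb_ok(n, p):
    """True when both np and n(1-p) are at least 5."""
    return n * p >= 5 and n * (1.0 - p) >= 5

n, p = 30, 0.5
largest_gap = max(
    abs(binomial_pmf(k, n, p) - normal_approx(k, n, p)) for k in range(n + 1)
)
print(rule_of_thumb_ok(n, p), f"largest gap: {largest_gap:.4f}")
```

Trying a small n with a p far from 0.5 (for example n = 10, p = 0.1) fails the rule of thumb and should show a visibly larger gap.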

Side by Side: Shifting the Mean vs. Changing the Spread

Let’s put it all together. Here are two normal curves. You control the mean of the blue one and the standard deviation of the red one:

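[Interactive plot: a blue and a red normal curve with two sliders. Blue Mean: range -4 to 4, starting at -2. Red Std Dev: range 0.3 to 3, starting at 0.8.]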

Challenge

Can you make the two curves overlap perfectly? Think about what values of the blue mean and red standard deviation would make them identical. Hint: the red curve is centered at x = 2 and the blue has sigma = 1.

Key Takeaways

Probability = area under the curve. The total area is always 1.
The normal distribution is defined by its mean (center) and standard deviation (spread).
The binomial distribution counts successes in repeated trials, controlled by n (trials) and p (probability).
As n grows, the binomial distribution approaches a normal distribution. This is a special case of the Central Limit Theorem.
Changing the mean shifts the curve left or right. Changing the standard deviation makes it wider or narrower.
