Confidence intervals

A 95% confidence interval for the mean isn’t “the range where μ lies with 95% probability.” It’s the range produced by a procedure that, across repeated experiments, covers the true μ in 95% of runs. The interval is random; μ is fixed.

\bar{X} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}
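
As a minimal sketch of how one such interval could be computed for a single sample, the following uses only Python's standard library, with the standard error taken as s/√n. The helper names (`t_pdf`, `t_cdf`, `t_crit`, `t_interval`) and the sample values are made up for illustration. The critical value is found numerically here; a library routine such as `scipy.stats.t.ppf` would be the more usual choice.

```python
import math
import statistics

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """CDF for x >= 0: Simpson's rule on [0, x] plus 0.5 from symmetry."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_crit(alpha: float, df: int) -> float:
    """Upper alpha/2 critical value t_{alpha/2, df}, found by bisection."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # grow the bracket until it contains the root
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(sample, level: float = 0.95):
    """Two-sided t interval for the mean: xbar +/- t * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    half = t_crit(1 - level, n - 1) * s / math.sqrt(n)
    return xbar - half, xbar + half

# Hypothetical data, for illustration only.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9, 5.2, 4.7]
low, high = t_interval(sample)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The interval is symmetric about the sample mean, and its half-width is the critical value times the estimated standard error.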

[Interactive figure: one interval per experiment. Legend: covers μ (98/100), misses μ (2/100); empirical coverage 98.0% (nominal 95%). Controls: source distribution; n (sample size) = 30; confidence level = 95%; repeated experiments = 100.]

Each row is one experiment of n samples, shown with its interval

\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}

A (1 − α) CI covers μ in roughly a fraction (1 − α) of repeated experiments. That is the correct reading, not the statement that μ has a 95% chance of sitting inside any one interval.

What to notice

Coverage ≈ stated level. Run 100 experiments from any source distribution with finite variance. Roughly 5 of them produce intervals that miss μ. The count fluctuates with the seed but averages to the nominal rate.
Sample size controls width, not coverage. Raising n shrinks each ribbon but doesn’t change how often they cover μ — that’s determined by the procedure, not the data.
Non-Normal sources still work at small n. The Student’s t interval is remarkably robust. Switch the source to Exponential at n = 10 and coverage usually lands within several points of 95%, typically a little below it. It degrades further for n < 5 or for heavy-tailed or strongly skewed distributions.
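
The three observations above can be checked with a small simulation. This is a sketch under stated assumptions, not the code behind the interactive figure: the function name `coverage`, the seed, the number of runs, and the two source distributions are choices made here for illustration, and the critical values are hard-coded from standard t tables rather than computed.

```python
import math
import random

# Two-sided 95% critical values of Student's t from standard tables,
# keyed by degrees of freedom (n - 1). Only n = 10 and n = 30 are supported.
T_CRIT_95 = {9: 2.262, 29: 2.045}

def coverage(draw, mu, n, runs=5000, seed=0):
    """Return (fraction of intervals covering mu, mean interval width)."""
    rng = random.Random(seed)
    t = T_CRIT_95[n - 1]
    hits = 0
    total_width = 0.0
    for _ in range(runs):
        xs = [draw(rng) for _ in range(n)]
        xbar = sum(xs) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        half = t * s / math.sqrt(n)
        if xbar - half <= mu <= xbar + half:
            hits += 1
        total_width += 2 * half
    return hits / runs, total_width / runs

# Each source is (sampler, true mean). Both have finite variance.
sources = {
    "Normal(0, 1)": (lambda rng: rng.gauss(0.0, 1.0), 0.0),
    "Exponential(1)": (lambda rng: rng.expovariate(1.0), 1.0),
}

for name, (draw, mu) in sources.items():
    for n in (10, 30):
        cov, width = coverage(draw, mu, n)
        print(f"{name:15s} n={n:2d}  coverage={cov:.3f}  mean width={width:.3f}")
```

Under these assumptions the Normal rows should sit close to 0.95 at both sample sizes, the Exponential rows should fall somewhat below it, and the mean width should shrink as n grows while coverage stays roughly where it was.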

The standard mistake

Many introductory sources write (or imply):

P(\mu \in [\text{this interval}]) = 0.95

That statement treats μ as random. In the frequentist framework μ is a fixed unknown — the probability is zero or one for any specific interval. The proper statement is about the long-run behavior of the procedure:

P\left(\mu \in \left[\bar{X} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}\right]\right) = 1 - \alpha

The interval is random; μ is fixed. This is the whole philosophical gulf between frequentist and Bayesian inference. A credible interval does satisfy the “probability that μ lies inside” reading, but only by treating μ as a random variable with a prior.
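
To make the contrast concrete, here is a minimal sketch of a credible interval in the simplest conjugate setting. It assumes a Normal prior on μ and a known data standard deviation σ, neither of which comes from the text above; the function name `credible_interval`, the prior parameters, and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, fmean

def credible_interval(sample, sigma, prior_mean, prior_sd, level=0.95):
    """Equal-tailed credible interval for mu.

    Assumes data ~ Normal(mu, sigma) with sigma known, and a
    Normal(prior_mean, prior_sd) prior on mu (conjugate update).
    """
    n = len(sample)
    prior_prec = 1 / prior_sd ** 2   # precision = 1 / variance
    data_prec = n / sigma ** 2
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * fmean(sample))
    posterior = NormalDist(post_mean, math.sqrt(post_var))
    tail = (1 - level) / 2
    return posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail)

# Hypothetical data and prior, for illustration only.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9, 5.2, 4.7]
low, high = credible_interval(sample, sigma=0.5, prior_mean=5.0, prior_sd=1.0)
print(f"95% credible interval: ({low:.3f}, {high:.3f})")
```

Here the 95% is a probability statement about μ itself, conditional on the data and the chosen prior. Change the prior and the interval changes, which is the price of that reading.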

Why it matters

Confidence intervals are the default uncertainty quantifier in nearly every empirical field. Knowing what they mean — and what they don’t — determines whether the reported numbers communicate honest uncertainty or manufactured precision.