Confidence intervals

A 95% confidence interval for the mean isn’t “the range where μ lies with 95% probability.” It’s the range produced by a procedure that, across repeated experiments, covers the true μ in 95% of runs. The interval is random; μ is fixed.

[Figure: sample mean ± 95% CI for 100 repeated experiments, with true μ = 0. In the run shown, 98 of 100 intervals cover μ and 2 miss it: empirical coverage 98.0% against a nominal 95%.]
Each row is one experiment of n samples, with its interval. A 95% CI covers μ in roughly 95% of such experiments. That long-run frequency is what the confidence level describes, not a 95% chance that μ sits inside any one interval.
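The repeated-experiment picture can be reproduced with a short simulation. The following is a minimal sketch, not the code behind the figure: it assumes a Normal source, uses the Student's t interval, and hardcodes a few two-sided 95% critical values from standard tables so that it needs only the standard library. The names `t_interval` and `coverage` are illustrative.

```python
import math
import random
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Hardcoded from standard tables (assumption: only these sample sizes are used).
T_CRIT_95 = {4: 2.776, 9: 2.262, 29: 2.045}


def t_interval(sample):
    """Return the 95% Student's t interval (lower, upper) for the mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # unbiased sample variance
    half_width = T_CRIT_95[n - 1] * math.sqrt(var / n)    # t * standard error
    return mean - half_width, mean + half_width


def coverage(n_experiments, n, mu=0.0, sigma=1.0, seed=0):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = t_interval(sample)
        hits += lo <= mu <= hi  # True counts as 1
    return hits / n_experiments


if __name__ == "__main__":
    print(coverage(100, n=30))     # one figure-sized run: noisy
    print(coverage(20_000, n=30))  # many runs: settles near 0.95
```

A single batch of 100 experiments can land a few points above or below 95%, which is why the figure can show 98% without contradicting the nominal level. Only the long-run rate is pinned down.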

What to notice

  • Coverage ≈ stated level. Run 100 experiments from any source distribution with finite variance, and roughly 5 of them produce intervals that miss μ. The count fluctuates with the seed, but over many runs the miss rate averages to the nominal 5%: exactly for Normal data, approximately otherwise.
  • Sample size controls width, not coverage. Raising n shrinks each ribbon but doesn’t change how often they cover μ — that’s determined by the procedure, not the data.
  • Non-Normal sources still work at small n. The Student’s t interval is fairly robust. Switch the source to Exponential at n = 10 and coverage usually stays within several points of 95%, typically on the low side. It degrades further for n < 5 or for heavy-tailed or strongly skewed distributions.
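The last two observations can be checked numerically. The sketch below is an illustration under stated assumptions rather than the page's own simulation: it compares a standard Normal source (mean 0) with an Exponential source of rate 1 (mean 1) at n = 10 and n = 40, and reports both coverage and average interval width. The helper names (`simulate`, `draw_normal`, `draw_exponential`) and the hardcoded t critical values are choices made for this example.

```python
import math
import random

# Two-sided 95% Student's t critical values by degrees of freedom
# (standard table values; only n = 10 and n = 40 are supported here).
T_CRIT_95 = {9: 2.262, 39: 2.023}


def draw_normal(rng):
    """One observation from Normal(0, 1); true mean 0."""
    return rng.gauss(0.0, 1.0)


def draw_exponential(rng):
    """One observation from Exponential(rate=1); true mean 1."""
    return rng.expovariate(1.0)


def simulate(draw, true_mean, n, runs=4000, seed=1):
    """Return (coverage, mean interval width) for 95% t intervals."""
    rng = random.Random(seed)
    t_crit = T_CRIT_95[n - 1]
    hits, total_width = 0, 0.0
    for _ in range(runs):
        xs = [draw(rng) for _ in range(n)]
        mean = sum(xs) / n
        var = sum((x - mean) ** 2 for x in xs) / (n - 1)
        half = t_crit * math.sqrt(var / n)
        hits += (mean - half) <= true_mean <= (mean + half)
        total_width += 2 * half
    return hits / runs, total_width / runs


if __name__ == "__main__":
    sources = [("Normal", draw_normal, 0.0), ("Exponential", draw_exponential, 1.0)]
    for name, draw, mu in sources:
        for n in (10, 40):
            cov, width = simulate(draw, mu, n)
            print(f"{name:12s} n={n:3d}  coverage={cov:.3f}  mean width={width:.3f}")
```

For the Normal source, quadrupling n should roughly halve the average width while leaving coverage near 95%. For the Exponential source at n = 10, coverage should come out somewhat below 95%.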

The standard mistake

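Many introductory sources write (or imply):

$$P(\ell \le \mu \le u) = 0.95$$

where [ℓ, u] is the one interval computed from the observed data.

That statement treats μ as random. In the frequentist framework μ is a fixed unknown, so for any specific computed interval the probability that it contains μ is either zero or one. The proper statement is about the long-run behavior of the procedure:

$$P\big(L(X) \le \mu \le U(X)\big) = 0.95$$

where the endpoints L(X) and U(X) are functions of the random sample X, and the probability is taken over repeated samples.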

The interval is random; μ is fixed. This is the philosophical gulf between frequentist and Bayesian inference. A Bayesian credible interval does satisfy the “probability that μ lies inside” reading, but only by treating μ as a random variable with a prior.
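To make the contrast concrete, here is a minimal sketch of a credible interval in the simplest conjugate setting. It assumes observations are Normal with a known standard deviation `sigma` and that μ has a Normal prior. Both are modeling assumptions introduced for this example, and the data values are made up for illustration.

```python
import math
from statistics import NormalDist


def credible_interval(xs, sigma, prior_mean, prior_sd, level=0.95):
    """Equal-tailed credible interval for mu under a Normal prior.

    Assumes x_i ~ Normal(mu, sigma^2) with sigma known, and
    mu ~ Normal(prior_mean, prior_sd^2). The posterior is then Normal.
    """
    n = len(xs)
    xbar = sum(xs) / n
    prior_prec = 1.0 / prior_sd**2   # precision = 1 / variance
    data_prec = n / sigma**2
    post_prec = prior_prec + data_prec
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = (prior_prec * prior_mean + data_prec * xbar) / post_prec
    post_sd = math.sqrt(1.0 / post_prec)
    posterior = NormalDist(post_mean, post_sd)
    tail = (1.0 - level) / 2.0
    return posterior.inv_cdf(tail), posterior.inv_cdf(1.0 - tail)


if __name__ == "__main__":
    # Made-up illustration data, not taken from the figure.
    data = [0.3, -1.1, 0.8, 0.2, -0.4, 1.5, 0.1, -0.6, 0.9, 0.4]
    print(credible_interval(data, sigma=1.0, prior_mean=0.0, prior_sd=1.0))
    print(credible_interval(data, sigma=1.0, prior_mean=0.0, prior_sd=100.0))  # nearly flat prior
```

With a nearly flat prior the endpoints almost coincide with the known-variance confidence interval, sample mean ± 1.96·σ/√n. The numbers agree; what differs is the statement they license, which is a probability statement about μ given the prior and the data rather than a statement about the procedure.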

Why it matters

Confidence intervals are the default uncertainty quantifier in nearly every empirical field. Knowing what they mean — and what they don’t — determines whether the reported numbers communicate honest uncertainty or manufactured precision.