Chebyshev’s inequality

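For any distribution with mean $\mu$ and finite variance $\sigma^2$, the probability of landing at least $k$ standard deviations away from the mean is bounded by $1/k^2$:

$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2} \qquad \text{for any } k > 0$$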

No assumptions beyond finite variance. No symmetry, no shape requirements.

[Figure: density plot; x-axis: x, from −6 to 6; y-axis: density, from 0 to 0.4. Readout at k = 2:]
True $P(|X - \mu| \ge k\sigma)$ = 0.0455
Chebyshev bound $1/k^2$ = 0.2500
Bound / truth ratio: 5.5×. The bound is tight only for distributions that pack their variance far from the mean.
The two-sided tail probability $P(|X - \mu| \ge k\sigma)$ is bounded above by $1/k^2$. Chebyshev is distribution-free, so it has to accommodate the worst case and usually overshoots.
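The comparison between the true tail and the bound can be reproduced with a few lines of Python. This is a minimal sketch that assumes the plotted distribution is the standard Normal; the function names `normal_two_sided_tail` and `chebyshev_bound` are illustrative and do not come from the original page.

```python
import math


def normal_two_sided_tail(k: float) -> float:
    """P(|Z| >= k) for a standard Normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))


def chebyshev_bound(k: float) -> float:
    """Chebyshev's bound 1/k^2, capped at 1 because it bounds a probability."""
    return min(1.0, 1.0 / k**2)


if __name__ == "__main__":
    # Compare the exact Normal tail with the distribution-free bound.
    for k in (1.0, 1.5, 2.0, 3.0, 4.0):
        truth = normal_two_sided_tail(k)
        bound = chebyshev_bound(k)
        print(f"k={k}: true={truth:.4f}  bound={bound:.4f}  ratio={bound / truth:.1f}x")
```

At k = 2 this gives a true tail of about 0.0455 against a bound of 0.25, and the ratio grows quickly as k increases because the Normal tail decays much faster than 1/k².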

What to notice

  • At k = 2, the bound says ≤ 25%. For the Normal the true value is 4.6%, so the bound is loose by a factor of about five. Heavier tails alone do not close the gap: a t(3) rescaled to unit variance has a two-sided tail of only about 4.1% at k = 2. The bound is loose because Chebyshev has to cover the worst case.
  • Pathological distributions can make Chebyshev tight. For k ≥ 1, a three-point distribution that puts probability $1/(2k^2)$ on each of $\mu \pm k\sigma$ and the remaining $1 - 1/k^2$ at the mean achieves the bound exactly. Chebyshev is the price of distribution-freeness.
  • For k ≤ 1 the bound is trivial: $1/k^2$ is at least 1, so it tells you nothing. The one-sided version (Cantelli), $P(X - \mu \ge k\sigma) \le 1/(1 + k^2)$, stays below 1 for every k > 0.
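The extremal case can be checked exactly. The sketch below builds a distribution with mass at the mean and at k standard deviations on either side of it, then confirms with rational arithmetic that its variance is σ² and that its tail probability equals 1/k². The helper names (`three_point`, `mean`, `variance`, `tail_prob`) are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def three_point(k, mu=Fraction(0), sigma=Fraction(1)):
    """Return {value: probability} for the extremal distribution (needs k >= 1, sigma > 0)."""
    if k < 1:
        raise ValueError("k must be at least 1 so that the probabilities are valid")
    p_tail = 1 / (2 * k**2)  # mass on each outer point
    return {mu - k * sigma: p_tail, mu: 1 - 2 * p_tail, mu + k * sigma: p_tail}


def mean(dist):
    return sum(x * p for x, p in dist.items())


def variance(dist):
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())


def tail_prob(dist, mu, k, sigma):
    """P(|X - mu| >= k * sigma) computed by direct enumeration."""
    return sum(p for x, p in dist.items() if abs(x - mu) >= k * sigma)


if __name__ == "__main__":
    k = Fraction(2)
    dist = three_point(k)
    print("variance:", variance(dist))               # equals sigma^2 = 1
    print("tail:", tail_prob(dist, 0, k, 1))         # equals 1/k^2
    print("bound:", 1 / k**2)
```

Using `Fraction` avoids floating-point noise, so the equality between the tail probability and the bound is exact rather than approximate.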

Why it matters

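Chebyshev is the shortest path from “finite variance” to a weak form of the Law of Large Numbers. Apply it to the sample mean $\bar{X}_n$ of $n$ i.i.d. draws, whose variance is $\sigma^2/n$, and you get, for any $\varepsilon > 0$:

$$P(|\bar{X}_n - \mu| \ge \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2}$$

which goes to zero as $n \to \infty$. That is convergence in probability, with no other ingredient needed.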
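To make the rate concrete, the sketch below evaluates the Chebyshev bound σ²/(nε²) for the sample mean and compares it with a simulated miss rate. It assumes i.i.d. Uniform(0, 1) draws (mean 1/2, variance 1/12) purely as an example; the function names and the choice of distribution are illustrative, not taken from the original.

```python
import math
import random


def wlln_bound(sigma2: float, n: int, eps: float) -> float:
    """Chebyshev bound on P(|sample mean - mu| >= eps), capped at 1."""
    return min(1.0, sigma2 / (n * eps**2))


def samples_needed(sigma2: float, eps: float, delta: float) -> int:
    """An n for which the bound is at most delta (smallest such n, up to float rounding)."""
    return math.ceil(sigma2 / (eps**2 * delta))


def empirical_miss_rate(n: int, eps: float, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated Uniform(0, 1) sample means that land at least eps from 0.5."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - 0.5) >= eps:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    sigma2, eps = 1.0 / 12.0, 0.1
    for n in (10, 100, 1000):
        print(f"n={n}: bound={wlln_bound(sigma2, n, eps):.4f}  "
              f"simulated={empirical_miss_rate(n, eps, trials=500):.4f}")
    print("n needed for bound <= 0.05:", samples_needed(sigma2, eps, 0.05))
```

The simulated miss rate is typically far below the bound, which is the same looseness seen for a single draw. The bound still suffices for the weak law because it shrinks like 1/n.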

Proof

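Apply Markov’s inequality to the non-negative random variable $(X - \mu)^2$:

$$P(|X - \mu| \ge k\sigma) = P\big((X - \mu)^2 \ge k^2\sigma^2\big) \le \frac{E[(X - \mu)^2]}{k^2\sigma^2} = \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}$$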

The squaring trick converts the one-sided non-negativity requirement of Markov into a two-sided concentration bound — the template for nearly every concentration inequality that followed.
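For reference, Markov's inequality itself follows from one pointwise comparison. The symbols $Y$ (a non-negative random variable) and $a > 0$ (a threshold) are notation introduced here for the sketch and do not appear in the original text.

```latex
\[
  a\,\mathbf{1}\{Y \ge a\} \le Y
  \quad\Longrightarrow\quad
  a\,P(Y \ge a) \le E[Y]
  \quad\Longrightarrow\quad
  P(Y \ge a) \le \frac{E[Y]}{a}.
\]
```

The first step holds because $Y \ge 0$ everywhere and $Y \ge a$ on the event in question; the second takes expectations of both sides. Setting $Y = (X - \mu)^2$ and $a = k^2\sigma^2$ gives $E[Y] = \sigma^2$, so the right-hand side becomes $1/k^2$.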