Regression to the mean

Take the tallest children in a class. Their parents tend to be taller than average, but not as far above average as the children themselves. Or take the top scorers on a test and retest them next week: their scores will, on average, drift toward the class mean. This is regression to the mean, and it requires no explanation beyond probability.

[Figure: scatter plot of child (standardized) against parent (standardized), both axes running from −4 to 4. In the view shown, the top-quartile parents’ mean X is +1.30σ and their children’s mean Y is +0.89σ, pulled toward zero by a factor of r.]
Orange dots: top-quartile parents (X > +0.67σ). Red dot: their children’s mean. Blue line: regression y = r·x. Even with strong heredity, extreme parents produce children closer to average, not because of any force but as a matter of pure probability.

What to notice

  • Orange dots are parents in the top quartile (X > +0.67σ). Red dot is their children’s mean.
  • The red dot sits below the diagonal y = x: the children’s expected mean is Y = r·(mean X of parents), which is smaller than the parents’ mean X whenever 0 ≤ r < 1.
  • r near 1: strong inheritance — children’s mean barely moves. r near 0: children’s mean collapses to zero regardless of parents.
  • The blue regression line captures the effect exactly.
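
The points above can be checked with a short simulation. This is a minimal sketch, not the code behind the figure: it assumes standardized, jointly normal parent and child values, and the function name, sample size, and seed are illustrative choices.

```python
import numpy as np

def top_quartile_means(r, n=200_000, seed=0):
    """Simulate standardized parent/child pairs with correlation r.

    Returns (mean X of top-quartile parents, mean Y of their children).
    Sample size and seed are arbitrary assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)                   # parents
    noise = rng.standard_normal(n)               # independent of x
    y = r * x + np.sqrt(1.0 - r**2) * noise      # children, corr(x, y) = r
    top = x > np.quantile(x, 0.75)               # top-quartile parents
    return x[top].mean(), y[top].mean()

for r in (0.2, 0.5, 0.9):
    mean_x, mean_y = top_quartile_means(r)
    print(f"r={r}: parents {mean_x:+.2f}, children {mean_y:+.2f}, "
          f"ratio {mean_y / mean_x:.2f}")
```

In each row the ratio of the children’s mean to the parents’ mean should come out close to r, which is the "pulled toward zero by factor r" effect.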

Why it happens

The math is simple: in a standardized bivariate normal with correlation r, the conditional expectation is E[Y | X = x] = r·x. For any |r| < 1, extreme values of X predict less-extreme values of Y. There’s no force pulling children toward mediocrity; the effect is entirely a consequence of imperfect correlation.
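
One way to see where the conditional expectation comes from is to build Y out of X and an independent noise term. The sketch below assumes X and Y are standardized (mean 0, variance 1); the symbol Z is introduced here only for illustration, and the `align*` environment assumes the amsmath package.

```latex
% Assumption: X and Z are independent standard normal variables.
% Z is an illustrative noise term, not a symbol from the text above.
\begin{align*}
Y &= rX + \sqrt{1 - r^2}\,Z, \qquad X, Z \sim \mathcal{N}(0, 1) \text{ independent} \\
\operatorname{Var}(Y) &= r^2 + (1 - r^2) = 1, \qquad \operatorname{Cov}(X, Y) = r \\
E[Y \mid X = x] &= r x + \sqrt{1 - r^2}\,E[Z] = r x
\end{align*}
```

For example, with r = 0.5 a parent at +2σ predicts a child at +1σ. The same argument with the roles swapped gives E[X | Y = y] = r·y, so unusually tall children also have parents who are, on average, closer to the mean than they are.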

The practical danger

Regression to the mean produces many false causal stories:

  • A student scores unusually low on a test, gets tutoring, scores higher — tutoring gets the credit.
  • A sports team has an exceptional season, regresses the next year — the coach gets blamed.
  • A company tries an unusual intervention when sales are worst, sales recover — intervention gets the credit.
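
The tutoring story can be reproduced with no tutoring at all. The sketch below assumes each test score is a stable ability plus independent luck; the score scale, the spreads, and the bottom-decile cutoff are hypothetical numbers chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical model: score = stable ability + independent luck on the day.
ability = rng.normal(70.0, 8.0, n)
test1 = ability + rng.normal(0.0, 8.0, n)    # first test
test2 = ability + rng.normal(0.0, 8.0, n)    # retest: fresh luck, NO tutoring

# Students who scored worst on the first test (the ones who would get tutoring).
worst = test1 < np.quantile(test1, 0.10)
gain = test2[worst].mean() - test1[worst].mean()

print(f"bottom-decile mean, test 1: {test1[worst].mean():.1f}")
print(f"same students, test 2:      {test2[worst].mean():.1f}")
print(f"apparent improvement:       {gain:+.1f} points")
print(f"whole-class change:         {test2.mean() - test1.mean():+.2f} points")
```

The selected students improve noticeably while the class as a whole does not move, so any tutoring given between the two tests would appear to work. A comparison group selected the same way but left untreated is the usual way to separate a real effect from this one.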

In each case, at least part of the change would be expected even if nothing had been done: an extreme result tends to be followed by a less extreme one.

Galton named the phenomenon in 1886 while studying human height. The word “regression”, now used for all linear prediction, comes from his description of this pull.