Base rate neglect

A test for a rare disease is 99% accurate. You test positive. How likely is it that you’re actually sick?

The instinct says ~99%. The real answer, for a disease that afflicts 1% of the population, is 50%, assuming “99% accurate” means the test is right 99% of the time for sick and healthy people alike.

With prevalence (P sick) at 1.0%, sensitivity (true +) at 99%, and specificity (true −) at 95%, a population of 10 000 splits as follows:

Sick, tests positive: 99
Sick, tests negative: 1
Healthy, tests positive: 495
Healthy, tests negative: 9 405

Of 594 positive tests, only 99 are actually sick. If the test comes back positive, the patient is sick with probability 16.7%.

Each dot is a person in a population of 10 000. The visible area of yellow (false positives) almost always exceeds the area of red (true positives) when the disease is rare.

Why so low?

Because the healthy population is enormous relative to the sick one. Out of 10 000 people:

100 are sick. The 99% sensitive test catches 99 of them — those are the true positives.
9 900 are healthy. The 99% specific test misidentifies 1% of them as positive — that’s 99 false positives.

Among the 198 people who tested positive, only 99 are actually sick: $99/198 = 50\%$.
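The head count above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original page: the function names `expected_counts` and `share_of_positives_sick` are hypothetical, and rounding expected counts to whole people is an assumption made so the numbers match the walkthrough.

```python
def expected_counts(population, prevalence, sensitivity, specificity):
    """Split a population into the four test outcomes.

    Counts are expected values rounded to whole people (an assumption
    made for illustration; a real population would vary by chance).
    """
    sick = round(population * prevalence)
    healthy = population - sick
    true_pos = round(sick * sensitivity)            # sick, tests positive
    false_neg = sick - true_pos                     # sick, tests negative
    false_pos = round(healthy * (1 - specificity))  # healthy, tests positive
    true_neg = healthy - false_pos                  # healthy, tests negative
    return true_pos, false_neg, false_pos, true_neg


def share_of_positives_sick(true_pos, false_pos):
    """Fraction of all positive results that belong to sick people."""
    return true_pos / (true_pos + false_pos)


# 1% prevalence, 99% sensitivity, 99% specificity: the walkthrough above.
tp, fn, fp, tn = expected_counts(10_000, 0.01, 0.99, 0.99)
share_99 = share_of_positives_sick(tp, fp)
print(tp, fn, fp, tn, share_99)

# Same disease and sensitivity, but 95% specificity.
tp95, fn95, fp95, tn95 = expected_counts(10_000, 0.01, 0.99, 0.95)
share_95 = share_of_positives_sick(tp95, fp95)
print(tp95, fn95, fp95, tn95, round(share_95, 3))
```

Lowering specificity from 99% to 95% leaves the 99 true positives unchanged but multiplies the false positives fivefold, which is what drags the answer from 50% down to about 16.7%.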

$$P(\text{sick} \mid +) = \frac{P(+ \mid \text{sick}) \, P(\text{sick})}{P(+ \mid \text{sick}) \, P(\text{sick}) + P(+ \mid \text{healthy}) \, P(\text{healthy})}$$
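Bayes’ rule translates directly into code. The sketch below is illustrative only: the function name `positive_predictive_value` is an assumption, and it takes $P(+ \mid \text{sick})$ to be the sensitivity and $P(+ \mid \text{healthy})$ to be one minus the specificity.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(sick | +) from Bayes' rule.

    prevalence  = P(sick)
    sensitivity = P(+ | sick)
    specificity = P(- | healthy), so P(+ | healthy) = 1 - specificity
    """
    true_pos = sensitivity * prevalence               # P(+ | sick) P(sick)
    false_pos = (1 - specificity) * (1 - prevalence)  # P(+ | healthy) P(healthy)
    return true_pos / (true_pos + false_pos)


ppv_99 = positive_predictive_value(0.01, 0.99, 0.99)
ppv_95 = positive_predictive_value(0.01, 0.99, 0.95)
print(f"{ppv_99:.3f}")  # 0.500
print(f"{ppv_95:.3f}")  # 0.167
```

Working with probabilities rather than head counts gives the same answers without fixing a population size.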

What to try

Drop prevalence to 0.1%. Watch $P(\text{sick} \mid +)$ collapse: you’d need near-perfect specificity for a positive result to mean much.
Lift prevalence to 10%. Now the same test is far more informative, because you’re starting from a higher prior.
Drag specificity from 95% to 99.9%. Small changes at the high end dramatically improve the positive predictive value; that’s why medical screening often involves a confirmatory test.
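For readers without the sliders, the three experiments can be approximated in code. This is a rough sketch under stated assumptions: each scenario changes one setting from a baseline of 1% prevalence, 99% sensitivity, and 95% specificity, and the helper `positive_predictive_value` and the `scenarios` list are hypothetical names introduced here.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(sick | +) from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# (label, prevalence, sensitivity, specificity)
# Baseline values are assumed from the figure's settings.
scenarios = [
    ("baseline", 0.01, 0.99, 0.95),
    ("prevalence 0.1%", 0.001, 0.99, 0.95),
    ("prevalence 10%", 0.10, 0.99, 0.95),
    ("specificity 99.9%", 0.01, 0.99, 0.999),
]

results = {
    label: positive_predictive_value(prev, sens, spec)
    for label, prev, sens, spec in scenarios
}

for label, value in results.items():
    print(f"{label:<20} {value:6.1%}")
```

Under these assumptions the positive predictive value is roughly 2% at 0.1% prevalence, roughly 69% at 10% prevalence, and roughly 91% once specificity reaches 99.9%.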

Why it matters

Base rates are why screening everyone for rare conditions creates a flood of false positives even with good tests. It’s why security alert systems cry wolf. It’s why predictive policing models that flag rare events are usually wrong about the individual, even when they’re “accurate” in aggregate.

The test doesn’t tell you whether you’re sick. It updates the prior you brought in. If your prior was low, even a strong update often leaves you below 50%.
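One way to see the test as an update is the odds form of Bayes’ rule: posterior odds equal prior odds times the likelihood ratio of a positive result. The sketch below is illustrative. The function name `update_on_positive` is hypothetical, and the second update assumes a confirmatory test with the same sensitivity and specificity whose errors are independent of the first test’s given the patient’s true status, which real tests may not satisfy.

```python
def update_on_positive(prior, sensitivity, specificity):
    """Update P(sick) after one positive result, using the odds form.

    posterior odds = prior odds * likelihood ratio, where the likelihood
    ratio of a positive result is sensitivity / (1 - specificity).
    """
    prior_odds = prior / (1 - prior)
    likelihood_ratio = sensitivity / (1 - specificity)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


prior = 0.01  # 1% prevalence

# First positive: a roughly twentyfold boost in odds, yet still below 50%.
after_one = update_on_positive(prior, 0.99, 0.95)

# Second positive (assumed independent confirmatory test): the posterior
# from the first test becomes the prior for the second.
after_two = update_on_positive(after_one, 0.99, 0.95)

print(f"{after_one:.3f}")
print(f"{after_two:.3f}")
```

With these numbers the first positive lifts the probability from 1% to about 17%, and a second independent positive lifts it to about 80%: the same strength of evidence lands very differently depending on the prior it starts from.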