Laplace distribution

Also called the double exponential. Take two mirrored exponential tails and glue them at the mode: that’s Laplace. Its log-density is linear in the absolute distance from the mode, which is what makes L1-penalized regression (LASSO) a Bayesian MAP estimate under a sparsity-encouraging prior.
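To make the MAP claim concrete, here is a short derivation sketch. It assumes a linear model with Gaussian noise of variance $\sigma^2$ and independent zero-mean Laplace priors with scale $b$ on each coefficient. The symbols $X$, $y$, $w$, $\sigma$, $b$ and $\lambda$ are notation introduced for this sketch, not taken from the text above.

```latex
\begin{align*}
y \mid w &\sim \mathcal{N}(Xw, \sigma^2 I), \qquad
p(w) = \prod_{j} \frac{1}{2b} \exp\!\left(-\frac{|w_j|}{b}\right) \\
-\log p(w \mid y) &= \frac{1}{2\sigma^2} \lVert y - Xw \rVert_2^2
  + \frac{1}{b} \lVert w \rVert_1 + \text{const} \\
\hat{w}_{\mathrm{MAP}} &= \arg\min_{w} \; \tfrac{1}{2} \lVert y - Xw \rVert_2^2
  + \lambda \lVert w \rVert_1, \qquad \lambda = \frac{\sigma^2}{b}
\end{align*}
```

Under these assumptions a tighter prior (smaller $b$) or noisier data (larger $\sigma^2$) both translate into a larger penalty weight $\lambda$.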

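[Figure: density against x for the Laplace and for a Normal with the same variance; x from −8 to 8, density from 0 to 0.5.]
PDF $f(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$, with location $\mu$ and scale $b$. The cusp at $x = \mu$ and the heavier tails are what set it apart from the matched-variance Normal.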
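The comparison can be reproduced numerically with a minimal sketch like the one below. It uses the standard Laplace parametrization (location `mu`, scale `b`, variance `2 * b**2`). The function names and the choice `b = 1` are illustrative assumptions, not values read off the figure.

```python
import math


def laplace_pdf(x, mu=0.0, b=1.0):
    """Laplace density with location mu and scale b (variance 2 * b**2)."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def laplace_two_sided_tail(k, b=1.0):
    """P(|X - mu| > k) for a Laplace with scale b."""
    return math.exp(-k / b)


def normal_two_sided_tail(k, sigma=1.0):
    """P(|X - mu| > k) for a Normal with standard deviation sigma."""
    return math.erfc(k / (sigma * math.sqrt(2.0)))


# Illustrative parameters (an assumption, not taken from the plot).
b = 1.0
sigma = b * math.sqrt(2.0)  # matches the Laplace variance 2 * b**2

print("peak height:", laplace_pdf(0.0, 0.0, b), normal_pdf(0.0, 0.0, sigma))

# Tail mass beyond 1, 2, 4 and 6 standard deviations for both distributions.
for n_sd in (1, 2, 4, 6):
    k = n_sd * sigma
    print(n_sd, laplace_two_sided_tail(k, b), normal_two_sided_tail(k, sigma))
```

Close to the centre the Normal carries more mass in its shoulders. The Laplace wins at the peak and again far out in the tails, where the gap grows quickly with distance.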

What to notice

  • A sharp cusp at $x = \mu$, the mode. Unlike the Normal’s smooth peak, the density has a kink where the two exponential tails meet. That non-differentiability is what drives L1 solutions to exact zeros.
  • Heavier tails than the matched Normal. Decay is exponential in $|x - \mu|$ instead of in $(x - \mu)^2$, so the same standard deviation implies much more probability far from the centre.
  • The median, not the mean, is the MLE for the location $\mu$, which is why L1 (least-absolute-deviations) regression is the natural tool when outliers are abundant: $\hat{\mu}_{\mathrm{MLE}} = \arg\min_{\mu} \sum_{i=1}^{n} |x_i - \mu| = \operatorname{median}(x_1, \dots, x_n)$.
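Two of these points are easy to check by hand, and the sketch below does so under stated assumptions. The sample `data` is made up for illustration, and the helper names are hypothetical. `sum_abs_dev` is the Laplace negative log-likelihood in the location, up to a positive scale factor and an additive constant. `soft_threshold` is the closed-form minimizer of a one-dimensional squared loss plus an L1 penalty.

```python
import statistics


def sum_abs_dev(mu, xs):
    """Sum of absolute deviations of xs from a candidate location mu."""
    return sum(abs(x - mu) for x in xs)


def soft_threshold(y, lam):
    """Minimizer over w of 0.5 * (w - y)**2 + lam * abs(w)."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0


# Hypothetical sample with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median = statistics.median(data)
mean = statistics.mean(data)

# Brute-force check on a grid of candidate locations from 0.0 to 100.0.
grid = [i / 10 for i in range(1001)]
best_mu = min(grid, key=lambda mu: sum_abs_dev(mu, data))

print("median:", median, "mean:", mean, "grid minimizer:", best_mu)

# Inputs with magnitude at or below lam are set exactly to zero.
print([soft_threshold(y, 0.5) for y in (-2.0, -0.3, 0.0, 0.3, 2.0)])
```

The grid minimizer lands on the median while the outlier drags the mean far from the bulk of the data. The soft-threshold output shows small inputs mapped to exactly zero rather than merely shrunk.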

Why it matters

The Laplace is the max-entropy distribution over the real line given a fixed mean absolute deviation about its centre, the same way the Normal is max-entropy given a fixed mean and variance. That’s the deep reason the two show up in different regularization regimes: L1 if deviations are measured absolutely, L2 if squared.
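A quick sanity check of the max-entropy statement, using the closed-form differential entropies of the two distributions. It compares only these two candidates, so it illustrates the claim rather than proving it. The function names and the values `mad = 1.0` and `var = 1.0` are illustrative assumptions.

```python
import math


def laplace_entropy(b):
    """Differential entropy (nats) of a Laplace with scale b."""
    return 1.0 + math.log(2.0 * b)


def normal_entropy(sigma):
    """Differential entropy (nats) of a Normal with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


# Case 1: both distributions share the mean absolute deviation E|X - mu|.
mad = 1.0  # illustrative value
b_same_mad = mad                                 # Laplace: E|X - mu| = b
sigma_same_mad = mad * math.sqrt(math.pi / 2.0)  # Normal: E|X - mu| = sigma * sqrt(2 / pi)

# Case 2: both distributions share the variance.
var = 1.0  # illustrative value
b_same_var = math.sqrt(var / 2.0)                # Laplace: Var = 2 * b**2
sigma_same_var = math.sqrt(var)

print("same MAD:     ", laplace_entropy(b_same_mad), normal_entropy(sigma_same_mad))
print("same variance:", laplace_entropy(b_same_var), normal_entropy(sigma_same_var))
```

With the mean absolute deviation matched the Laplace has the larger entropy. With the variance matched the ordering flips in favour of the Normal.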