Laplace distribution

Also called the double exponential. Take two mirrored exponential tails and glue them at the mode: that’s Laplace. Its log-density is linear in $|x - μ|$, which is what makes L1-penalized regression (LASSO) the Bayesian MAP estimate under a sparsity-encouraging Laplace prior on the coefficients.
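To make the MAP connection concrete, here is a short derivation under assumptions that are not stated in the text: a linear model $y = Xβ + ε$ with Gaussian noise of variance $σ^2$, and independent Laplace priors with location 0 and scale $b$ on each coefficient $β_j$. The symbols $X$, $y$, $β$, $σ$ and $λ$ are introduced here for illustration only.

```latex
\begin{aligned}
-\log p(\beta \mid y)
  &= \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert_2^2
     + \frac{1}{b}\sum_j \lvert \beta_j \rvert + \text{const}, \\
\hat{\beta}_{\mathrm{MAP}}
  &= \arg\min_{\beta}\; \tfrac{1}{2}\lVert y - X\beta \rVert_2^2
     + \lambda \lVert \beta \rVert_1,
  \qquad \lambda = \frac{\sigma^2}{b}.
\end{aligned}
```

In the scalar case with a single observation $z$ of $β$, the minimizer is the soft-threshold $\operatorname{sign}(z)\max(|z| - λ, 0)$, which is exactly zero whenever $|z| \le λ$.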

$$f(x; \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$$

Figure: PDF of the Laplace against a Normal with the same variance (defaults: location μ = 0.0, scale b = 1.0, n = 1000 samples). The cusp at $μ$ and the heavier tails are what set it apart from the matched-variance Normal.
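The comparison with the matched-variance Normal can be checked numerically. The sketch below uses only the standard library; the function names are made up for illustration, and the matching rule relies on the Laplace variance being $2b^2$, so the Normal gets $σ = b\sqrt{2}$.

```python
import math

def laplace_pdf(x, mu=0.0, b=1.0):
    """Laplace density f(x; mu, b) = exp(-|x - mu| / b) / (2b)."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def laplace_tail(k, b=1.0):
    """Two-sided tail P(|X - mu| > k) for a Laplace with scale b."""
    return math.exp(-k / b)

def normal_tail(k, sigma=1.0):
    """Two-sided tail P(|X - mu| > k) for a Normal with std dev sigma."""
    return math.erfc(k / (sigma * math.sqrt(2.0)))

b = 1.0
sigma = b * math.sqrt(2.0)  # Laplace variance is 2 * b**2; match the Normal to it

# Peak heights at the mode, then tail mass beyond k standard deviations.
print("peak:", laplace_pdf(0.0, b=b), normal_pdf(0.0, sigma=sigma))
for k_sigmas in (1, 2, 4):
    k = k_sigmas * sigma
    print(k_sigmas, laplace_tail(k, b), normal_tail(k, sigma))
```

The Laplace is taller at the mode and carries more mass far out; close to one standard deviation the Normal actually has the larger tail, so "heavier" here is a statement about the far tails.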

What to notice

A sharp cusp at $μ$. Unlike the Normal’s smooth peak, the density has a kink where the two exponential tails meet. When the Laplace is used as a prior centred at zero, that non-differentiability is what drives L1-penalized solutions to exact zeros.
Heavier tails than the matched Normal. The log-density falls off linearly in $|x - μ|$ rather than quadratically, so at the same standard deviation far more probability sits far from the centre: roughly 55 times as much beyond four standard deviations.
The sample median, not the mean, is the MLE for $μ$, which is why L1 regression (least absolute deviations) is the natural tool when outliers are abundant:

$$\hat{\mu}_{\mathrm{MLE}} = \operatorname{median}(x_1, \dots, x_n)$$
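The median result follows because the Laplace log-likelihood in $μ$ is $-n \log(2b) - \frac{1}{b}\sum_i |x_i - μ|$, so maximizing it means minimizing the sum of absolute deviations. A minimal sketch that checks this by brute force on made-up data with one outlier (the data, grid, and helper name are illustrative assumptions):

```python
import statistics

def sum_abs_dev(mu, xs):
    """Sum of absolute deviations; the Laplace negative log-likelihood in mu,
    up to the factor 1/b and an additive constant."""
    return sum(abs(x - mu) for x in xs)

# Made-up sample: six well-behaved points and one gross outlier.
xs = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5, 40.0]

# Brute-force search for the minimizer over a fine grid of candidate locations.
grid = [i / 100.0 for i in range(-200, 4101)]
mu_hat = min(grid, key=lambda m: sum_abs_dev(m, xs))

print("grid minimizer:", mu_hat)
print("median:        ", statistics.median(xs))
print("mean:          ", statistics.mean(xs))
```

The grid minimizer lands on the sample median, while the mean is dragged toward the outlier.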

Why it matters

The Laplace is the maximum-entropy distribution on the real line for a fixed mean absolute deviation about $μ$, just as the Normal is maximum-entropy for a fixed variance. That’s the deep reason the two show up in different regularization regimes: L1 when deviations are measured absolutely, L2 when they are squared.
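A quick numerical sanity check of the max-entropy statement, using the closed-form differential entropies $1 + \ln(2b)$ for the Laplace and $\tfrac{1}{2}\ln(2πeσ^2)$ for the Normal. It compares only these two families, so it illustrates the claim rather than proving it; the function names are hypothetical.

```python
import math

def laplace_entropy(b):
    """Differential entropy (nats) of a Laplace with scale b."""
    return 1.0 + math.log(2.0 * b)

def normal_entropy(sigma):
    """Differential entropy (nats) of a Normal with std dev sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

# Constraint 1: same mean absolute deviation E|X - mu| = mad.
# Laplace: E|X - mu| = b.  Normal: E|X - mu| = sigma * sqrt(2 / pi).
mad = 1.0
h_laplace_mad = laplace_entropy(b=mad)
h_normal_mad = normal_entropy(sigma=mad * math.sqrt(math.pi / 2.0))

# Constraint 2: same variance sd**2.
# Laplace: Var = 2 * b**2.  Normal: Var = sigma**2.
sd = 1.0
h_laplace_var = laplace_entropy(b=sd / math.sqrt(2.0))
h_normal_var = normal_entropy(sigma=sd)

print("fixed MAD:      Laplace", h_laplace_mad, "Normal", h_normal_mad)
print("fixed variance: Laplace", h_laplace_var, "Normal", h_normal_var)
```

Under the absolute-deviation constraint the Laplace has the higher entropy, and under the variance constraint the Normal does, matching the L1 versus L2 pairing.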