Chebyshev's Inequality

Beyond the First Moment

Markov's inequality uses only the mean. If we also know the variance, we can say much more. The idea is simple: apply Markov to the non-negative random variable $(X - \mu)^2$, which encodes how far $X$ deviates from its mean. The result is Chebyshev's inequality --- a two-sided tail bound that depends only on the first two moments.

Theorem: Chebyshev's Inequality

Let $X$ be a random variable with mean $\mu = \mathbb{E}[X]$ and finite variance $\text{Var}(X) = \sigma^2$. Then for every $\epsilon > 0$,

$$\mathbb{P}(|X - \mu| \geq \epsilon) \leq \frac{\sigma^2}{\epsilon^2}.$$

Equivalently, setting $\epsilon = k\sigma$ for $k > 0$:

$$\mathbb{P}(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}.$$

Proof: $|X - \mu| \geq \epsilon$ holds exactly when $(X - \mu)^2 \geq \epsilon^2$, so Markov applied to $(X - \mu)^2$ gives $\mathbb{P}((X - \mu)^2 \geq \epsilon^2) \leq \mathbb{E}[(X - \mu)^2]/\epsilon^2 = \sigma^2/\epsilon^2$.

At most a fraction $1/k^2$ of the probability mass can lie more than $k$ standard deviations from the mean --- regardless of the shape of the distribution. For $k = 2$, at most 25% of the mass lies beyond $\mu \pm 2\sigma$; for $k = 3$, at most 11%. Compare this with the Gaussian, where $\mathbb{P}(|Z| \geq 3) \approx 0.27\%$.
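A quick numerical check makes the gap concrete. The following is a minimal sketch (assuming `scipy` is available) that prints the distribution-free bound $1/k^2$ next to the exact Gaussian tail $\mathbb{P}(|Z| \geq k)$ for a few values of $k$:

```python
# Chebyshev's distribution-free bound 1/k^2 versus the exact
# Gaussian tail P(|Z| >= k) = 2 * (1 - Phi(k)).
from scipy.stats import norm

for k in [1, 2, 3, 4]:
    chebyshev = min(1 / k**2, 1.0)   # the bound is vacuous at k = 1
    gaussian = 2 * norm.sf(k)        # sf = survival function = 1 - CDF
    print(f"k={k}:  Chebyshev <= {chebyshev:.4f}   Gaussian = {gaussian:.6f}")
```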


Example: Chebyshev for the Uniform Distribution

Let $X \sim \text{Uniform}(0, 1)$ with $\mu = 1/2$ and $\sigma^2 = 1/12$. Use Chebyshev to bound $\mathbb{P}(|X - 1/2| \geq 1/4)$ and compare with the exact value.
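One way to work the numbers is the short sketch below; note that at this $\epsilon$ the Chebyshev bound exceeds 1, so it is vacuous, while the exact tail is $1/2$:

```python
# X ~ Uniform(0, 1): mu = 1/2, sigma^2 = 1/12, eps = 1/4.
var, eps = 1 / 12, 1 / 4

chebyshev = var / eps**2          # = 4/3 > 1, so the bound says nothing here
exact = 1 / 4 + 1 / 4             # P(X <= 1/4) + P(X >= 3/4) = 1/2
print(f"Chebyshev bound: {min(chebyshev, 1.0):.4f}  (raw value {chebyshev:.4f})")
print(f"Exact tail:      {exact:.4f}")
```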

Theorem: Cantelli's Inequality (One-Sided Chebyshev)

For a random variable $X$ with mean $\mu$ and variance $\sigma^2$, and any $a > 0$:

$$\mathbb{P}(X - \mu \geq a) \leq \frac{\sigma^2}{\sigma^2 + a^2}.$$

This is always tighter than the bound $\sigma^2/a^2$ obtained by applying the two-sided Chebyshev inequality to a one-sided tail.

Cantelli's inequality is proved by applying Markov to $(X - \mu + t)^2$ for an optimally chosen $t > 0$: for any $t > 0$, $\mathbb{P}(X - \mu \geq a) \leq \mathbb{E}[(X - \mu + t)^2]/(a + t)^2 = (\sigma^2 + t^2)/(a + t)^2$, and minimizing over $t$ gives $t = \sigma^2/a$, which yields the stated bound.
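As a sanity check, the sketch below compares Cantelli's bound, the two-sided Chebyshev bound used one-sidedly, and the exact upper tail for $X \sim \text{Exponential}(1)$, where $\mu = \sigma^2 = 1$ and $\mathbb{P}(X - \mu \geq a) = e^{-(1+a)}$:

```python
# Cantelli vs. plain Chebyshev vs. the exact upper tail of Exponential(1).
import math

sigma2 = 1.0  # Exponential(1): mu = 1, sigma^2 = 1
for a in [0.5, 1.0, 2.0, 3.0]:
    cantelli = sigma2 / (sigma2 + a**2)
    chebyshev = min(sigma2 / a**2, 1.0)   # two-sided bound, applied one-sidedly
    exact = math.exp(-(1.0 + a))          # P(X - mu >= a)
    print(f"a={a}: Cantelli {cantelli:.4f}  Chebyshev {chebyshev:.4f}  exact {exact:.4f}")
```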

Chebyshev Bound vs. Exact Tail

Compare the Chebyshev bound $\sigma^2/\epsilon^2$ with the exact tail probability $\mathbb{P}(|X - \mu| \geq \epsilon)$ for various distributions.

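In place of the interactive figure, here is a static sketch of the same comparison (assuming `scipy` is available); the exact tail comes from each distribution's CDF:

```python
# Chebyshev bound sigma^2/eps^2 versus the exact two-sided tail
# P(|X - mu| >= eps) for a few distributions with known moments.
from scipy import stats

distributions = {
    "Normal(0,1)":    stats.norm(0, 1),
    "Uniform(0,1)":   stats.uniform(0, 1),
    "Exponential(1)": stats.expon(),
}

for name, dist in distributions.items():
    mu, var = dist.mean(), dist.var()
    for eps in [0.5, 1.0, 2.0]:
        bound = min(var / eps**2, 1.0)
        exact = dist.cdf(mu - eps) + dist.sf(mu + eps)  # lower + upper tail
        print(f"{name:15s} eps={eps}: bound {bound:.4f}  exact {exact:.4f}")
```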

Common Mistake: Chebyshev Treats All Distributions as Worst-Case

Mistake:

Using Chebyshev's inequality when the distribution is known to be light-tailed (e.g., Gaussian or bounded) and concluding that the tails are heavy.

Correction:

Chebyshev must accommodate the worst-case distribution with the given mean and variance --- a discrete distribution that places mass $1/(2k^2)$ at each of $\mu \pm k\sigma$ and the remaining $1 - 1/k^2$ at $\mu$, attaining the bound with equality. For light-tailed distributions, the Chernoff bound (Section 10.3) or Hoeffding's inequality (Section 10.4) gives exponentially tighter bounds.
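To preview the gap quantitatively, the sketch below bounds the deviation of the mean of $n$ i.i.d. variables in $[0, 1]$ (worst-case variance $1/4$ per term) using Chebyshev and Hoeffding's $2e^{-2n\epsilon^2}$; the exponential decay wins quickly:

```python
# Chebyshev (~1/n) versus Hoeffding (~e^{-cn}) for the mean of n i.i.d.
# variables bounded in [0, 1]; worst-case variance of each term is 1/4.
import math

eps = 0.1
for n in [10, 100, 1000]:
    chebyshev = min(0.25 / (n * eps**2), 1.0)
    hoeffding = min(2 * math.exp(-2 * n * eps**2), 1.0)
    print(f"n={n}: Chebyshev {chebyshev:.2e}   Hoeffding {hoeffding:.2e}")
```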

Chebyshev and the Weak Law of Large Numbers

Chebyshev's inequality gives the simplest proof of the Weak Law of Large Numbers. For i.i.d. $X_1, \ldots, X_n$ with mean $\mu$ and variance $\sigma^2$, the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ has $\text{Var}(\bar{X}_n) = \sigma^2/n$. Chebyshev gives:

$$\mathbb{P}(|\bar{X}_n - \mu| \geq \epsilon) \leq \frac{\sigma^2}{n\epsilon^2} \to 0$$

as $n \to \infty$. This is convergence in probability.
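A short Monte Carlo sketch (using `numpy`) shows the empirical deviation probability for $X_i \sim \text{Uniform}(0, 1)$ sitting below the $\sigma^2/(n\epsilon^2)$ bound and shrinking as $n$ grows:

```python
# Empirical P(|Xbar_n - mu| >= eps) versus Chebyshev's sigma^2/(n eps^2)
# for X_i ~ Uniform(0, 1), where mu = 1/2 and sigma^2 = 1/12.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma2, eps, trials = 0.5, 1 / 12, 0.05, 10_000

for n in [10, 100, 1000]:
    means = rng.uniform(0, 1, size=(trials, n)).mean(axis=1)
    empirical = np.mean(np.abs(means - mu) >= eps)
    bound = min(sigma2 / (n * eps**2), 1.0)
    print(f"n={n}: empirical {empirical:.4f}   Chebyshev bound {bound:.4f}")
```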

Quick Check

If $X$ has mean $\mu$ and standard deviation $\sigma$, what does Chebyshev's inequality say about $\mathbb{P}(|X - \mu| \geq 3\sigma)$?

$\leq 1/3$

$\leq 1/9$

$\leq 1/6$

$\leq 1/27$

Historical Note: Pafnuty Chebyshev and the Method of Moments


Pafnuty Lvovich Chebyshev (1821--1894) proved his inequality around 1867 as part of a broader program to develop the method of moments for studying convergence of distributions. His student Markov later isolated the simpler first-moment version. Chebyshev's approach --- bounding tail probabilities using moment information --- became a cornerstone of probability theory and remains the first tool we reach for when exact distributions are unavailable.

Concentration inequality

A bound on the probability that a random variable deviates from some value (typically its mean). Markov, Chebyshev, Chernoff, and Hoeffding are all concentration inequalities, with increasing tightness.

Related: Tail Probability