The Strong Law of Large Numbers

From 'Likely Close' to 'Eventually Always Close'

The WLLN says that for any fixed $\epsilon > 0$, the probability of a deviation $|\bar{X}_n - \mu| \geq \epsilon$ vanishes. But it leaves open the possibility that, for some outcomes $\omega$, there are infinitely many $n$ where $\bar{X}_n(\omega)$ strays far from $\mu$. The Strong Law eliminates this possibility: for almost every $\omega$, $\bar{X}_n(\omega)$ eventually stays within $\epsilon$ of $\mu$ and never leaves again. This is a qualitatively stronger guarantee, and it is the version that justifies interpreting probability as long-run relative frequency.

Theorem: Strong Law of Large Numbers (SLLN)

Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu = \mathbb{E}[X_1]$ (and $\mathbb{E}[|X_1|] < \infty$). Then:

$$\bar{X}_n \xrightarrow{\text{a.s.}} \mu,$$

that is, $\mathbb{P}\!\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$.

The WLLN controls $\mathbb{P}(|\bar{X}_n - \mu| \geq \epsilon)$ for each fixed $n$. The SLLN controls the event $\{|\bar{X}_n - \mu| \geq \epsilon \text{ infinitely often}\}$ and shows it has probability zero. The upgrade from "for each $n$" to "for all $n$ simultaneously" requires stronger tools; the Borel-Cantelli lemma is the key.
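The "eventually always close" behavior can be seen in simulation. The sketch below (an illustration, not part of the theorem's statement; the distribution, band width, and horizon are assumptions chosen for the demo) tracks one sample path of running means $\bar{X}_n$ for i.i.d. Uniform(0, 1) draws, where $\mu = 0.5$, and records the last index at which the path leaves the $\epsilon$-band around $\mu$:

```python
import random

# Illustrative sketch: simulate one sample path of running means X̄_n
# for i.i.d. Uniform(0, 1) draws (so μ = 0.5) and find the last index
# where the path leaves the ε-band around μ. The SLLN says that for
# almost every ω such a last exit exists: beyond it the path stays in
# the band forever.
random.seed(0)

n_max = 200_000   # assumed horizon for the demo
eps = 0.01        # assumed band half-width
mu = 0.5

total = 0.0
running_means = []
for n in range(1, n_max + 1):
    total += random.random()
    running_means.append(total / n)

# Last index where the running mean left the ε-band around μ.
last_exit = max(
    (n for n, m in enumerate(running_means, start=1) if abs(m - mu) >= eps),
    default=0,
)
print(f"final mean = {running_means[-1]:.4f}, "
      f"last exit from the ±{eps} band at n = {last_exit}")
```

On a finite horizon this only suggests the behavior, of course; the SLLN is the statement that, with probability one, a finite last-exit time exists for every $\epsilon > 0$.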


The Full Proof Under Finite Mean Only

The proof above used the finite fourth moment to get $\mathbb{P}(|\bar{X}_n - \mu| \geq \epsilon) = O(1/n^2)$, which is summable. Kolmogorov's proof of the SLLN under only $\mathbb{E}[|X_1|] < \infty$ uses a truncation argument: replace $X_i$ by $\tilde{X}_i = X_i \mathbf{1}_{|X_i| \leq i}$, use the Borel-Cantelli lemma to show that the truncated and original sequences agree for all but finitely many $i$, then control the truncated sums directly. This is considerably more technical, and we refer the interested reader to Durrett (2019), Chapter 2.
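The reason truncation costs nothing in the limit is the standard layer-cake fact that, for $X \geq 0$, $\mathbb{E}[X] < \infty$ if and only if $\sum_i \mathbb{P}(X > i) < \infty$; Borel-Cantelli then gives $X_i \neq \tilde{X}_i$ only finitely often, almost surely. A quick numerical sketch of the summability (the Exponential(1) choice is an illustrative assumption, not from the text):

```python
import math

# For X ~ Exponential(1): P(X > i) = e^{-i} and E[X] = 1 < ∞, so the
# tail-probability series Σ_i P(X > i) converges (to 1/(e - 1)). By
# Borel-Cantelli this means X_i ≠ X̃_i = X_i·1{|X_i| ≤ i} for only
# finitely many i almost surely, so truncation does not change the limit
# of the averages.
tail_sum = sum(math.exp(-i) for i in range(1, 200))
print(f"sum of P(X > i) ≈ {tail_sum:.4f}  (closed form: 1/(e - 1) ≈ {1 / (math.e - 1):.4f})")
```

Contrast this with a distribution of infinite mean, where the tail series diverges and the truncation bookkeeping breaks down, which is consistent with the SLLN's finite-mean hypothesis being essential.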

Example: Relative Frequency Interpretation of Probability

Let $A$ be an event with $\mathbb{P}(A) = p$, and let $X_i = \mathbf{1}_A$ for the $i$-th independent repetition. Show that the relative frequency $\bar{X}_n$ converges to $p$ almost surely.
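The argument is one line: the indicators $X_i = \mathbf{1}_A$ are i.i.d. with $\mathbb{E}[X_i] = \mathbb{P}(A) = p < \infty$, so the SLLN applies directly and $\bar{X}_n \to p$ almost surely. A numerical check of this (the value $p = 0.3$ and the sample sizes are assumptions chosen for illustration):

```python
import random

# Check: indicators X_i = 1_A are i.i.d. Bernoulli(p) with
# E[X_i] = P(A) = p, so the SLLN gives X̄_n → p almost surely.
# We watch the relative frequency of A over growing sample sizes.
random.seed(1)

p = 0.3  # assumed P(A) for the demo
for n in (100, 10_000, 1_000_000):
    freq = sum(random.random() < p for _ in range(n)) / n
    print(f"n = {n:>9,}: relative frequency = {freq:.4f}")
```

This is the sense in which the SLLN, rather than the WLLN, underwrites the frequency interpretation: it is the individual sample path of frequencies that settles down, not merely the probability of a deviation at each fixed $n$.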

Example: When the Distinction Matters: Infinite Horizon Guarantees

A communication system transmits i.i.d. symbols and we monitor the running error rate $\hat{P}_e(n) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}_{\{X_i \neq \hat{X}_i\}}$. Explain why the SLLN gives a stronger guarantee than the WLLN for long-term system operation.
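The point of the exercise can be previewed in simulation: the WLLN only bounds the probability of a deviation at each fixed $n$, while the SLLN says the monitored path $\hat{P}_e(n)$ leaves any tolerance band around the true error probability only finitely many times. The sketch below (the error probability $p_e = 0.05$, tolerance, and horizon are illustrative assumptions) counts those band exits along one path:

```python
import random

# Simulate i.i.d. error indicators 1{X_i ≠ X̂_i} ~ Bernoulli(p_e) and
# count how often the running error rate P̂_e(n) leaves the band
# p_e ± ε. The SLLN guarantees that, almost surely, only finitely many
# exits occur, so the exit count stabilizes as n grows.
random.seed(2)

p_e, eps, n_max = 0.05, 0.005, 500_000  # assumed demo parameters
errors = 0
exits = 0
last_exit = 0
for n in range(1, n_max + 1):
    errors += random.random() < p_e
    if abs(errors / n - p_e) >= eps:
        exits += 1
        last_exit = n
print(f"band exits: {exits}, last exit at n = {last_exit}")
```

An operator who needs "the measured error rate will eventually sit inside spec and stay there" is asking for exactly the almost-sure statement; convergence in probability alone would still permit infinitely many excursions.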

Historical Note: From Bernoulli to Kolmogorov

1713–1930

The earliest version of the law of large numbers is due to Jacob Bernoulli (1713), who proved a weak version for Bernoulli trials in his posthumous Ars Conjectandi. Poisson (1835) named it the "loi des grands nombres." The strong version was first proved by Émile Borel (1909) for the special case of fair coin flips, and by Francesco Cantelli (1917) in greater generality. The definitive result, the SLLN under only finite mean, was established by Kolmogorov (1930) using his truncation method. The proof we present, using the fourth moment and Borel-Cantelli, is a pedagogical compromise: cleaner than Kolmogorov's but under a stronger hypothesis.

Quick Check

In the finite-fourth-moment proof of the SLLN, why do we need $\mathbb{E}[S_n^4]$ rather than just $\mathbb{E}[S_n^2]$?

Because the fourth moment gives a tighter bound

Because $\sum_{n=1}^{\infty} 1/n$ diverges but $\sum_{n=1}^{\infty} 1/n^2$ converges

Because the fourth moment always exists for i.i.d. sequences

Because the second moment proof already gives a.s. convergence

Strong Law of Large Numbers

States that $\bar{X}_n \xrightarrow{\text{a.s.}} \mu$ for i.i.d. $\{X_i\}$ with finite mean. "Strong" refers to almost sure convergence, which is stronger than the "weak" law's convergence in probability.

Related: Weak Law of Large Numbers, Almost Sure Convergence