The Strong Law of Large Numbers

From 'Likely Close' to 'Eventually Always Close'

The WLLN says that for any fixed $\epsilon > 0$, the probability of a deviation $|\bar{X}_n - \mu| \geq \epsilon$ vanishes. But it leaves open the possibility that, for some outcomes $\omega$, there are infinitely many $n$ where $\bar{X}_n(\omega)$ strays far from $\mu$. The Strong Law eliminates this possibility: for almost every $\omega$, $\bar{X}_n(\omega)$ eventually stays within $\epsilon$ of $\mu$ and never leaves again. This is a qualitatively stronger guarantee, and it is the version that justifies interpreting probability as long-run relative frequency.

Theorem: Strong Law of Large Numbers (SLLN)

Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu = \mathbb{E}[X_1]$ (and $\mathbb{E}[|X_1|] < \infty$). Then:

$$\bar{X}_n \xrightarrow{\text{a.s.}} \mu,$$

that is, $\mathbb{P}\!\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$.

The WLLN controls $\mathbb{P}(|\bar{X}_n - \mu| \geq \epsilon)$ for each fixed $n$. The SLLN controls the event $\{|\bar{X}_n - \mu| \geq \epsilon \text{ infinitely often}\}$ and shows it has probability zero. The upgrade from "for each $n$" to "for all $n$ simultaneously" requires stronger tools; the Borel-Cantelli lemma is the key.
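The "eventually always close" behavior can be seen in simulation. The sketch below (an illustration, not part of the theorem's statement; the distribution, band width, and horizon are assumptions chosen for the demo) tracks one sample path of running means $\bar{X}_n$ for i.i.d. Uniform(0, 1) draws, where $\mu = 0.5$, and records the last index at which the path leaves the $\epsilon$-band around $\mu$:

```python
import random

# Illustrative sketch: simulate one sample path of running means X̄_n
# for i.i.d. Uniform(0, 1) draws (so μ = 0.5) and find the last index
# where the path leaves the ε-band around μ. The SLLN says that for
# almost every ω such a last exit exists: beyond it the path stays in
# the band forever.
random.seed(0)

n_max = 200_000   # assumed horizon for the demo
eps = 0.01        # assumed band half-width
mu = 0.5

total = 0.0
running_means = []
for n in range(1, n_max + 1):
    total += random.random()
    running_means.append(total / n)

# Last index where the running mean left the ε-band around μ.
last_exit = max(
    (n for n, m in enumerate(running_means, start=1) if abs(m - mu) >= eps),
    default=0,
)
print(f"final mean = {running_means[-1]:.4f}, "
      f"last exit from the ±{eps} band at n = {last_exit}")
```

On a finite horizon this only suggests the behavior, of course; the SLLN is the statement that, with probability one, a finite last-exit time exists for every $\epsilon > 0$.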


The Full Proof Under Finite Mean Only

The proof above used the finite fourth moment to get $\mathbb{P}(|\bar{X}_n - \mu| \geq \epsilon) = O(1/n^2)$, which is summable. Kolmogorov's proof of the SLLN under only $\mathbb{E}[|X_1|] < \infty$ uses a truncation argument: replace $X_i$ by $\tilde{X}_i = X_i \mathbf{1}_{|X_i| \leq i}$, use the Borel-Cantelli lemma to show that the truncated and original sequences agree for all but finitely many $i$, then control the truncated sums directly. This is considerably more technical, and we refer the interested reader to Durrett (2019), Chapter 2.
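The reason truncation costs nothing in the limit is the standard layer-cake fact that, for $X \geq 0$, $\mathbb{E}[X] < \infty$ if and only if $\sum_i \mathbb{P}(X > i) < \infty$; Borel-Cantelli then gives $X_i \neq \tilde{X}_i$ only finitely often, almost surely. A quick numerical sketch of the summability (the Exponential(1) choice is an illustrative assumption, not from the text):

```python
import math

# For X ~ Exponential(1): P(X > i) = e^{-i} and E[X] = 1 < ∞, so the
# tail-probability series Σ_i P(X > i) converges (to 1/(e - 1)). By
# Borel-Cantelli this means X_i ≠ X̃_i = X_i·1{|X_i| ≤ i} for only
# finitely many i almost surely, so truncation does not change the limit
# of the averages.
tail_sum = sum(math.exp(-i) for i in range(1, 200))
print(f"sum of P(X > i) ≈ {tail_sum:.4f}  (closed form: 1/(e - 1) ≈ {1 / (math.e - 1):.4f})")
```

Contrast this with a distribution of infinite mean, where the tail series diverges and the truncation bookkeeping breaks down, which is consistent with the SLLN's finite-mean hypothesis being essential.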

Example: Relative Frequency Interpretation of Probability

Let $A$ be an event with $\mathbb{P}(A) = p$, and let $X_i = \mathbf{1}_A$ for the $i$-th independent repetition. Show that the relative frequency $\bar{X}_n$ converges to $p$ almost surely.
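The argument is one line: the indicators $X_i = \mathbf{1}_A$ are i.i.d. with $\mathbb{E}[X_i] = \mathbb{P}(A) = p < \infty$, so the SLLN applies directly and $\bar{X}_n \to p$ almost surely. A numerical check of this (the value $p = 0.3$ and the sample sizes are assumptions chosen for illustration):

```python
import random

# Check: indicators X_i = 1_A are i.i.d. Bernoulli(p) with
# E[X_i] = P(A) = p, so the SLLN gives X̄_n → p almost surely.
# We watch the relative frequency of A over growing sample sizes.
random.seed(1)

p = 0.3  # assumed P(A) for the demo
for n in (100, 10_000, 1_000_000):
    freq = sum(random.random() < p for _ in range(n)) / n
    print(f"n = {n:>9,}: relative frequency = {freq:.4f}")
```

This is the sense in which the SLLN, rather than the WLLN, underwrites the frequency interpretation: it is the individual sample path of frequencies that settles down, not merely the probability of a deviation at each fixed $n$.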

Example: When the Distinction Matters: Infinite Horizon Guarantees

A communication system transmits i.i.d. symbols and we monitor the running error rate $\hat{P}_e(n) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}_{\{X_i \neq \hat{X}_i\}}$. Explain why the SLLN gives a stronger guarantee than the WLLN for long-term system operation.
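The point of the exercise can be previewed in simulation: the WLLN only bounds the probability of a deviation at each fixed $n$, while the SLLN says the monitored path $\hat{P}_e(n)$ leaves any tolerance band around the true error probability only finitely many times. The sketch below (the error probability $p_e = 0.05$, tolerance, and horizon are illustrative assumptions) counts those band exits along one path:

```python
import random

# Simulate i.i.d. error indicators 1{X_i ≠ X̂_i} ~ Bernoulli(p_e) and
# count how often the running error rate P̂_e(n) leaves the band
# p_e ± ε. The SLLN guarantees that, almost surely, only finitely many
# exits occur, so the exit count stabilizes as n grows.
random.seed(2)

p_e, eps, n_max = 0.05, 0.005, 500_000  # assumed demo parameters
errors = 0
exits = 0
last_exit = 0
for n in range(1, n_max + 1):
    errors += random.random() < p_e
    if abs(errors / n - p_e) >= eps:
        exits += 1
        last_exit = n
print(f"band exits: {exits}, last exit at n = {last_exit}")
```

An operator who needs "the measured error rate will eventually sit inside spec and stay there" is asking for exactly the almost-sure statement; convergence in probability alone would still permit infinitely many excursions.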

Historical Note: From Bernoulli to Kolmogorov

1713–1930

The earliest version of the law of large numbers is due to Jacob Bernoulli (1713), who proved a weak version for Bernoulli trials in his posthumous Ars Conjectandi. Poisson (1835) named it the "loi des grands nombres." The strong version was first proved by Émile Borel (1909) for the special case of fair coin flips, and by Francesco Cantelli (1917) in greater generality. The definitive result, the SLLN under only finite mean, was established by Kolmogorov (1930) using his truncation method. The proof we present, using the fourth moment and Borel-Cantelli, is a pedagogical compromise: cleaner than Kolmogorov's but under a stronger hypothesis.

Quick Check

In the finite-fourth-moment proof of the SLLN, why do we need $\mathbb{E}[S_n^4]$ rather than just $\mathbb{E}[S_n^2]$?

Because the fourth moment gives a tighter bound

Because $\sum_{n=1}^{\infty} 1/n$ diverges but $\sum_{n=1}^{\infty} 1/n^2$ converges

Because the fourth moment always exists for i.i.d. sequences

Because the second moment proof already gives a.s. convergence

Strong Law of Large Numbers

States that $\bar{X}_n \xrightarrow{\text{a.s.}} \mu$ for i.i.d. $\{X_i\}$ with finite mean. "Strong" refers to almost sure convergence, which is stronger than the "weak" law's convergence in probability.

Related: Weak Law of Large Numbers, Almost Sure Convergence