The Strong Law of Large Numbers
From 'Likely Close' to 'Eventually Always Close'
The WLLN says that for any fixed $\varepsilon > 0$, the probability of a deviation, $P(|\bar{X}_n - \mu| > \varepsilon)$, vanishes as $n \to \infty$. But it leaves open the possibility that for each sample point $\omega$, there are infinitely many $n$ where $\bar{X}_n$ strays far from $\mu$. The Strong Law eliminates this possibility: for almost every $\omega$, $\bar{X}_n(\omega)$ eventually stays within $\varepsilon$ of $\mu$ and never leaves again. This is a qualitatively stronger guarantee, and it is the version that justifies interpreting probability as long-run relative frequency.
Theorem: Strong Law of Large Numbers (SLLN)
Let $X_1, X_2, \dots$ be i.i.d. with mean $\mu = \mathbb{E}[X_1]$ (and $\mathbb{E}|X_1| < \infty$). Then:
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{\text{a.s.}} \mu,$$
that is, $P\left(\lim_{n\to\infty} \bar{X}_n = \mu\right) = 1$.
The WLLN controls $P(|\bar{X}_n - \mu| > \varepsilon)$ for each fixed $n$. The SLLN controls the event $\{|\bar{X}_n - \mu| > \varepsilon \text{ for infinitely many } n\}$ and shows it has probability zero. The upgrade from "for each $n$" to "for all large $n$ simultaneously" requires stronger tools; the Borel-Cantelli lemma is the key.
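The difference between the two events can be seen in simulation. A minimal sketch with fair coin flips (the seed, tolerance, and horizon are illustrative choices, not part of the theorem):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, eps, n_fixed, horizon, n_paths = 0.5, 0.05, 1000, 5000, 200

# 200 independent sequences of fair coin flips; running mean along each path.
flips = rng.integers(0, 2, size=(n_paths, horizon))
means = np.cumsum(flips, axis=1) / np.arange(1, horizon + 1)

# WLLN-type event: deviation at the single time n_fixed.
dev_at_n = np.mean(np.abs(means[:, n_fixed - 1] - mu) > eps)

# SLLN-type event: deviation at ANY time from n_fixed onward.
dev_after_n = np.mean((np.abs(means[:, n_fixed - 1:] - mu) > eps).any(axis=1))

print("fraction of paths deviating at n =", n_fixed, ":", dev_at_n)
print("fraction of paths ever deviating for n >=", n_fixed, ":", dev_after_n)
```

The second fraction is necessarily at least the first, since the "ever deviates" event contains the "deviates at $n_{\text{fixed}}$" event; the SLLN asserts that the second fraction also tends to 0 as the cutoff grows.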
Setup: finite fourth moment case
We prove the SLLN under the stronger assumption $\mathbb{E}[X_1^4] < \infty$. The full proof under only $\mathbb{E}|X_1| < \infty$ requires truncation arguments (see Durrett, Ch. 2).
Let $Y_i = X_i - \mu$, so that $\mathbb{E}[Y_i] = 0$ and $\mathbb{E}[Y_i^4] < \infty$.
Bound the fourth moment of $S_n = \sum_{i=1}^n Y_i$
Expand $\mathbb{E}[S_n^4]$. The sum expands into terms $\mathbb{E}[Y_i Y_j Y_k Y_l]$. Since the $Y_i$ are independent and zero-mean:
- Terms with any index appearing exactly once vanish (e.g., $\mathbb{E}[Y_1 Y_2^3] = \mathbb{E}[Y_1]\,\mathbb{E}[Y_2^3] = 0$).
- Surviving terms: $\mathbb{E}[Y_i^4]$ (one index repeated 4 times, $n$ such terms) and $\mathbb{E}[Y_i^2 Y_j^2] = \sigma^4$ with $i \neq j$ (two indices each repeated twice, $3n(n-1)$ such terms from the 6 pairings of 4 positions).
Therefore:
$$\mathbb{E}[S_n^4] = n\,\mathbb{E}[Y_1^4] + 3n(n-1)\,\sigma^4 \le C n^2$$
for some constant $C$ depending on $\mathbb{E}[Y_1^4]$ and $\sigma^4$.
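The term count above can be verified by brute force for a small $n$. A sketch that classifies every quadruple of indices by its repetition pattern:

```python
from itertools import product
from collections import Counter

def classify(idx):
    """Classify a quadruple (i, j, k, l) by its repetition pattern."""
    counts = sorted(Counter(idx).values())
    if counts == [4]:
        return "i^4"            # one index repeated four times
    if counts == [2, 2]:
        return "i^2 j^2"        # two distinct indices, each twice
    return "has singleton"      # some index appears exactly once -> term vanishes

n = 7  # illustrative small n
tally = Counter(classify(q) for q in product(range(n), repeat=4))

assert tally["i^4"] == n
assert tally["i^2 j^2"] == 3 * n * (n - 1)
print(tally)
```

Every quadruple not in the two surviving classes has a singleton index, so its expectation is zero by independence; the check confirms the counts $n$ and $3n(n-1)$.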
Apply Markov's inequality to the fourth moment
By Markov's inequality applied to $S_n^4$:
$$P\left(|\bar{X}_n - \mu| > \varepsilon\right) = P\left(|S_n| > n\varepsilon\right) = P\left(S_n^4 > n^4\varepsilon^4\right) \le \frac{\mathbb{E}[S_n^4]}{n^4\varepsilon^4} \le \frac{C}{n^2\varepsilon^4}.$$
Sum over $n$ and apply Borel-Cantelli
Since $\sum_{n=1}^{\infty} \frac{C}{n^2\varepsilon^4} < \infty$, the first Borel-Cantelli lemma gives:
$$P\left(|\bar{X}_n - \mu| > \varepsilon \text{ for infinitely many } n\right) = 0.$$
This holds for every $\varepsilon > 0$. Taking $\varepsilon = 1/k$ and a countable intersection over $k \in \mathbb{N}$:
$$P\left(\lim_{n\to\infty} \bar{X}_n = \mu\right) = 1.$$
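The summability step can be made concrete: by the union bound, $P(\exists\, n \ge N : |\bar{X}_n - \mu| > \varepsilon) \le \sum_{n \ge N} C/(n^2\varepsilon^4)$, a tail that shrinks to 0 as $N$ grows. A numeric sketch with illustrative constants ($C = 3$, as for standard normal $Y_i$ where $\mathbb{E}[Y_1^4] = 3$ and $\sigma^4 = 1$; $\varepsilon = 0.5$):

```python
# Tail of the summable bound: P(any deviation after time N) <= sum_{n >= N} C / (n^2 eps^4).
C, eps = 3.0, 0.5  # illustrative constants, not from the theorem

def tail_bound(N, terms=10**5):
    """Partial tail sum of C / (n^2 eps^4), starting at n = N."""
    return sum(C / (n * n * eps**4) for n in range(N, N + terms))

for N in (10**3, 10**4, 10**5):
    print(N, tail_bound(N))
```

The printed bounds decrease roughly like $1/N$, which is exactly why "deviations occur infinitely often" has probability zero.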
The Full Proof Under Finite Mean Only
The proof above used the finite fourth moment to get $P(|\bar{X}_n - \mu| > \varepsilon) \le C/(n^2\varepsilon^4)$, which is summable. Kolmogorov's proof of the SLLN under only $\mathbb{E}|X_1| < \infty$ uses a truncation argument: replace $X_i$ by $X_i \mathbf{1}_{\{|X_i| \le i\}}$, show that the truncated and original sums agree eventually, then apply the Borel-Cantelli lemma to the truncated sum. This is considerably more technical, and we refer the interested reader to Durrett (2019), Chapter 2.
Example: Relative Frequency Interpretation of Probability
Let $A$ be an event with $P(A) = p$, and let $X_i = \mathbf{1}_A$ for the $i$-th independent repetition of the experiment. Show that the relative frequency $\frac{1}{n}\sum_{i=1}^n X_i$ converges to $p$ almost surely.
Identify the setup
The $X_i$ are i.i.d. Bernoulli($p$) with $\mathbb{E}[X_i] = p < \infty$. The sample mean $\bar{X}_n$ equals the relative frequency of $A$ in $n$ trials.
Apply SLLN
By the SLLN: $\bar{X}_n \xrightarrow{\text{a.s.}} p$, that is, $P\left(\lim_{n\to\infty} \bar{X}_n = p\right) = 1$.
This is the mathematical justification for the frequentist interpretation of probability: the long-run relative frequency of an event equals its probability, with probability one.
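A quick simulation of this convergence (with an illustrative $p = 0.3$ and a fixed seed):

```python
import numpy as np

rng = np.random.default_rng(42)
p, n = 0.3, 200_000  # illustrative event probability and number of trials

# Indicator of the event A in each of n independent trials.
hits = rng.random(n) < p
rel_freq = np.cumsum(hits) / np.arange(1, n + 1)

print(f"relative frequency after {n} trials: {rel_freq[-1]:.4f}")  # close to 0.3
```

Plotting `rel_freq` against the trial index would show the characteristic early wobble followed by settling onto $p$.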
Example: When the Distinction Matters: Infinite Horizon Guarantees
A communication system transmits i.i.d. symbols, each received in error with probability $p$, and we monitor the running error rate $\hat{p}_n = \frac{1}{n}\sum_{i=1}^n E_i$, where $E_i$ indicates an error on symbol $i$. Explain why the SLLN gives a stronger guarantee than the WLLN for long-term system operation.
WLLN guarantee
The WLLN says: for any tolerance $\varepsilon > 0$, $P(|\hat{p}_n - p| > \varepsilon) \to 0$, so at any fixed large time horizon $n$ the observed rate is likely close to $p$. But it does not control the probability that $\hat{p}_n$ ever exceeds the tolerance during continuous operation.
SLLN guarantee
The SLLN says: with probability 1, there exists a (random) time $N$ such that $|\hat{p}_n - p| \le \varepsilon$ for all $n \ge N$.
For a system that runs indefinitely (like a base station), this is the right guarantee: the observed error rate will eventually settle near the true error rate and stay there — not just be likely to be close at any given instant.
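The random settling time $N$ can be estimated in simulation: for each path, record the last time the running error rate leaves the band $[p - \varepsilon,\, p + \varepsilon]$. A sketch with illustrative parameters and a fixed seed:

```python
import numpy as np

rng = np.random.default_rng(1)
p, eps, horizon, n_paths = 0.01, 0.005, 50_000, 100  # illustrative values

# Error indicators and running error rate for each simulated path.
errors = rng.random((n_paths, horizon)) < p
rates = np.cumsum(errors, axis=1) / np.arange(1, horizon + 1)

# Last time each path's running rate sits outside the band [p - eps, p + eps].
outside = np.abs(rates - p) > eps
last_exit = np.where(outside.any(axis=1),
                     horizon - 1 - np.argmax(outside[:, ::-1], axis=1),
                     -1)
print("worst-case last exit time over all paths:", last_exit.max())
```

Every path exits the band early on (at $n = 1$ the rate is 0 or 1) but settles inside it well before the horizon; the SLLN says this settling happens with probability 1, though the settling time differs from path to path.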
Historical Note: From Bernoulli to Kolmogorov
1713–1930. The earliest version of the law of large numbers is due to Jacob Bernoulli (1713), who proved a weak version for Bernoulli trials in his posthumous Ars Conjectandi. Poisson (1835) named it the "loi des grands nombres." The strong version was first proved by Émile Borel (1909) for the special case of fair coin flips, and by Francesco Cantelli (1917) in greater generality. The definitive result, the SLLN under only finite mean, was established by Kolmogorov (1930) using his truncation method. The proof we present, using the fourth moment and Borel-Cantelli, is a pedagogical compromise: cleaner than Kolmogorov's but under a stronger hypothesis.
Quick Check
In the finite-fourth-moment proof of the SLLN, why do we need $\mathbb{E}[X_1^4] < \infty$ rather than just $\mathbb{E}[X_1^2] < \infty$?
Because the fourth moment gives a tighter bound
Because $\sum_n 1/n$ diverges but $\sum_n 1/n^2$ converges
Because the fourth moment always exists for i.i.d. sequences
Because the second moment proof already gives a.s. convergence
Using the second moment (Chebyshev) gives $P(|\bar{X}_n - \mu| > \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2}$, and $\sum_n \frac{1}{n} = \infty$, so Borel-Cantelli does not apply. The fourth moment gives a bound of order $\frac{1}{n^2}$, and $\sum_n \frac{1}{n^2} < \infty$.
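The contrast between the two series is easy to see numerically (partial sums up to $10^6$):

```python
import math

# Partial sums: sum 1/n diverges (grows like ln n + gamma),
# while sum 1/n^2 converges (to pi^2 / 6).
H = s2 = 0.0
for n in range(1, 1_000_001):
    H += 1.0 / n
    s2 += 1.0 / (n * n)

print(f"sum 1/n   up to 10^6: {H:.3f}")   # keeps growing like ln(n)
print(f"sum 1/n^2 up to 10^6: {s2:.6f}")  # approaches pi^2 / 6
print(f"pi^2 / 6            : {math.pi**2 / 6:.6f}")
```

Doubling the upper limit adds about $\ln 2 \approx 0.69$ to the first sum forever, while the second sum is already within $10^{-6}$ of its limit.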
Strong Law of Large Numbers
States that $\bar{X}_n \to \mu$ almost surely for i.i.d. $X_i$ with finite mean $\mu$. "Strong" refers to almost sure convergence, which is stronger than the "weak" law's convergence in probability.