The Law of Large Numbers and Central Limit Theorem
The Two Great Limit Theorems
The law of large numbers (LLN) and the central limit theorem (CLT) are among the most important results in probability theory. The LLN says that sample averages converge to the true mean; the CLT says how: for large $n$, the fluctuations of the sample average around the mean are approximately Gaussian with standard deviation $\sigma/\sqrt{n}$.
Both theorems have elegant proofs via characteristic functions (CFs). The strategy is the same: compute the CF of the (normalized) partial sum, show it converges pointwise to a known CF, and invoke the Levy continuity theorem.
Definition: Convergence in Distribution
A sequence of random variables $X_1, X_2, \ldots$ converges in distribution to $X$, written $X_n \xrightarrow{d} X$, if
$\lim_{n \to \infty} F_{X_n}(x) = F_X(x)$ at every continuity point $x$ of $F_X$.
Convergence in distribution is the weakest of the standard modes of convergence. It does not require the $X_n$ to be defined on the same probability space. In particular, $X_n \xrightarrow{d} c$ (a constant) is equivalent to $X_n \xrightarrow{p} c$ (convergence in probability).
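As a small illustration of the definition (a hypothetical example, not taken from the text above), let $X_n$ be uniform on the grid $\{1/n, 2/n, \ldots, 1\}$. Its CDF is $F_n(x) = \lfloor nx \rfloor / n$ on $[0, 1)$, which converges to the Uniform(0, 1) CDF $F(x) = x$ at every $x$. The sketch below checks this numerically on a grid; all function names are illustrative.

```python
import math

def cdf_discrete_uniform(x, n):
    """CDF of X_n uniform on {1/n, 2/n, ..., 1} (illustrative example)."""
    if x < 0:
        return 0.0
    if x >= 1:
        return 1.0
    return math.floor(n * x) / n

def cdf_uniform01(x):
    """CDF of the limiting Uniform(0, 1) variable."""
    return min(max(x, 0.0), 1.0)

def max_gap(n, grid_size=1000):
    """Largest |F_n(x) - F(x)| over an evenly spaced grid in [0, 1]."""
    xs = [k / grid_size for k in range(grid_size + 1)]
    return max(abs(cdf_discrete_uniform(x, n) - cdf_uniform01(x)) for x in xs)

# The gap never exceeds 1/n, so it shrinks as n grows.
for n in (5, 50, 500):
    print(n, max_gap(n))
```

Here the limit CDF is continuous, so convergence holds at every point; the restriction to continuity points in the definition matters only when the limit has jumps.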
Theorem: Levy Continuity Theorem
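Let $\{X_n\}$ be a sequence of random variables with CFs $\phi_n(\omega)$.
(a) If $X_n \xrightarrow{d} X$, where $X$ has CF $\phi(\omega)$, then $\phi_n(\omega) \to \phi(\omega)$ for all $\omega \in \mathbb{R}$.
(b) Conversely, if $\phi(\omega) = \lim_{n \to \infty} \phi_n(\omega)$ exists for all $\omega$ and is continuous at $\omega = 0$, then $\phi$ is a valid CF of some CDF $F$, and $X_n \xrightarrow{d} X$ where $X \sim F$.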
Convergence in distribution is equivalent to pointwise convergence of CFs, provided the limit is continuous at the origin. This is the bridge that allows us to prove limit theorems by working entirely in the transform domain.
Forward direction (sketch)
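$\phi_n(\omega) = \mathbb{E}[e^{j\omega X_n}]$. Since $x \mapsto e^{j\omega x}$ is bounded and continuous, and $X_n \to X$ in distribution, the Helly-Bray theorem gives $\mathbb{E}[e^{j\omega X_n}] \to \mathbb{E}[e^{j\omega X}] = \phi(\omega)$.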
Converse (sketch)
The sequence $\{F_n\}$ of CDFs is tight (since the limit $\phi$ is continuous at $\omega = 0$, mass cannot escape to infinity). By Prohorov's theorem, every subsequence has a further subsequence converging to some CDF $F$. By the forward direction, the CF of $F$ must be $\phi$. By uniqueness of the CF, all subsequential limits agree, so $F_n \to F$ in distribution.
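To see why continuity at the origin cannot be dropped, consider the hypothetical example $X_n \sim \mathcal{N}(0, n)$ (chosen here for illustration). Its CF is $e^{-n\omega^2/2}$, whose pointwise limit equals $1$ at $\omega = 0$ and $0$ elsewhere, so the limit is discontinuous at the origin. Correspondingly, the CDFs flatten toward $1/2$ at every fixed $x$, which is not a valid CDF: the mass escapes to infinity. A minimal numerical sketch:

```python
import math
from statistics import NormalDist

def cf_gaussian(omega, variance):
    """CF of a zero-mean Gaussian: exp(-variance * omega^2 / 2)."""
    return math.exp(-0.5 * variance * omega ** 2)

def cdf_gaussian(x, variance):
    """CDF of a zero-mean Gaussian with the given variance."""
    return NormalDist(mu=0.0, sigma=math.sqrt(variance)).cdf(x)

# Columns: n, CF at omega = 0, CF at omega = 0.1, CDF at x = 5
for n in (1, 100, 10_000, 1_000_000):
    print(n, cf_gaussian(0.0, n), cf_gaussian(0.1, n), cdf_gaussian(5.0, n))
```

The CF stays at $1$ at the origin but collapses to $0$ at any fixed nonzero $\omega$, while the CDF at any fixed $x$ drifts toward $1/2$.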
Theorem: The Law of Large Numbers (via CF)
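Let $X_1, X_2, \ldots$ be i.i.d. with finite mean $\mu = \mathbb{E}[X_i]$ and partial sum $S_n = \sum_{i=1}^{n} X_i$. Then $S_n/n \xrightarrow{p} \mu$ as $n \to \infty$.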
The sample average $S_n/n$ concentrates around the true mean $\mu$. As $n$ grows, the CF of $S_n/n$ converges to $e^{j\omega\mu}$, which is the CF of the constant $\mu$. A degenerate (constant) limit in distribution implies convergence in probability.
Compute the CF of $Z_n = S_n/n$
By the scaling property and independence: $\phi_{Z_n}(\omega) = \phi_{S_n}(\omega/n) = [\phi_X(\omega/n)]^n$.
Taylor-expand the CF
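Since $\mathbb{E}[|X|] < \infty$, the theorem on moments from the characteristic function gives: $\phi_X(\omega) = 1 + j\mu\omega + o(\omega)$ as $\omega \to 0$.
Substituting $\omega \to \omega/n$: $\phi_X(\omega/n) = 1 + \frac{j\mu\omega}{n} + o(1/n)$.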
Take the $n$-th power
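$\phi_{Z_n}(\omega) = \left[1 + \frac{j\mu\omega}{n} + o(1/n)\right]^n \to e^{j\mu\omega}$ as $n \to \infty$, which is the CF of the constant $\mu$.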
Apply the continuity theorem
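$e^{j\mu\omega}$ is continuous at $\omega = 0$, so by the Levy continuity theorem, $Z_n \xrightarrow{d} \mu$. Because the limit is a constant, this is equivalent to $Z_n \xrightarrow{p} \mu$.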
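The proof can be checked numerically for a distribution whose CF is known in closed form. The sketch below assumes Exponential(1) summands (an illustrative choice, with $\mu = 1$ and CF $1/(1 - j\omega)$) and compares $[\phi_X(\omega/n)]^n$ with the CF $e^{j\omega\mu}$ of the constant $\mu$ at a single frequency.

```python
import cmath

def cf_exponential(omega):
    """CF of an Exponential(1) variable (mean 1): 1 / (1 - j*omega)."""
    return 1.0 / (1.0 - 1j * omega)

def cf_sample_mean(omega, n):
    """CF of Z_n = S_n / n for i.i.d. summands: [phi_X(omega / n)]^n."""
    return cf_exponential(omega / n) ** n

def cf_constant(omega, mu=1.0):
    """CF of the constant mu: exp(j*omega*mu)."""
    return cmath.exp(1j * omega * mu)

omega = 2.0  # arbitrary test frequency
for n in (1, 10, 100, 10_000):
    err = abs(cf_sample_mean(omega, n) - cf_constant(omega))
    print(n, err)
```

For this example the printed error shrinks roughly in proportion to $1/n$, which matches the second-order term of the expansion when the variance is finite; the LLN itself needs only the first-order term.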
Theorem: The Central Limit Theorem (via CF)
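Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu$ and variance $0 < \sigma^2 < \infty$. Then $\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1)$ as $n \to \infty$.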
After centering and scaling, the distribution of the partial sum approaches a Gaussian. The CF of the standardized sum converges to $e^{-\omega^2/2}$, the CF of the standard Gaussian. This is because the higher cumulants ($\kappa_k$, $k \geq 3$) of the standardized sum scale as $n^{1-k/2}$ relative to $\kappa_2$, so they vanish in the limit.
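The cumulant scaling can be made explicit with a short calculation, sketched here under the assumption that the cumulants $\kappa_k(Y)$ of the standardized summand $Y = (X - \mu)/\sigma$ exist (the CLT itself needs only a finite variance). Cumulants add over independent summands and are homogeneous of degree $k$ under scaling, so for $Z_n = n^{-1/2} \sum_{i=1}^{n} Y_i$:

$$\kappa_k(Z_n) = n \, \kappa_k\!\left(\frac{Y}{\sqrt{n}}\right) = n \cdot n^{-k/2} \, \kappa_k(Y) = n^{1 - k/2} \, \kappa_k(Y).$$

For $k = 2$ this gives $\kappa_2(Z_n) = 1$ for every $n$, while for $k = 3$ and $k = 4$ the skewness and excess kurtosis decay as $n^{-1/2}$ and $n^{-1}$ respectively.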
Standardize
Let $Y_i = (X_i - \mu)/\sigma$ with $\mathbb{E}[Y_i] = 0$, $\mathrm{Var}(Y_i) = 1$. Let $\phi_Y$ denote the common CF of the $Y_i$.
Taylor-expand $\phi_Y$
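Since $\mathbb{E}[Y^2] = 1 < \infty$: $\phi_Y(\omega) = 1 - \frac{\omega^2}{2} + o(\omega^2)$ as $\omega \to 0$.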
CF of the standardized sum
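$Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Y_i = \frac{S_n - n\mu}{\sigma\sqrt{n}}$ has CF $\phi_{Z_n}(\omega) = [\phi_Y(\omega/\sqrt{n})]^n = \left[1 - \frac{\omega^2}{2n} + o(1/n)\right]^n$.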
Take the limit
As $n \to \infty$, $\phi_{Z_n}(\omega) \to e^{-\omega^2/2}$, which is the CF of $\mathcal{N}(0, 1)$.
Conclude
By the Levy continuity theorem, $\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1)$.
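The convergence of the CF can also be observed numerically. The sketch below assumes centered Exponential(1) summands, $Y = X - 1$ (an illustrative choice with zero mean, unit variance, and CF $e^{-j\omega}/(1 - j\omega)$), and evaluates the CF of the standardized sum, $[\phi_Y(\omega/\sqrt{n})]^n$, against the Gaussian CF $e^{-\omega^2/2}$.

```python
import cmath
import math

def cf_centered_exponential(omega):
    """CF of Y = X - 1 with X ~ Exponential(1), so E[Y] = 0 and Var(Y) = 1."""
    return cmath.exp(-1j * omega) / (1.0 - 1j * omega)

def cf_standardized_sum(omega, n):
    """CF of (Y_1 + ... + Y_n) / sqrt(n): [phi_Y(omega / sqrt(n))]^n."""
    return cf_centered_exponential(omega / math.sqrt(n)) ** n

def cf_standard_gaussian(omega):
    """CF of N(0, 1): exp(-omega^2 / 2), purely real."""
    return math.exp(-0.5 * omega ** 2)

omega = 1.5  # arbitrary test frequency
# Columns: n, real part, imaginary part, Gaussian CF at the same omega
for n in (1, 10, 100, 10_000):
    z = cf_standardized_sum(omega, n)
    print(n, z.real, z.imag, cf_standard_gaussian(omega))
```

The real part approaches the Gaussian value while the imaginary part, which reflects the skewness of the summands, shrinks toward zero as $n$ grows.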
The CLT in Action: CF Convergence to Gaussian
Watch how the characteristic function of the standardized sum converges to the Gaussian CF as $n$ increases. The real part (top) and imaginary part (bottom) are shown.
Visualizing the Central Limit Theorem
Historical Note: The Long Road to the Central Limit Theorem
18th-20th century
The CLT has a rich history spanning three centuries. De Moivre (1733) proved the first version for Bernoulli trials. Laplace (1810) extended it using generating functions. Chebyshev (1887) attempted a proof via moments, which Markov (1898) completed. The modern proof via characteristic functions, clean and general, is due to Levy (1925) and Lindeberg (1922). Lindeberg's condition, the most general sufficient condition for the CLT, removes the identical distribution requirement, needing only that no single summand dominates.
The CLT is arguably the most practically important theorem in all of mathematics: it explains why the Gaussian distribution appears so ubiquitously in nature and engineering.
Common Mistake: The CLT Is About the Limit, Not the Rate
Mistake:
Assuming that the CLT implies a good Gaussian approximation for small $n$. The CLT says nothing about the quality of the approximation for finite $n$.
Correction:
The Berry-Esseen theorem quantifies the rate: the CDF error is bounded by $\frac{C\rho}{\sigma^3\sqrt{n}}$ where $\rho = \mathbb{E}[|X - \mu|^3]$. The constant satisfies $C \leq 0.4748$ (Shevtsova, 2011). For heavy-tailed or asymmetric distributions, convergence can be slow. Always check with simulations or Berry-Esseen before trusting the Gaussian approximation for moderate $n$.
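As a rough check of this advice, the sketch below compares the Berry-Esseen bound with the exact worst-case CDF error for sums of Bernoulli($p$) variables, where the exact distribution (binomial) is available. The Bernoulli example, the function names, and the use of the constant $0.4748$ attributed to Shevtsova (2011) are assumptions made for illustration.

```python
import math
from statistics import NormalDist

C_BERRY_ESSEEN = 0.4748  # assumed constant, attributed to Shevtsova (2011)

def berry_esseen_bound(p, n):
    """Bound C * rho / (sigma^3 * sqrt(n)) for Bernoulli(p) summands."""
    sigma = math.sqrt(p * (1 - p))
    rho = p * (1 - p) * ((1 - p) ** 2 + p ** 2)  # E|X - p|^3
    return C_BERRY_ESSEEN * rho / (sigma ** 3 * math.sqrt(n))

def exact_cdf_error(p, n):
    """sup_x |P(standardized sum <= x) - Phi(x)| for a Binomial(n, p) sum."""
    phi = NormalDist().cdf
    mean, sd = n * p, math.sqrt(n * p * (1 - p))
    worst, cdf_below = 0.0, 0.0
    for k in range(n + 1):
        pmf = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        gauss = phi((k - mean) / sd)
        # The supremum is attained at a jump: check just before and at the jump.
        worst = max(worst, abs(cdf_below - gauss), abs(cdf_below + pmf - gauss))
        cdf_below += pmf
    return worst

for p in (0.5, 0.05):
    for n in (30, 300):
        print(p, n, exact_cdf_error(p, n), berry_esseen_bound(p, n))
```

For the skewed case $p = 0.05$, both the exact error and the bound are noticeably larger than for $p = 0.5$ at the same $n$, illustrating that asymmetry slows convergence.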
Quick Check
In the CLT proof via CFs, what is the key property of the Gaussian CF that ensures the limit is well-defined?
It is continuous at $\omega = 0$, satisfying the hypothesis of the Levy continuity theorem
It is bounded above by $1$
It is the only CF that is real-valued
It is analytic on the whole real line
The Levy continuity theorem requires that the limiting function be continuous at the origin. Since $e^{-\omega^2/2}$ is smooth everywhere (and equals $1$ at $\omega = 0$), the theorem applies and guarantees convergence in distribution.