The Weak Law of Large Numbers

Why the Law of Large Numbers Matters

The sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ is the most basic statistical estimator. Every Monte Carlo simulation, every sample average in a communication receiver, every training loss in machine learning relies on the principle that averaging many independent copies of a random quantity produces something close to the true mean. The Weak Law of Large Numbers makes this precise: $\bar{X}_n$ converges to $\mu$ in probability. The proof is a clean application of Chebyshev's inequality, and the simplicity of the argument is part of its beauty.

Theorem: Weak Law of Large Numbers (WLLN)

Let $X_1, X_2, \ldots$ be i.i.d. random variables with mean $\mu = \mathbb{E}[X_1]$ and finite variance $\sigma^2 = \text{Var}(X_1) < \infty$. Then the sample mean converges to $\mu$ in probability:

$$\bar{X}_n \xrightarrow{P} \mu,$$

that is, for every $\epsilon > 0$:

$$\lim_{n \to \infty} \mathbb{P}\!\left(|\bar{X}_n - \mu| \geq \epsilon\right) = 0.$$

The variance of $\bar{X}_n$ is $\sigma^2/n$, which shrinks to zero. Chebyshev's inequality translates vanishing variance into vanishing tail probability. The more samples we average, the tighter the distribution of $\bar{X}_n$ concentrates around $\mu$.
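Spelled out, the proof is a single application of Chebyshev's inequality to $\bar{X}_n$, using $\mathbb{E}[\bar{X}_n] = \mu$ and $\text{Var}(\bar{X}_n) = \sigma^2/n$:

$$\mathbb{P}\!\left(|\bar{X}_n - \mu| \geq \epsilon\right) \;\leq\; \frac{\text{Var}(\bar{X}_n)}{\epsilon^2} \;=\; \frac{\sigma^2}{n\epsilon^2} \;\longrightarrow\; 0 \quad \text{as } n \to \infty.$$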


Alternative Proof via Characteristic Functions

The WLLN can also be proved using characteristic functions, as Caire does in the course. The Ch.F. of $\bar{X}_n$ is $\phi_{\bar{X}_n}(u) = \left(\phi_X(u/n)\right)^n$. Taylor-expanding $\phi_X(u/n) = 1 + j\mu u/n + o(1/n)$ and using the limit $(1 + a/n + o(1/n))^n \to e^a$, we get $\phi_{\bar{X}_n}(u) \to e^{j\mu u}$, which is the Ch.F. of the constant $\mu$. By the Lévy continuity theorem, $\bar{X}_n \xrightarrow{d} \mu$, and since the limit is a constant, convergence in distribution upgrades to convergence in probability.
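In display form, with independence supplying the factorization into a product of identical Ch.F.s:

$$\phi_{\bar{X}_n}(u) = \mathbb{E}\!\left[e^{ju\bar{X}_n}\right] = \prod_{i=1}^{n} \mathbb{E}\!\left[e^{j(u/n)X_i}\right] = \left(\phi_X(u/n)\right)^n = \left(1 + \frac{j\mu u}{n} + o(1/n)\right)^{n} \longrightarrow e^{j\mu u}.$$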

This proof requires only $\mathbb{E}[|X_1|] < \infty$ (no finite variance needed), so it is strictly more general than the Chebyshev proof above.

Example: Empirical Frequency of a Biased Coin

Let $X_1, X_2, \ldots$ be i.i.d. $\text{Bernoulli}(p)$ with $p = 0.3$. How many coin tosses $n$ are needed so that the empirical frequency $\bar{X}_n$ is within $0.01$ of the true probability $p$ with probability at least $0.95$?
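One sufficient (and deliberately conservative) answer, using the Chebyshev bound from the proof above, with $\text{Var}(X_1) = p(1-p) = 0.21$, $\epsilon = 0.01$, and failure probability $\delta = 0.05$:

$$\mathbb{P}\!\left(|\bar{X}_n - p| \geq 0.01\right) \leq \frac{p(1-p)}{n(0.01)^2} \leq 0.05
\quad\Longleftrightarrow\quad
n \geq \frac{0.21}{(0.01)^2 (0.05)} = 42{,}000.$$

The CLT approximation of Section 11.4 would instead suggest roughly $n \approx (1.96)^2 (0.21)/(0.01)^2 \approx 8{,}100$, a preview of how loose the Chebyshev bound is.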

WLLN in Action: Sample Mean Trajectories

Watch multiple independent trajectories of $\bar{X}_n$ converge to $\mu$ as $n$ grows. Choose the underlying distribution and observe how the convergence rate depends on the variance.

[Interactive demo controls. Parameters: Bernoulli: p; Exponential: lambda; Uniform: b (on [0,b]); Gaussian: sigma.]
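A minimal offline sketch of the same experiment, for readers without the interactive demo (the Bernoulli parameter, sample size, and number of trajectories below are illustrative choices, not the demo's defaults):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean_trajectories(draw, n_samples=2000, n_paths=10):
    """Return an (n_paths, n_samples) array whose [k, i] entry is the
    sample mean of the first i+1 i.i.d. draws along trajectory k."""
    x = draw(size=(n_paths, n_samples))
    return np.cumsum(x, axis=1) / np.arange(1, n_samples + 1)

# Bernoulli(p = 0.3): every trajectory should settle near mu = 0.3.
paths = sample_mean_trajectories(lambda size: rng.binomial(1, 0.3, size=size))
print(paths[:, [9, 99, 999, 1999]])  # sample means after 10, 100, 1000, 2000 tosses
```

Swapping in `rng.exponential`, `rng.uniform`, or `rng.normal` for the draw shows how a larger variance slows the visible concentration around $\mu$.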

Why This Matters: Monte Carlo Simulation in Communications

Every bit error rate (BER) simulation in wireless communications is a direct application of the WLLN. We transmit $n$ symbols through a simulated channel, count the errors $E$, and report $\hat{P}_e = E/n$ as the estimated error probability. The WLLN guarantees $\hat{P}_e \xrightarrow{P} P_e$ as $n \to \infty$.

The practical question is: how large must $n$ be? For $P_e = 10^{-5}$ (a typical target in 5G), the Chebyshev bound suggests $n \approx P_e(1-P_e)/(\epsilon^2 \delta)$, but the CLT (Section 11.4) gives the more useful rule of thumb: we need to observe roughly $100$ errors, so $n \approx 100/P_e = 10^7$ symbol transmissions.
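As a concrete illustration, here is a minimal Monte Carlo BER sketch, assuming BPSK over a real AWGN channel (the modulation, SNR value, and sample sizes are assumptions made for the example, not specified in the text):

```python
import numpy as np

rng = np.random.default_rng(1)

def estimate_ber(snr_db: float, n_symbols: int) -> float:
    """Monte Carlo estimate of the bit error rate for BPSK over real AWGN."""
    bits = rng.integers(0, 2, n_symbols)
    symbols = 2.0 * bits - 1.0                         # map {0, 1} -> {-1, +1}, unit energy
    noise_std = np.sqrt(0.5 / 10 ** (snr_db / 10.0))   # variance N0/2 with SNR = Es/N0
    received = symbols + noise_std * rng.standard_normal(n_symbols)
    errors = np.count_nonzero((received > 0).astype(int) != bits)
    return errors / n_symbols                          # estimated P_e = E / n

# By the WLLN, the estimate stabilizes around the true P_e as n grows.
for n in (10**3, 10**5, 10**6):
    print(f"n = {n:>7d}   P_e_hat = {estimate_ber(6.0, n):.5f}")
```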

See full treatment in The Linear MMSE Estimator

⚠️Engineering Note

Confidence Intervals for Monte Carlo BER Estimates

The Chebyshev-based WLLN bound is overly conservative for practical Monte Carlo design. In practice, we use the CLT approximation:

$$\hat{P}_e \pm z_{\alpha/2} \sqrt{\frac{\hat{P}_e(1-\hat{P}_e)}{n}}$$

where $z_{\alpha/2} = 1.96$ for a 95% confidence interval. For $P_e = 10^{-5}$ and a relative accuracy of 10%, this requires $n \approx 3.84 \times 10^7$, which is feasible but not cheap. This is why importance sampling and other variance reduction techniques are essential in practice.
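Where the $3.84 \times 10^7$ comes from: requiring the CI half-width to be at most $0.1\,P_e$ and solving for $n$,

$$z_{\alpha/2}\sqrt{\frac{P_e(1-P_e)}{n}} \leq 0.1\,P_e
\quad\Longleftrightarrow\quad
n \geq \frac{z_{\alpha/2}^2\,(1-P_e)}{(0.1)^2\,P_e} \approx \frac{(1.96)^2}{10^{-2}\cdot 10^{-5}} \approx 3.84 \times 10^{7}.$$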

Practical Constraints
  • For $P_e < 10^{-6}$, direct Monte Carlo becomes impractical ($> 10^9$ samples)
  • Importance sampling can reduce the required sample count by orders of magnitude

Quick Check

The Chebyshev-based proof of the WLLN requires which condition on the i.i.d. sequence?

  • Finite mean only
  • Finite variance
  • Finite fourth moment
  • The distribution must be continuous

Weak Law of Large Numbers

States that $\bar{X}_n \xrightarrow{P} \mu$ for i.i.d. $\{X_i\}$ with finite mean $\mu$. "Weak" refers to convergence in probability, as opposed to the "strong" law, which gives almost sure convergence.

Related: Convergence in Probability, Strong Law of Large Numbers

Common Mistake: Chebyshev's Bound Is Loose; Do Not Use It for Design

Mistake:

Using the WLLN's Chebyshev bound $\mathbb{P}(|\bar{X}_n - \mu| \geq \epsilon) \leq \sigma^2/(n\epsilon^2)$ to determine the required sample size in a real system.

Correction:

The Chebyshev bound is distribution-free and therefore very conservative. For system design, use the CLT normal approximation (Section 11.4) or, for small error probabilities, the Chernoff/Hoeffding exponential bounds (FSP Ch. 9). The Chebyshev bound is a proof tool, not a design tool.
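For a concrete sense of the gap, take the biased-coin example above ($p = 0.3$, $\epsilon = 0.01$, $\delta = 0.05$); the Hoeffding figure assumes the standard bound $2e^{-2n\epsilon^2}$ for $[0,1]$-valued variables:

  • Chebyshev: $n \geq p(1-p)/(\epsilon^2\delta) = 42{,}000$
  • Hoeffding: $n \geq \ln(2/\delta)/(2\epsilon^2) \approx 18{,}445$
  • CLT approximation: $n \approx z_{\alpha/2}^2\, p(1-p)/\epsilon^2 \approx 8{,}100$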