Modes of Convergence

Why Multiple Modes of Convergence?

In calculus, a sequence of numbers either converges or it does not. But a sequence of random variables can converge in several distinct senses, depending on how strictly we demand agreement between X_n and the limit X. The distinction is not pedantic: it determines which tools we can use and which conclusions we can draw. The Weak Law guarantees convergence in probability; the Strong Law upgrades this to almost sure convergence; the CLT delivers convergence in distribution. Each mode tells a different story about what happens as n \to \infty.

Definition:

Almost Sure Convergence

A sequence \{X_n\} converges to X almost surely (a.s.), written X_n \xrightarrow{\text{a.s.}} X, if

\mathbb{P}\!\left(\lim_{n \to \infty} X_n = X\right) = 1.

Equivalently, for every \epsilon > 0:

\mathbb{P}\!\left(\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} \{|X_k - X| \geq \epsilon\}\right) = 0.

The set of outcomes \omega where X_n(\omega) \not\to X(\omega) has probability zero.

Almost sure convergence is pathwise: for (almost) every realization of the random experiment, the sequence of numbers X_1(\omega), X_2(\omega), \ldots converges to X(\omega) in the ordinary calculus sense.
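To make the pathwise picture concrete, here is a minimal sketch (an illustration not from the text, assuming \Omega = [0,1] with Lebesgue measure and X_n(\omega) = \omega^n):

```python
import random

# Illustrative choice (not from the text): on Omega = [0, 1] with
# Lebesgue measure, take X_n(omega) = omega**n.  Every path with
# omega < 1 converges to 0 in the ordinary calculus sense; the only
# failure is at omega = 1, a single point of probability zero.
def X(n, omega):
    return omega ** n

random.seed(0)

# A typical path converges to the limit 0.
assert X(10_000, 0.99) < 1e-40

# The lone exceptional outcome omega = 1: the path never moves.
assert all(X(n, 1.0) == 1.0 for n in range(1, 100))

# Sampled outcomes land in [0, 1) with probability one.
assert all(random.random() < 1.0 for _ in range(1000))
```

The exceptional set is the single point \omega = 1, which has Lebesgue measure zero, so X_n \to 0 almost surely even though one path never converges.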

Definition:

Convergence in Probability

A sequence \{X_n\} converges to X in probability, written X_n \xrightarrow{P} X, if for every \epsilon > 0:

\lim_{n \to \infty} \mathbb{P}(|X_n - X| \geq \epsilon) = 0.

This says that the probability of a large deviation between X_n and X vanishes, but it does not preclude occasional excursions.
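As a sketch of the definition in action (fair coin flips are an assumed example, not from the text), one can estimate the deviation probability \mathbb{P}(|\bar{X}_n - 1/2| \geq \epsilon) by Monte Carlo and watch it shrink as n grows:

```python
import random

random.seed(1)

def deviation_prob(n, eps=0.1, trials=2000):
    """Monte Carlo estimate of P(|mean of n fair coin flips - 1/2| >= eps)."""
    count = 0
    for _ in range(trials):
        xbar = sum(random.random() < 0.5 for _ in range(n)) / n
        if abs(xbar - 0.5) >= eps:
            count += 1
    return count / trials

probs = [deviation_prob(n) for n in (10, 100, 1000)]

# The deviation probability shrinks toward 0 as n grows:
# that is exactly convergence in probability of the sample mean.
assert probs[0] > probs[2]
assert probs[2] < 0.01
```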

Definition:

Convergence in r-th Mean (L^r)

For r \geq 1, a sequence \{X_n\} converges to X in L^r, written X_n \xrightarrow{L^r} X, if

\lim_{n \to \infty} \mathbb{E}\!\left[|X_n - X|^r\right] = 0.

For r = 2, this is mean-square convergence: \mathbb{E}[(X_n - X)^2] \to 0.

L^r convergence controls the r-th moment of the deviation. Larger r gives a stronger mode (for s > r \geq 1, L^s convergence implies L^r convergence by Jensen's inequality), but it requires the relevant moments to exist.
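A deterministic sketch (assuming X_n(\omega) = \omega^n on [0,1] with Lebesgue measure, an illustrative choice not from the text): here the r-th moment of the deviation from 0 has the closed form \mathbb{E}|X_n|^r = 1/(rn+1), which tends to 0 for every r \geq 1.

```python
from fractions import Fraction

# For X_n(omega) = omega**n on [0, 1] with Lebesgue measure,
# E[|X_n - 0|**r] = integral_0^1 omega**(r*n) d(omega) = 1/(r*n + 1).
# This -> 0 for every fixed r >= 1, so X_n -> 0 in every L^r.
def lr_moment(n, r):
    """Exact E[X_n**r] for X_n(omega) = omega**n."""
    return Fraction(1, r * n + 1)

assert lr_moment(10, 2) == Fraction(1, 21)      # mean-square case r = 2
assert lr_moment(10**6, 2) < Fraction(1, 10**6)  # vanishes as n grows
```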

Definition:

Convergence in Distribution

A sequence \{X_n\} converges to X in distribution, written X_n \xrightarrow{d} X, if

\lim_{n \to \infty} F_{X_n}(x) = F_X(x)

at every point x where F_X is continuous.

Equivalently, by the Lévy continuity theorem: X_n \xrightarrow{d} X if and only if \phi_{X_n}(u) \to \phi_X(u) for all u \in \mathbb{R}, where \phi_Y denotes the characteristic function of Y.

Convergence in distribution is the weakest mode. It says nothing about individual realizations; it only requires that the CDFs align. The random variables need not even live on the same probability space.
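A sketch of convergence in distribution in action (using the CLT for Uniform(0,1) means, an assumed example not from the text): the empirical CDF of the standardized sample mean approaches the standard normal CDF \Phi pointwise.

```python
import bisect
import math
import random

random.seed(3)

def standardized_mean(n):
    """sqrt(n)*(Xbar_n - mu)/sigma for Uniform(0,1) draws (mu=1/2, sigma^2=1/12)."""
    xbar = sum(random.random() for _ in range(n)) / n
    return math.sqrt(n) * (xbar - 0.5) / math.sqrt(1 / 12)

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

samples = sorted(standardized_mean(30) for _ in range(20_000))

def ecdf(x):
    """Empirical CDF: fraction of samples <= x."""
    return bisect.bisect_right(samples, x) / len(samples)

# Convergence in distribution (here via the CLT): the empirical CDF of
# the standardized mean tracks Phi at every point (Phi is continuous
# everywhere, so every x is a continuity point).
for x in (-1.0, 0.0, 1.0):
    assert abs(ecdf(x) - Phi(x)) < 0.02
```

Note that this checks only the alignment of CDFs; it says nothing about the distance between individual realizations.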

Definition:

Sample Mean

Given a sequence X_1, X_2, \ldots of random variables, the sample mean (empirical average) is

\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i.

When the X_i are i.i.d. with mean \mu and variance \sigma^2, we have \mathbb{E}[\bar{X}_n] = \mu and \text{Var}(\bar{X}_n) = \sigma^2/n. The law of large numbers describes the sense in which \bar{X}_n converges to \mu.
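These two identities can be checked exactly for a small case (n = 2 fair dice, an illustrative choice not from the text) by enumerating all 36 equally likely outcomes with exact fractions:

```python
from fractions import Fraction
from itertools import product

# Exact check of E[Xbar_n] = mu and Var(Xbar_n) = sigma^2 / n for
# n = 2 fair dice, enumerating all 36 equally likely outcomes.
faces = range(1, 7)
mu = Fraction(sum(faces), 6)                               # 7/2
sigma2 = Fraction(sum(f * f for f in faces), 6) - mu ** 2  # 35/12

pairs = list(product(faces, repeat=2))
xbars = [Fraction(a + b, 2) for a, b in pairs]

e_xbar = sum(xbars, Fraction(0)) / len(xbars)
var_xbar = sum((x - e_xbar) ** 2 for x in xbars) / len(xbars)

assert e_xbar == mu            # E[Xbar_2] = mu
assert var_xbar == sigma2 / 2  # Var(Xbar_2) = sigma^2 / 2
```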

Theorem: Relationships Between Convergence Modes

The four modes of convergence satisfy the following implications:

  1. X_n \xrightarrow{\text{a.s.}} X \;\Longrightarrow\; X_n \xrightarrow{P} X
  2. X_n \xrightarrow{L^r} X \;\Longrightarrow\; X_n \xrightarrow{P} X (for any r \geq 1)
  3. X_n \xrightarrow{P} X \;\Longrightarrow\; X_n \xrightarrow{d} X

No other general implications hold. In particular:

  • Convergence in probability does not imply a.s. convergence.
  • Convergence in distribution does not imply convergence in probability.
  • L^r convergence and a.s. convergence are not comparable in general.

Exception: If X is a constant c, then X_n \xrightarrow{d} c implies X_n \xrightarrow{P} c.

Almost sure convergence controls every sample path; L^r controls the average size of the deviation; convergence in probability allows rare large excursions; convergence in distribution only matches the histograms. Moving down this list the guarantees get weaker, except that a.s. and L^r convergence are not directly comparable.

Four Modes of Convergence

| Mode | Notation | Definition | Requires same space? | Strength |
| --- | --- | --- | --- | --- |
| Almost sure | X_n \xrightarrow{\text{a.s.}} X | \mathbb{P}(\lim X_n = X) = 1 | Yes | Strong |
| In probability | X_n \xrightarrow{P} X | \mathbb{P}(\lvert X_n - X \rvert \geq \epsilon) \to 0 | Yes | Medium |
| L^r mean | X_n \xrightarrow{L^r} X | \mathbb{E}[\lvert X_n - X \rvert^r] \to 0 | Yes | Medium |
| In distribution | X_n \xrightarrow{d} X | F_{X_n}(x) \to F_X(x) at continuity points | No | Weak |

Example: Convergence in Probability but Not Almost Surely

Construct a sequence \{X_n\} that converges to 0 in probability but not almost surely. This shows that the implication a.s. \Rightarrow in probability cannot be reversed.

Example: L^r Convergence Without Almost Sure Convergence

Consider the "typewriter" sequence on [0,1] with Lebesgue measure: for n = 2^k + j with 0 \leq j < 2^k, let X_n = \mathbf{1}_{[j/2^k, (j+1)/2^k]}. Show that X_n \xrightarrow{L^1} 0 (indeed \mathbb{E}|X_n| = 2^{-k} \to 0), yet X_n(\omega) does not converge for any \omega, since every \omega is covered once per level k. By contrast, show that X_n = n \cdot \mathbf{1}_{[0, 1/n^2]} converges to 0 both in L^1 and almost surely.
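The standard "typewriter" construction (X_n is the indicator of the dyadic interval [j/2^k, (j+1)/2^k) for n = 2^k + j, 0 \leq j < 2^k) can be checked numerically; a sketch:

```python
def typewriter(n):
    """The n-th 'typewriter' interval: write n = 2**k + j with
    0 <= j < 2**k; X_n is 1 on the dyadic interval [j/2**k, (j+1)/2**k)."""
    k = n.bit_length() - 1
    j = n - 2 ** k
    return (j / 2 ** k, (j + 1) / 2 ** k)

def X(n, omega):
    a, b = typewriter(n)
    return 1.0 if a <= omega < b else 0.0

# L^1 convergence: E|X_n| is the interval width 2**-k, which -> 0.
for n in (1, 10, 1000):
    k = n.bit_length() - 1
    a, b = typewriter(n)
    assert abs((b - a) - 2 ** -k) < 1e-12

# No a.s. convergence: every omega is covered once per dyadic level,
# so X_n(omega) = 1 infinitely often and the path keeps oscillating.
omega = 0.3
hits = [n for n in range(1, 2 ** 12) if X(n, omega) == 1.0]
assert len(hits) == 12  # one hit at each level k = 0, 1, ..., 11
```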

Convergence Modes: A Visual Comparison

Compare different sequences that illustrate convergence in probability vs. almost sure convergence. Each trajectory is a single realization.


Common Mistake: Convergence in Distribution \neq Convergence in Probability

Mistake:

Assuming that X_n \xrightarrow{d} X means X_n is "close to" X in some probabilistic sense, and using this to conclude statements about |X_n - X|.

Correction:

Convergence in distribution says only that the CDFs agree in the limit; the random variables need not even be defined on the same probability space. You cannot write \mathbb{P}(|X_n - X| > \epsilon) unless they share a common space.

Exception: If the limit X = c is a constant, then convergence in distribution does imply convergence in probability: F_{X_n}(x) \to \mathbf{1}_{x \geq c} forces all the probability mass to collapse to c.
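The collapse can be made precise in a short derivation using only the definition: both c - \epsilon and c + \epsilon/2 are continuity points of F_c(x) = \mathbf{1}_{x \geq c}, whose only jump is at x = c, so

```latex
\begin{align*}
\mathbb{P}(|X_n - c| \geq \epsilon)
  &\leq \mathbb{P}(X_n \leq c - \epsilon)
      + \mathbb{P}\!\left(X_n > c + \tfrac{\epsilon}{2}\right) \\
  &\leq F_{X_n}(c - \epsilon) + 1 - F_{X_n}\!\left(c + \tfrac{\epsilon}{2}\right)
  \longrightarrow 0 + 1 - 1 = 0.
\end{align*}
```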

Common Mistake: A.S. Convergence Is Not Pointwise Convergence Everywhere

Mistake:

Interpreting X_n \xrightarrow{\text{a.s.}} X as "X_n(\omega) \to X(\omega) for every \omega \in \Omega." This would be sure convergence, which is strictly stronger.

Correction:

Almost sure convergence allows a set N \subset \Omega with \mathbb{P}(N) = 0 where convergence fails. The word "almost" is doing essential work: there may be exceptional outcomes, but collectively they have zero probability.

Quick Check

If X_n \xrightarrow{L^2} X, which of the following is guaranteed?

  • X_n \xrightarrow{\text{a.s.}} X
  • X_n \xrightarrow{P} X
  • X_n \xrightarrow{L^3} X
  • None of the above

Historical Note: The Long Road to Clarifying Convergence

1909–1933

The distinction between convergence in probability and almost sure convergence was not immediately clear in the early development of probability theory. Émile Borel (1909) and Francesco Cantelli (1917) established the lemmas that connect these concepts. The full taxonomy of convergence modes was systematized by Andrei Kolmogorov in his Grundbegriffe der Wahrscheinlichkeitsrechnung (1933), which placed probability on a rigorous measure-theoretic foundation. It was only after Kolmogorov that the subtle differences between the modes, and the counterexamples showing they are genuinely distinct, became standard textbook material.

Almost Sure Convergence

X_n \xrightarrow{\text{a.s.}} X means \mathbb{P}(\lim_{n \to \infty} X_n = X) = 1. The sequence converges pathwise except on a set of probability zero.

Related: Convergence in Probability, Convergence in Distribution

Convergence in Probability

X_n \xrightarrow{P} X means \mathbb{P}(|X_n - X| \geq \epsilon) \to 0 for every \epsilon > 0. The probability of large deviations vanishes, but occasional excursions are allowed.

Related: Almost Sure Convergence, Convergence in Distribution

Convergence in Distribution

X_n \xrightarrow{d} X means F_{X_n}(x) \to F_X(x) at every continuity point of F_X. The weakest mode of convergence: only the shape of the distribution converges.

Related: Convergence in Probability, Almost Sure Convergence

Key Takeaway

The convergence hierarchy is: almost sure \Rightarrow in probability \Rightarrow in distribution, with L^r convergence also implying convergence in probability. The reverse implications fail in general, except that convergence in distribution to a constant upgrades to convergence in probability.