Ferkans — Interactive Telecom Tutor

From Ensemble to Time Averages

All the statistical quantities we have defined — mean, autocorrelation, variance — are ensemble averages: expectations over the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ . But in practice, we typically observe a single realization of the process (one received signal, one channel trace). How can we estimate $\mu = \mathbb{E}[X(t)]$ from a single sample path? The answer is ergodicity: under certain conditions, time averages computed from a single long realization converge to the ensemble averages. Ergodicity is the assumption that justifies virtually all practical estimation in communications.

Definition: Time Average

For a continuous-time process $\{X(t)\}$ , the time average over $[-T, T]$ is $\langle X \rangle_T = \frac{1}{2T}\int_{-T}^{T} X(t)\,dt.$

For a discrete-time process $\{X_n\}$ , the time average over $\{-N, \ldots, N\}$ is $\langle X \rangle_N = \frac{1}{2N+1}\sum_{n=-N}^{N} X_n.$
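The two definitions above translate directly into code. The following is a minimal sketch, assuming the samples $X_{-N}, \ldots, X_N$ are stored in a list and the continuous-time integral is approximated by a midpoint rule; the function names and the test sample path are illustrative choices, not part of the text.

```python
import math

def time_average_discrete(x):
    """<X>_N = (1/(2N+1)) * sum_{n=-N}^{N} X_n, with x holding the 2N+1 samples."""
    if len(x) % 2 == 0:
        raise ValueError("expected an odd number of samples, 2N+1")
    return sum(x) / len(x)

def time_average_continuous(path, T, steps=10_000):
    """Midpoint-rule approximation of (1/(2T)) * integral_{-T}^{T} path(t) dt."""
    dt = 2.0 * T / steps
    total = sum(path(-T + (k + 0.5) * dt) for k in range(steps))
    return total * dt / (2.0 * T)

# Hypothetical sample path: a constant offset of 3 plus a unit-frequency sinusoid.
def path(t):
    return 3.0 + math.cos(2.0 * math.pi * t)

avg_discrete = time_average_discrete([1.0, 2.0, 6.0])   # N = 1
avg_short = time_average_continuous(path, T=0.3)        # window shorter than one period
avg_long = time_average_continuous(path, T=50.0)        # window spanning 100 periods

print(avg_discrete, avg_short, avg_long)
```

For the short window the sinusoid still contributes to the average; over the long window it averages out and only the offset remains.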

Definition: Mean-Ergodic Process

A WSS process $\{X(t)\}$ is mean-ergodic if its time average converges to the ensemble mean in mean square: $\lim_{T \to \infty} \mathbb{E}\left[\left|\langle X \rangle_T - \mu\right|^2\right] = 0,$ i.e., $\langle X \rangle_T \xrightarrow{\text{m.s.}} \mu$ .

Mean-ergodicity says that a single long observation suffices to estimate the mean. This is weaker than full ergodicity (where time averages of all functions of the process converge to ensemble averages), but it is the most practically important case.

Theorem: Condition for Mean-Ergodicity

A WSS process $\{X(t)\}$ with autocovariance $c_{XX}(\tau)$ is mean-ergodic if and only if $\lim_{T \to \infty} \frac{1}{2T}\int_{-2T}^{2T}\left(1 - \frac{|\tau|}{2T}\right)c_{XX}(\tau)\,d\tau = 0.$

Sufficient condition: $\{X(t)\}$ is mean-ergodic if $\lim_{|\tau| \to \infty} c_{XX}(\tau) = 0.$

For discrete-time: $\{X_n\}$ is mean-ergodic if its autocovariance sequence satisfies $\lim_{|k| \to \infty} c_{XX}[k] = 0$ .

The condition $c_{XX}(\tau) \to 0$ as $|\tau| \to \infty$ means that the process "forgets" its past — distant samples are approximately uncorrelated. This ensures that the time average $\langle X \rangle_T$ averages over effectively independent observations and converges to $\mu$ by a law-of-large-numbers-type argument.

Proof

Compute the variance of the time average

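$\text{Var}(\langle X \rangle_T) = \mathbb{E}\left[\left|\langle X \rangle_T - \mu\right|^2\right] = \frac{1}{(2T)^2}\int_{-T}^{T}\int_{-T}^{T} c_{XX}(t_1 - t_2)\,dt_1\,dt_2.$

Substituting $\tau = t_1 - t_2$ and integrating out the remaining variable (the line $t_1 - t_2 = \tau$ crosses the square $[-T, T]^2$ over a range of length $2T - |\tau|$ ) gives

$\text{Var}(\langle X \rangle_T) = \frac{1}{2T}\int_{-2T}^{2T}\left(1 - \frac{|\tau|}{2T}\right)c_{XX}(\tau)\,d\tau.$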

Apply the sufficient condition

If $c_{XX}(\tau) \to 0$ as $|\tau| \to \infty$ , then for any $\epsilon > 0$ there exists $\tau_0$ such that $|c_{XX}(\tau)| < \epsilon$ for $|\tau| > \tau_0$ . Split the integral into $|\tau| \leq \tau_0$ and $|\tau| > \tau_0$ . The first part is $O(1/T) \to 0$ , and the second part is bounded by $\epsilon$ . Since $\epsilon$ is arbitrary, $\text{Var}(\langle X \rangle_T) \to 0$ .
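The reduction from the double integral to the triangular-weighted single integral in the first step of the proof can be checked numerically. The sketch below assumes a test autocovariance $c_{XX}(\tau) = e^{-|\tau|}$ and uses midpoint-rule quadrature; both are illustrative choices, and the function names are hypothetical.

```python
import math

def var_double_integral(c, T, n=400):
    """(1/(2T)^2) * double integral of c(t1 - t2) over [-T, T]^2 (midpoint rule)."""
    dt = 2.0 * T / n
    pts = [-T + (k + 0.5) * dt for k in range(n)]
    total = sum(c(t1 - t2) for t1 in pts for t2 in pts)
    return total * dt * dt / (2.0 * T) ** 2

def var_single_integral(c, T, n=4000):
    """(1/(2T)) * integral_{-2T}^{2T} (1 - |tau|/(2T)) c(tau) dtau (midpoint rule)."""
    dtau = 4.0 * T / n
    total = 0.0
    for k in range(n):
        tau = -2.0 * T + (k + 0.5) * dtau
        total += (1.0 - abs(tau) / (2.0 * T)) * c(tau)
    return total * dtau / (2.0 * T)

# Assumed test autocovariance (not from the text): unit variance, unit decay rate.
def c_exp(tau):
    return math.exp(-abs(tau))

results = {T: (var_double_integral(c_exp, T), var_single_integral(c_exp, T))
           for T in (1.0, 5.0)}
for T, (double_val, single_val) in results.items():
    print(f"T={T}: double integral={double_val:.5f}, single integral={single_val:.5f}")
```

The two columns should agree up to quadrature error, and both shrink as $T$ grows, as the sufficient condition predicts for a decaying autocovariance.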

Example: Mean-Ergodicity of an Exponentially Correlated Process

Let $\{X(t)\}$ be a zero-mean WSS process with $c_{XX}(\tau) = \sigma^2 e^{-\alpha|\tau|}$ , $\alpha > 0$ . Is $\{X(t)\}$ mean-ergodic?

Solution

Check the sufficient condition

$\lim_{|\tau| \to \infty} c_{XX}(\tau) = \lim_{|\tau| \to \infty} \sigma^2 e^{-\alpha|\tau|} = 0$ . The sufficient condition is satisfied, so $\{X(t)\}$ is mean-ergodic.

Compute the rate of convergence

$\text{Var}(\langle X \rangle_T) = \frac{1}{2T}\int_{-2T}^{2T}\left(1 - \frac{|\tau|}{2T}\right)\sigma^2 e^{-\alpha|\tau|}\,d\tau.$

For $T \gg 1/\alpha$ , the triangular weight is close to 1 wherever the exponential is non-negligible, and the limits of integration can be extended to $\pm\infty$ :

$\text{Var}(\langle X \rangle_T) \approx \frac{\sigma^2}{2T}\int_{-\infty}^{\infty} e^{-\alpha|\tau|}\,d\tau = \frac{\sigma^2}{2T}\cdot\frac{2}{\alpha} = \frac{\sigma^2}{\alpha T}.$

The variance of the time-average estimate decreases as $1/T$ .
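To see how good the large-$T$ approximation is, the integral can be carried out in closed form. The expression used below, $\sigma^2/(\alpha T) - \sigma^2 (1 - e^{-2\alpha T})/(2\alpha^2 T^2)$ , is obtained by integrating the triangular-weighted exponential directly; it is not stated in the text, so treat it as a worked check rather than a quoted result. The values of $\sigma^2$ and $\alpha$ are arbitrary.

```python
import math

def var_exact(sigma2, alpha, T):
    """Closed form of (1/(2T)) * int_{-2T}^{2T} (1 - |tau|/(2T)) * sigma2 * exp(-alpha*|tau|) dtau."""
    aT = alpha * T
    return sigma2 / aT - sigma2 * (1.0 - math.exp(-2.0 * aT)) / (2.0 * aT * aT)

def var_approx(sigma2, alpha, T):
    """Large-T approximation sigma2 / (alpha * T) from the example."""
    return sigma2 / (alpha * T)

sigma2, alpha = 1.0, 2.0   # arbitrary illustrative values
for T in (0.5, 5.0, 50.0, 500.0):
    exact = var_exact(sigma2, alpha, T)
    approx = var_approx(sigma2, alpha, T)
    print(f"T={T:6.1f}  exact={exact:.6f}  approx={approx:.6f}  ratio={exact / approx:.4f}")
```

The ratio approaches 1 once $T$ is many multiples of $1/\alpha$ ; for windows comparable to $1/\alpha$ the approximation overstates the variance.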

Example: A Non-Ergodic WSS Process

Let $X(t) = A$ for all $t$ , where $A$ is a random variable with $\mathbb{E}[A] = 0$ and $\text{Var}(A) = \sigma^2$ . Show that $\{X(t)\}$ is WSS but not mean-ergodic.

Solution

Verify WSS

$\mu_X(t) = \mathbb{E}[A] = 0$ (constant). $r_{XX}(\tau) = \mathbb{E}[A^2] = \sigma^2$ (the same for every $\tau$ and independent of $t$ ), and since the mean is zero, $c_{XX}(\tau) = r_{XX}(\tau) = \sigma^2$ . So $\{X(t)\}$ is WSS.

Compute the time average

$\langle X \rangle_T = \frac{1}{2T}\int_{-T}^{T} A\,dt = A$ for every $T$ . The time average equals $A$ , not $\mu = 0$ .

Check convergence

$\mathbb{E}[|\langle X \rangle_T - \mu|^2] = \mathbb{E}[A^2] = \sigma^2 > 0$ for all $T$ . The variance does not go to zero, so $\{X(t)\}$ is not mean-ergodic.

Why ergodicity fails

The autocovariance $c_{XX}(\tau) = \sigma^2$ does not decay to zero — the process has "infinite memory" (every value equals $A$ ). Different realizations produce different constant sample paths, and averaging over time cannot recover the ensemble mean from any one of them.
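A small Monte Carlo run makes the failure concrete. This is a sketch under the assumption that $A$ is Gaussian with standard deviation 2 (the example only fixes the mean and variance); the observation lengths and trial count are arbitrary.

```python
import random

rng = random.Random(0)
SIGMA = 2.0   # assumed standard deviation of A, so Var(A) = 4

def time_average_of_constant_path(a, n_samples):
    """Time average of a path whose every sample equals a."""
    path = [a] * n_samples
    return sum(path) / n_samples

def mean_square_error(n_samples, trials=2000):
    """Monte Carlo estimate of E[|<X>_T - mu|^2] with mu = 0, over independent draws of A."""
    total = 0.0
    for _ in range(trials):
        a = rng.gauss(0.0, SIGMA)   # Gaussian A is an assumption of this sketch
        total += time_average_of_constant_path(a, n_samples) ** 2
    return total / trials

mse_by_length = {n: mean_square_error(n) for n in (1, 10, 100, 1000)}
for n, mse in mse_by_length.items():
    print(f"observation length {n:5d}: mean-square error ~ {mse:.3f}")
```

The estimated mean-square error stays near $\sigma^2 = 4$ for every observation length, up to Monte Carlo fluctuation: a longer window adds no information.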

Ergodic Time-Average Convergence

Watch the running time average $\langle X \rangle_T$ converge (or not) to the ensemble mean as $T$ increases. Compare an ergodic process (exponential ACF) with a non-ergodic one (constant random variable).

Parameters

Process type; observation length N = 1000; number of realizations = 5; random seed = 42.
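For readers without access to the interactive demo, the comparison can be reproduced offline. The sketch below is not the demo's actual code: it assumes a discrete-time Gaussian AR(1) process with coefficient 0.9 as the process with exponentially decaying ACF, a standard normal $A$ for the constant process, and a one-sided running average over the first $k$ samples. Only the observation length, realization count, and seed are taken from the parameter panel.

```python
import math
import random

def ar1_path(n, rho, rng):
    """Zero-mean, unit-variance Gaussian AR(1) path; autocovariance rho**|k| decays to zero."""
    x = rng.gauss(0.0, 1.0)
    path = [x]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(n - 1):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def constant_path(n, rng):
    """Non-ergodic path X_n = A, with A drawn once per realization."""
    a = rng.gauss(0.0, 1.0)
    return [a] * n

def running_average(path):
    """Running averages (1/k) * (sum of the first k samples), for k = 1..len(path)."""
    averages, total = [], 0.0
    for k, value in enumerate(path, start=1):
        total += value
        averages.append(total / k)
    return averages

N, REALIZATIONS, SEED = 1000, 5, 42   # values shown in the demo's parameter panel
RHO = 0.9                             # hypothetical AR(1) coefficient
rng = random.Random(SEED)

ergodic_runs = [running_average(ar1_path(N, RHO, rng)) for _ in range(REALIZATIONS)]
constant_runs = [running_average(constant_path(N, rng)) for _ in range(REALIZATIONS)]

for label, runs in (("ergodic AR(1)", ergodic_runs), ("non-ergodic X=A", constant_runs)):
    for i, run in enumerate(runs):
        print(f"{label:16s} realization {i}: avg@10={run[9]:+.3f}  avg@{N}={run[-1]:+.3f}")
```

The AR(1) running averages drift toward the ensemble mean 0, while each constant path stays at its own value of $A$ regardless of the observation length.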

Definition: Correlation-Ergodic Process

A WSS process $\{X(t)\}$ is correlation-ergodic if the time-averaged product converges to the autocorrelation: $\frac{1}{2T}\int_{-T}^{T} X(t+\tau)X^*(t)\,dt \xrightarrow{\text{m.s.}} r_{XX}(\tau)$ as $T \to \infty$ , for every $\tau$ .

A sufficient condition (for Gaussian WSS processes) is $c_{XX}(\tau) \to 0$ as $|\tau| \to \infty$ .
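The discrete-time analogue of the time-averaged product is easy to try on a single simulated realization. The sketch below assumes a real, zero-mean, unit-variance Gaussian AR(1) sequence with coefficient $\rho = 0.8$ , whose autocorrelation is $\rho^{|k|}$ ; the process model, sample size, and function names are illustrative assumptions.

```python
import math
import random

def ar1_path(n, rho, rng):
    """Zero-mean, unit-variance Gaussian AR(1) path with autocorrelation rho**|k|."""
    x = rng.gauss(0.0, 1.0)
    path = [x]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(n - 1):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def time_averaged_product(x, lag):
    """(1/(N - lag)) * sum_n x[n + lag] * x[n] for a real-valued sequence x."""
    m = len(x) - lag
    return sum(x[n + lag] * x[n] for n in range(m)) / m

rng = random.Random(1)
RHO = 0.8                       # hypothetical AR(1) coefficient
x = ar1_path(50_000, RHO, rng)  # one single long realization

estimates = {lag: time_averaged_product(x, lag) for lag in range(4)}
for lag, est in estimates.items():
    print(f"lag {lag}: time-averaged product={est:.3f}  ensemble r_XX={RHO ** lag:.3f}")
```

Because the autocovariance of this Gaussian sequence decays to zero, the time-averaged products from one realization land close to the ensemble autocorrelation at each lag.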

Theorem: Ergodic Theorem for Stationary Processes

If $\{X(t)\}$ is strict-sense stationary and $\mathbb{E}[|X(t)|] < \infty$ , then $\frac{1}{2T}\int_{-T}^{T} g(X(t))\,dt \xrightarrow{\text{a.s.}} \mathbb{E}[g(X(0))]$ as $T \to \infty$ , for every measurable function $g$ with $\mathbb{E}[|g(X(0))|] < \infty$ , provided the process is ergodic (the shift-invariant $\sigma$ -algebra is trivial).

For discrete-time: $\frac{1}{2N+1}\sum_{n=-N}^{N} g(X_n) \xrightarrow{\text{a.s.}} \mathbb{E}[g(X_0)]$ .

This is the Birkhoff ergodic theorem applied to stationary processes. It says that time averages of any integrable function of the process converge to ensemble averages, provided the process is ergodic in the measure-theoretic sense. Taking $g(x) = x$ gives the almost-sure counterpart of mean-ergodicity, which was defined above in the mean-square sense.

Proof

Proof sketch

The full proof uses the Birkhoff (pointwise) ergodic theorem from ergodic theory, which is beyond our scope. The key idea: define the time-shift operator $T_\tau : X(t) \mapsto X(t + \tau)$ . Stationarity means $T_\tau$ is measure-preserving. The ergodic theorem says that for measure-preserving transformations with trivial invariant sets, time averages converge a.s. to spatial (ensemble) averages.

Practical Implications of Ergodicity

In communications, ergodicity has far-reaching consequences:

Channel estimation: We can estimate the channel's mean power and autocorrelation from a single long observation, rather than requiring an ensemble of independent channel realizations.
Bit error rate: The long-run fraction of erroneous bits in a single long transmission converges to the probability of bit error (BER).
Capacity: The achievable rate on a stationary ergodic channel converges to the ergodic capacity $\mathbb{E}[\log(1 + \text{SNR})]$ , averaged over the fading distribution.
Estimation: Sample mean and sample autocorrelation are consistent estimators of $\mu$ and $r_{XX}(\tau)$ .
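The bit-error-rate item can be illustrated with a short simulation. This is a sketch assuming uncoded BPSK over an AWGN channel with i.i.d. noise (a simple stationary ergodic case), for which the bit error probability is $\frac{1}{2}\,\text{erfc}(\sqrt{E_b/N_0})$ ; the modulation, the 4 dB operating point, and the function names are assumptions chosen for illustration.

```python
import math
import random

def bpsk_ber_theory(ebn0_db):
    """Bit error probability of BPSK over AWGN: 0.5 * erfc(sqrt(Eb/N0))."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))

def bpsk_error_fraction(n_bits, ebn0_db, rng):
    """Fraction of bit errors observed in one simulated transmission of n_bits bits."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    noise_std = math.sqrt(1.0 / (2.0 * ebn0))   # Eb = 1, noise variance N0/2
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0
        received = symbol + rng.gauss(0.0, noise_std)
        decided = 1 if received > 0.0 else 0
        errors += decided != bit
    return errors / n_bits

rng = random.Random(7)
EBN0_DB = 4.0   # hypothetical operating point
theory = bpsk_ber_theory(EBN0_DB)
fractions = {n: bpsk_error_fraction(n, EBN0_DB, rng) for n in (100, 10_000, 200_000)}

for n, frac in fractions.items():
    print(f"{n:7d} bits: error fraction={frac:.5f}  (ensemble BER={theory:.5f})")
```

Short transmissions give noisy error fractions; the longest run should sit close to the ensemble value, which is the time-average-to-ensemble-average statement in the list above.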

Quick Check

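A WSS process has autocovariance $c_{XX}(\tau) = \sigma^2 \cos(2\pi f_0 \tau)$ with $f_0 \neq 0$ . Does it satisfy the sufficient condition $c_{XX}(\tau) \to 0$ for mean-ergodicity?

Yes, because $c_{XX}(\tau)$ is bounded

No, because $c_{XX}(\tau)$ does not decay to zero

Yes, because the process is WSS

Correction:

No, because $c_{XX}(\tau)$ does not decay to zero

$c_{XX}(\tau) = \sigma^2 \cos(2\pi f_0 \tau)$ oscillates forever and does not converge to zero, so the sufficient condition fails. This alone does not settle mean-ergodicity, because the condition is only sufficient. Evaluating the necessary and sufficient condition gives $\text{Var}(\langle X \rangle_T) = \sigma^2 \left(\frac{\sin(2\pi f_0 T)}{2\pi f_0 T}\right)^2 \to 0$ , so the process is in fact mean-ergodic: the oscillations average out over a long window.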

Common Mistake: Assuming Ergodicity Without Verification

Mistake:

Treating every stationary process as ergodic and equating time averages with ensemble averages without checking conditions.

Correction:

Ergodicity requires conditions beyond stationarity. For mean-ergodicity, decay of the autocovariance to zero is a convenient sufficient condition, and the windowed-integral condition in the theorem above is necessary and sufficient. A stationary process with persistent correlations (e.g., $X(t) = A$ with random $A$ ) is not ergodic. Always check one of these conditions before equating time averages with ensemble averages.

Common Mistake: Confusing Ergodicity with Finite-Sample Accuracy

Mistake:

Believing that ergodicity guarantees accurate estimates from short observations.

Correction:

Ergodicity guarantees convergence as $T \to \infty$ . For finite $T$ , the time-average estimate has variance $\text{Var}(\langle X \rangle_T) \approx \frac{1}{2T}\int_{-\infty}^{\infty} c_{XX}(\tau)\,d\tau$ , which may be large if the correlation decays slowly. For a nonnegative autocovariance this equals $\sigma^2 \tau_{\text{corr}} / T$ with $\sigma^2 = c_{XX}(0)$ , so the "effective number of independent samples" in the window $[-T, T]$ is approximately $T / \tau_{\text{corr}}$ , where $\tau_{\text{corr}} = \int_0^\infty |c_{XX}(\tau)|/c_{XX}(0)\,d\tau$ is the correlation time.

⚠️ Engineering Note

Correlation Time and Estimation Quality

The correlation time $\tau_c = \int_0^\infty |c_{XX}(\tau)| / c_{XX}(0)\,d\tau$ determines how many "effectively independent" samples the observation window $[-T, T]$ contains: roughly $T / \tau_c$ (window length $2T$ divided by the two-sided correlation width $2\tau_c$ ). For estimating the mean from a single realization over this window, the variance of the estimate scales as $\sigma^2 \tau_c / T$ .

In wireless channel estimation, the coherence time $T_c$ plays the role of $\tau_c$ . A pilot-based estimator using $N_p$ pilot symbols spaced at $T_s \ll T_c$ has effective degrees of freedom approximately $N_p T_s / T_c$ , not $N_p$ .
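These rules of thumb reduce to a few lines of arithmetic. The sketch below computes the correlation time by numerical integration for an assumed exponential autocovariance, applies the $\sigma^2 \tau_c / T$ variance scaling, and evaluates the pilot degrees-of-freedom expression; all numerical values (decay rate, $T$ , pilot count, symbol spacing, coherence time) are hypothetical.

```python
import math

def correlation_time(c, upper, n=100_000):
    """Midpoint-rule approximation of int_0^upper |c(tau)| / c(0) dtau."""
    dtau = upper / n
    area = sum(abs(c((k + 0.5) * dtau)) for k in range(n)) * dtau
    return area / c(0.0)

# Assumed autocovariance: sigma2 * exp(-alpha * |tau|), so tau_c should be 1 / alpha.
alpha, sigma2 = 2.0, 1.0
def c_exp(tau):
    return sigma2 * math.exp(-alpha * abs(tau))

tau_c = correlation_time(c_exp, upper=40.0 / alpha)   # tail beyond this point is negligible

T = 50.0                                # hypothetical value of T
var_estimate = sigma2 * tau_c / T       # variance scaling sigma^2 * tau_c / T
n_eff = sigma2 / var_estimate           # effective independent samples, T / tau_c

# Pilot-based estimation with hypothetical numbers: pilots spaced well inside T_c.
n_pilots, t_s, t_c = 200, 0.1e-3, 2e-3
pilot_dof = n_pilots * t_s / t_c        # effective degrees of freedom, not n_pilots

print(f"tau_c={tau_c:.4f}  n_eff={n_eff:.1f}  pilot_dof={pilot_dof:.1f}")
```

With these numbers, 200 pilots are worth only about 10 independent looks at the channel, because 20 consecutive pilots fall inside one coherence time.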

Why This Matters: Ergodic Capacity of Fading Channels

For a stationary ergodic fading channel with gain $H(t)$ and noise power $N_0$ , the ergodic capacity is $C = \mathbb{E}[\log_2(1 + |H|^2 \cdot \text{SNR})]$ . Ergodicity guarantees that a single codeword spanning many fading realizations (i.e., codeword length $\gg$ coherence time) experiences all channel states, and the achievable rate equals this ensemble average. When the codeword length is comparable to the coherence time, we enter the outage regime, and ergodic capacity no longer applies — a fundamentally different analysis is needed.

See full treatment in Chapter 14
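The ergodic capacity expression can be evaluated by Monte Carlo averaging over the fading distribution. The sketch below assumes Rayleigh fading normalized so that $\mathbb{E}[|H|^2] = 1$ , which makes $|H|^2$ exponentially distributed with unit mean; the fading model and the 10 dB SNR are illustrative assumptions, not specified in the text.

```python
import math
import random

def ergodic_capacity_rayleigh(snr, n_samples, rng):
    """Monte Carlo estimate of E[log2(1 + |H|^2 * SNR)] with |H|^2 ~ Exponential(1)."""
    total = 0.0
    for _ in range(n_samples):
        gain = rng.expovariate(1.0)   # |H|^2 under the assumed Rayleigh model
        total += math.log2(1.0 + gain * snr)
    return total / n_samples

rng = random.Random(3)
SNR_DB = 10.0                          # hypothetical average SNR
snr = 10.0 ** (SNR_DB / 10.0)

c_ergodic = ergodic_capacity_rayleigh(snr, 200_000, rng)
c_awgn = math.log2(1.0 + snr)          # non-fading channel with the same average SNR

print(f"ergodic capacity ~ {c_ergodic:.3f} bit/s/Hz, AWGN capacity = {c_awgn:.3f} bit/s/Hz")
```

The average over many fading states is what a codeword much longer than the coherence time experiences. By Jensen's inequality it falls below the AWGN capacity at the same average SNR.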

Historical Note: Birkhoff's Ergodic Theorem

1931

George David Birkhoff proved his pointwise ergodic theorem in 1931, establishing that time averages of integrable functions converge almost surely for measure-preserving transformations. The term "ergodic" comes from the Greek ergon (work) and hodos (path), coined by Boltzmann in statistical mechanics to describe systems that visit all accessible states over time. Birkhoff's theorem provided the rigorous mathematical foundation for Boltzmann's physical intuition and, decades later, became the theoretical justification for estimating channel statistics from single observations in communications.

🎓 CommIT Contribution (1999)

Ergodic vs. Outage Capacity in Fading Channels

G. Caire, S. Shamai — IEEE Trans. Inform. Theory, vol. 45, no. 6, pp. 2007--2019

Caire and Shamai (1999) provided a unified framework for analyzing fading channels with various levels of channel state information at the transmitter and receiver. Their work clarified when ergodic capacity (the ensemble average $\mathbb{E}[\log(1 + \text{SNR}\cdot|H|^2)]$ ) applies versus when outage-based metrics are appropriate. The distinction hinges on whether the coding block length spans many independent fading realizations (ergodic regime) or few (quasi-static regime). This paper exemplifies how the abstract concept of ergodicity has direct, quantitative implications for system design.


Ergodic Process

A stationary process for which time averages converge to ensemble averages. Mean-ergodicity is guaranteed when the autocovariance decays to zero (a sufficient, not necessary, condition).

Time Average

$\langle X \rangle_T = \frac{1}{2T}\int_{-T}^{T} X(t)\,dt$ . The average of a single realization over a time window. Converges to the ensemble mean for ergodic processes.

Related: Ergodic Process

Key Takeaway

Ergodicity bridges the gap between mathematical expectation (over the ensemble) and practical estimation (from a single time series). A WSS process is mean-ergodic if its autocovariance decays to zero, a condition satisfied by most physical processes with finite memory. Without ergodicity, ensemble statistics cannot in general be recovered from a single realization, however long the observation.
