Ergodicity
From Ensemble to Time Averages
All the statistical quantities we have defined (mean, autocorrelation, variance) are ensemble averages: expectations over the underlying probability space. But in practice, we typically observe a single realization of the process (one received signal, one channel trace). How can we estimate these quantities from a single sample path? The answer is ergodicity: under certain conditions, time averages computed from a single long realization converge to the ensemble averages. Ergodicity is the assumption that justifies virtually all practical estimation in communications.
Definition: Time Average
Time Average
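For a continuous-time process $X(t)$, the time average over $[0, T]$ is $\langle X \rangle_T = \frac{1}{T}\int_0^T X(t)\,dt$.
For a discrete-time process $X[n]$, the time average over $n = 0, 1, \dots, N-1$ is $\langle X \rangle_N = \frac{1}{N}\sum_{n=0}^{N-1} X[n]$.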
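As a minimal sketch of the discrete-time definition, the following Python computes the average of $N$ samples of one realization, together with the running average after each new sample. The function names and the sample values are illustrative assumptions, not part of the original text.

```python
def time_average(samples):
    """Discrete-time average: (1/N) * sum of x[n] for n = 0..N-1."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    return sum(samples) / n


def running_time_average(samples):
    """Time average after 1, 2, ..., N samples of the same realization."""
    averages = []
    total = 0.0
    for count, value in enumerate(samples, start=1):
        total += value
        averages.append(total / count)
    return averages


path = [2.0, 4.0, 6.0, 8.0]            # one hypothetical realization
print(time_average(path))              # 5.0
print(running_time_average(path))      # [2.0, 3.0, 4.0, 5.0]
```

The running version is the quantity whose convergence is at stake in the rest of the section.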
Definition: Mean-Ergodic Process
Mean-Ergodic Process
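A WSS process $X(t)$ with mean $\mu_X$ is mean-ergodic if its time average $\langle X \rangle_T$ converges to the ensemble mean in mean square: $\langle X \rangle_T \to \mu_X$ as $T \to \infty$, i.e., $\lim_{T\to\infty} \mathbb{E}\left[|\langle X \rangle_T - \mu_X|^2\right] = 0$.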
Mean-ergodicity says that a single long observation suffices to estimate the mean. This is weaker than full ergodicity (where time averages of all functions of the process converge to ensemble averages), but it is the most practically important case.
Theorem: Condition for Mean-Ergodicity
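A WSS process $X(t)$ with autocovariance $C_X(\tau)$ is mean-ergodic if and only if $\lim_{T\to\infty} \frac{1}{T}\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right) C_X(\tau)\,d\tau = 0$.
Sufficient condition: $X(t)$ is mean-ergodic if $C_X(\tau) \to 0$ as $|\tau| \to \infty$.
For discrete-time: $X[n]$ is mean-ergodic if $C_X[k] \to 0$ as $|k| \to \infty$.
The condition $C_X(\tau) \to 0$ as $|\tau| \to \infty$ means that the process "forgets" its past: distant samples are approximately uncorrelated. This ensures that the time average averages over effectively independent observations and converges to $\mu_X$ by a law-of-large-numbers-type argument.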
Compute the variance of the time average
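$$\mathrm{Var}(\langle X \rangle_T) = \frac{1}{T^2}\int_0^T\!\!\int_0^T C_X(t-s)\,dt\,ds = \frac{1}{T}\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right) C_X(\tau)\,d\tau$$
Apply the sufficient condition
If $C_X(\tau) \to 0$ as $|\tau| \to \infty$, then for any $\epsilon > 0$ there exists $\tau_0$ such that $|C_X(\tau)| < \epsilon$ for $|\tau| > \tau_0$. Split the integral into $|\tau| \le \tau_0$ and $\tau_0 < |\tau| \le T$. Since $|C_X(\tau)| \le C_X(0)$, the first part is at most $2\tau_0 C_X(0)/T = O(1/T)$, and the second part is bounded by $\epsilon$. Since $\epsilon$ is arbitrary, $\mathrm{Var}(\langle X \rangle_T) \to 0$ as $T \to \infty$.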
Example: Mean-Ergodicity of an Exponentially Correlated Process
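Let $X(t)$ be a zero-mean WSS process with $C_X(\tau) = \sigma^2 e^{-\alpha|\tau|}$, $\alpha > 0$. Is $X(t)$ mean-ergodic?
Check the sufficient condition
$C_X(\tau) = \sigma^2 e^{-\alpha|\tau|} \to 0$ as $|\tau| \to \infty$. The sufficient condition is satisfied, so $X(t)$ is mean-ergodic.
Compute the rate of convergence
$$\mathrm{Var}(\langle X \rangle_T) = \frac{1}{T}\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right)\sigma^2 e^{-\alpha|\tau|}\,d\tau = \frac{2\sigma^2}{\alpha T} - \frac{2\sigma^2}{\alpha^2 T^2}\left(1 - e^{-\alpha T}\right) \approx \frac{2\sigma^2}{\alpha T} \quad \text{for } \alpha T \gg 1.$$
The variance of the time average decays as $1/T$.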
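The $1/T$ rate can be checked numerically. The sketch below assumes an exponential autocovariance of the form $\sigma^2 e^{-\alpha|\tau|}$ and a time average taken over $[0, T]$, for which the variance is $\frac{1}{T}\int_{-T}^{T}(1 - |\tau|/T)\,C_X(\tau)\,d\tau$. The parameter values ($\sigma^2 = 1$, $\alpha = 2$), the function names, and the midpoint-rule integration are illustrative choices, not taken from the original text.

```python
import math


def autocov(tau, sigma2=1.0, alpha=2.0):
    """Assumed exponential autocovariance sigma2 * exp(-alpha * |tau|)."""
    return sigma2 * math.exp(-alpha * abs(tau))


def var_time_average_numeric(T, sigma2=1.0, alpha=2.0, steps=200_000):
    """Midpoint rule for (1/T) * int_{-T}^{T} (1 - |tau|/T) C(tau) dtau."""
    h = 2.0 * T / steps
    total = 0.0
    for k in range(steps):
        tau = -T + (k + 0.5) * h
        total += (1.0 - abs(tau) / T) * autocov(tau, sigma2, alpha)
    return total * h / T


def var_time_average_closed(T, sigma2=1.0, alpha=2.0):
    """Closed form of the same integral for the exponential autocovariance."""
    a_t = alpha * T
    return 2.0 * sigma2 / a_t - 2.0 * sigma2 * (1.0 - math.exp(-a_t)) / a_t**2


for T in (1.0, 10.0, 100.0):
    numeric = var_time_average_numeric(T)
    closed = var_time_average_closed(T)
    leading = 2.0 * 1.0 / (2.0 * T)    # 2*sigma2/(alpha*T): the 1/T term
    print(T, numeric, closed, leading)
```

For large $\alpha T$ the closed form and the leading term $2\sigma^2/(\alpha T)$ should nearly coincide, which is the $1/T$ decay in question.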
Example: A Non-Ergodic WSS Process
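Let $X(t) = A$ for all $t$, where $A$ is a random variable with $\mathbb{E}[A] = \mu_A$ and $\mathrm{Var}(A) = \sigma_A^2 > 0$. Show that $X(t)$ is WSS but not mean-ergodic.
Verify WSS
$\mathbb{E}[X(t)] = \mu_A$ (constant). $C_X(t, t+\tau) = \mathrm{Var}(A) = \sigma_A^2$ (constant, depends trivially on $\tau$). So $X(t)$ is WSS.
Compute the time average
$\langle X \rangle_T = \frac{1}{T}\int_0^T A\,dt = A$ for every $T$. The time average equals $A$, not $\mu_A$.
Check convergence
$\mathrm{Var}(\langle X \rangle_T) = \mathrm{Var}(A) = \sigma_A^2 > 0$ for all $T$. The variance does not go to zero, so $X(t)$ is not mean-ergodic.
Why ergodicity fails
The autocovariance $C_X(\tau) = \sigma_A^2$ does not decay to zero: the process has "infinite memory" (every value equals $A$). Different realizations produce different constant sample paths, and averaging over time cannot recover the ensemble mean from any one of them.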
Ergodic Time-Average Convergence
Watch the running time average converge (or not) to the ensemble mean as the observation length $T$ increases. Compare an ergodic process (exponential ACF) with a non-ergodic one (constant random variable).
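The comparison can be reproduced offline with a short simulation. The sketch below is one possible setup, not the original interactive demo: it uses a discrete-time AR(1) process (autocovariance $\rho^{|k|}$, a discrete analogue of the exponential ACF) as the ergodic case and $X[n] = A$ with $A \sim \mathcal{N}(0, 1)$ as the non-ergodic case. The values of `rho`, `n_samples`, `n_paths`, and the seed are arbitrary assumptions.

```python
import math
import random


def ar1_path(n_samples, rho, rng):
    """Zero-mean, unit-variance AR(1) path with autocovariance rho**|k|."""
    x = rng.gauss(0.0, 1.0)              # start in the stationary distribution
    scale = math.sqrt(1.0 - rho * rho)
    path = [x]
    for _ in range(n_samples - 1):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def constant_path(n_samples, rng):
    """Non-ergodic process X[n] = A, with A drawn once per realization."""
    a = rng.gauss(0.0, 1.0)
    return [a] * n_samples


def time_average(path):
    return sum(path) / len(path)


def mean_square(values):
    """Mean square of the time averages (both ensemble means are zero)."""
    return sum(v * v for v in values) / len(values)


rng = random.Random(0)
n_samples, n_paths, rho = 2000, 200, 0.9

ergodic_avgs = [time_average(ar1_path(n_samples, rho, rng)) for _ in range(n_paths)]
constant_avgs = [time_average(constant_path(n_samples, rng)) for _ in range(n_paths)]

print("ergodic     :", mean_square(ergodic_avgs))
print("non-ergodic :", mean_square(constant_avgs))
```

Across realizations, the time averages of the AR(1) process cluster tightly around the ensemble mean of zero, while those of the constant process stay spread out with the variance of $A$, no matter how long each path is.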
Definition: Correlation-Ergodic Process
Correlation-Ergodic Process
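A WSS process $X(t)$ is correlation-ergodic if the time-averaged product converges to the autocorrelation: $\frac{1}{T}\int_0^T X(t+\tau)X(t)\,dt \to R_X(\tau)$ in mean square as $T \to \infty$, for every $\tau$.
A sufficient condition (for Gaussian WSS processes) is $C_X(\tau) \to 0$ as $|\tau| \to \infty$.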
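To illustrate correlation-ergodicity, the sketch below estimates the autocorrelation of a Gaussian AR(1) process from one long realization and compares it with the known ensemble value $\rho^{|k|}$. The process, the normalization by $N - k$ (rather than $N$), and all parameter values are assumptions made for the example.

```python
import math
import random


def ar1_path(n_samples, rho, rng):
    """Zero-mean, unit-variance Gaussian AR(1) path; R_X[k] = rho**|k|."""
    x = rng.gauss(0.0, 1.0)
    scale = math.sqrt(1.0 - rho * rho)
    path = [x]
    for _ in range(n_samples - 1):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def time_averaged_product(path, lag):
    """(1/(N-lag)) * sum over n of x[n+lag] * x[n], from one realization."""
    n_terms = len(path) - lag
    return sum(path[n + lag] * path[n] for n in range(n_terms)) / n_terms


rng = random.Random(3)
rho = 0.8
path = ar1_path(100_000, rho, rng)
for lag in (0, 1, 5):
    # Columns: lag, single-path estimate, ensemble autocorrelation.
    print(lag, time_averaged_product(path, lag), rho**lag)
```

Since the autocovariance $\rho^{|k|}$ of this Gaussian process decays to zero, the single-path estimates should approach the ensemble values as the path grows.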
Theorem: Ergodic Theorem for Stationary Processes
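If $X(t)$ is strict-sense stationary and $\mathbb{E}[|X(t)|] < \infty$, then $\frac{1}{T}\int_0^T g(X(t))\,dt \to \mathbb{E}[g(X(0))]$ almost surely as $T \to \infty$, for every measurable function $g$ with $\mathbb{E}[|g(X(0))|] < \infty$, provided the process is ergodic (the shift-invariant $\sigma$-algebra is trivial).
For discrete-time: $\frac{1}{N}\sum_{n=0}^{N-1} g(X[n]) \to \mathbb{E}[g(X[0])]$ almost surely as $N \to \infty$.
This is the Birkhoff ergodic theorem applied to stationary processes. It says that time averages of any integrable function of the process converge to ensemble averages, provided the process is ergodic in the measure-theoretic sense. Mean-ergodicity is the special case $g(x) = x$.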
Proof sketch
The full proof uses the Birkhoff (pointwise) ergodic theorem from ergodic theory, which is beyond our scope. The key idea: define the time-shift operator $\theta_s$ on sample paths by $(\theta_s x)(t) = x(t+s)$. Stationarity means $\theta_s$ is measure-preserving. The ergodic theorem says that for measure-preserving transformations with trivial invariant sets, time averages converge a.s. to spatial (ensemble) averages.
Practical Implications of Ergodicity
In communications, ergodicity has far-reaching consequences:
- Channel estimation: We can estimate the channel's mean power and autocorrelation from a single long observation, rather than requiring an ensemble of independent channel realizations.
- Bit error rate: The long-run fraction of erroneous bits in a single transmission equals the probability of bit error (BER).
- Capacity: The achievable rate on a stationary ergodic channel converges to the ergodic capacity, the instantaneous capacity averaged over the fading distribution.
- Estimation: Sample mean and sample autocorrelation are consistent estimators of $\mu_X$ and $R_X(\tau)$.
Quick Check
A WSS process has autocovariance . Is it mean-ergodic?
Yes, because $C_X(\tau)$ is bounded
No, because $C_X(\tau)$ does not decay to zero
Yes, because the process is WSS
$C_X(\tau)$ oscillates forever and does not converge to zero. The sufficient condition for mean-ergodicity fails. In fact, the process is not mean-ergodic.
Common Mistake: Assuming Ergodicity Without Verification
Mistake:
Treating every stationary process as ergodic and equating time averages with ensemble averages without checking conditions.
Correction:
Ergodicity requires additional conditions beyond stationarity. For mean-ergodicity, it is sufficient that the autocovariance decays to zero. A stationary process with persistent correlations (e.g., $X(t) = A$ with random $A$) is not ergodic. Always check whether the autocovariance decays before assuming ergodicity.
Common Mistake: Confusing Ergodicity with Finite-Sample Accuracy
Mistake:
Believing that ergodicity guarantees accurate estimates from short observations.
Correction:
Ergodicity guarantees convergence as $T \to \infty$. For finite $T$, the time-average estimate has variance of order $\sigma_X^2 \tau_c / T$, which may be large if the correlation decays slowly. The "effective number of independent samples" is approximately $T/\tau_c$, where $\tau_c$ is the correlation time and $\sigma_X^2 = C_X(0)$.
Correlation Time and Estimation Quality
The correlation time $\tau_c$ determines how many "effectively independent" samples a time interval of length $T$ contains: roughly $T/\tau_c$. For estimating the mean from a single realization of length $T$, the variance of the estimate scales as $\sigma_X^2 \tau_c / T$, where $\sigma_X^2 = C_X(0)$.
In wireless channel estimation, the coherence time $T_c$ plays the role of $\tau_c$. A pilot-based estimator using $N_p$ pilot symbols spaced $T_p < T_c$ apart has approximately $N_p T_p / T_c$ effective degrees of freedom, not $N_p$.
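The effective-sample-size idea can be made concrete in discrete time. The sketch below assumes samples with covariance $\sigma^2 \rho^{|i-j|}$ (an AR(1)-type model chosen for illustration), computes the exact variance of the sample mean, and converts it into the number of independent samples that would give the same variance. The large-$N$ approximation $N(1-\rho)/(1+\rho)$ is shown for comparison. All names and values are hypothetical.

```python
def var_of_sample_mean(n, rho, sigma2=1.0):
    """Exact Var of (1/n) * sum x[k] when Cov(x[i], x[j]) = sigma2 * rho**|i-j|."""
    total = n + 2.0 * sum((n - k) * rho**k for k in range(1, n))
    return sigma2 * total / n**2


def effective_samples(n, rho):
    """Number of independent samples with the same variance of the mean."""
    return 1.0 / var_of_sample_mean(n, rho)    # sigma2 = 1, so it cancels


n = 10_000
for rho in (0.0, 0.5, 0.9, 0.99):
    approx = n * (1.0 - rho) / (1.0 + rho)     # large-n approximation
    print(rho, round(effective_samples(n, rho), 1), round(approx, 1))
```

With $\rho = 0$ all $N$ samples count. As $\rho$ approaches 1 the effective count drops sharply, so a long but strongly correlated record carries far fewer independent observations than its raw length.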
Why This Matters: Ergodic Capacity of Fading Channels
For a stationary ergodic fading channel with gain $h$ and noise power $N_0$, the ergodic capacity is $C_{\mathrm{erg}} = \mathbb{E}\left[\log_2\left(1 + \frac{P|h|^2}{N_0}\right)\right]$, where $P$ denotes the transmit power. Ergodicity guarantees that a single codeword spanning many fading realizations (i.e., codeword length $\gg$ coherence time) experiences all channel states, and the achievable rate equals this ensemble average. When the codeword length is comparable to the coherence time, we enter the outage regime, and ergodic capacity no longer applies: a fundamentally different analysis is needed.
See full treatment in Chapter 14
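As a rough numerical illustration, the sketch below compares the ensemble-average rate with the rate averaged along one long fading realization. It assumes i.i.d. Rayleigh block fading (so the power gain $|h|^2$ is exponential with unit mean), a fixed average SNR of 10 in linear scale, and simple midpoint-rule integration. None of these modelling choices or names come from the original text.

```python
import math
import random


def rate_bits(gain, snr):
    """Instantaneous rate log2(1 + snr * gain), with gain = |h|^2."""
    return math.log2(1.0 + snr * gain)


def ensemble_average_rate(snr, g_max=40.0, steps=400_000):
    """Midpoint-rule value of E[log2(1 + snr*G)] for G ~ Exponential(1)."""
    h = g_max / steps
    total = 0.0
    for k in range(steps):
        g = (k + 0.5) * h
        total += rate_bits(g, snr) * math.exp(-g)
    return total * h


def time_average_rate(snr, n_blocks, rng):
    """Average rate along one realization of n_blocks i.i.d. fading blocks."""
    total = 0.0
    for _ in range(n_blocks):
        total += rate_bits(rng.expovariate(1.0), snr)
    return total / n_blocks


snr = 10.0                        # linear scale (10 dB), illustrative value
rng = random.Random(1)
ensemble = ensemble_average_rate(snr)
for n_blocks in (10, 1_000, 100_000):
    # Columns: blocks spanned, time-averaged rate, ensemble-average rate.
    print(n_blocks, time_average_rate(snr, n_blocks, rng), ensemble)
```

When the realization spans only a few fading blocks, the time-averaged rate can sit far from the ensemble value, which is the regime where outage-based analysis takes over.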
Historical Note: Birkhoff's Ergodic Theorem
George David Birkhoff proved his pointwise ergodic theorem in 1931, establishing that time averages of integrable functions converge almost surely for measure-preserving transformations. The term "ergodic" comes from the Greek ergon (work) and hodos (path), coined by Boltzmann in statistical mechanics to describe systems that visit all accessible states over time. Birkhoff's theorem provided the rigorous mathematical foundation for Boltzmann's physical intuition and, decades later, became the theoretical justification for estimating channel statistics from single observations in communications.
Ergodic vs. Outage Capacity in Fading Channels
Caire and Shamai (1999) provided a unified framework for analyzing fading channels with various levels of channel state information at the transmitter and receiver. Their work clarified when ergodic capacity (the ensemble average of the instantaneous rate over the fading distribution) applies versus when outage-based metrics are appropriate. The distinction hinges on whether the coding block length spans many independent fading realizations (ergodic regime) or few (quasi-static regime). This paper exemplifies how the abstract concept of ergodicity has direct, quantitative implications for system design.
Ergodic Process
A stationary process for which time averages converge to ensemble averages. An autocovariance that decays to zero is sufficient for mean-ergodicity.
Related: Wide-Sense Stationary (WSS), Strict-Sense Stationary (SSS)
Time Average
$\langle X \rangle_T = \frac{1}{T}\int_0^T X(t)\,dt$. The average of a single realization over a time window. Converges to the ensemble mean for ergodic processes.
Related: Ergodic Process
Key Takeaway
Ergodicity bridges the gap between mathematical expectation (over the ensemble) and practical estimation (from a single time series). A WSS process is mean-ergodic if its autocovariance decays to zero, a condition satisfied by most physical processes with finite memory. Without ergodicity, a single realization, however long, cannot be relied on to reveal the ensemble statistics.