Random (Stochastic) Processes — Fundamentals

Why Stochastic Processes for Communications?

A communication signal is not a single number drawn from a probability distribution --- it is a random function of time. When we write the received baseband waveform

r(t) = s(t) + n(t),

neither the noise n(t) nor, in a fading channel, the signal s(t) = h(t)x(t) is deterministic. At each instant t these are random variables, but they also possess temporal structure: the noise sample at time t is correlated with the sample at t + \tau if \tau is small enough, and the channel gain h(t) drifts according to the Doppler spread.

A single random variable captures the statistics of one observation. A random vector captures finitely many observations. A stochastic process captures the statistics of an entire time-indexed family of random variables and is therefore the natural mathematical language for signals, noise, interference, and channels.

This section introduces the foundational concepts:

  • Stationarity (when does the statistical character of a process not change with time?).
  • Autocorrelation and power spectral density (how is power distributed across frequency?).
  • Ergodicity (when can we replace ensemble averages with time averages?).
  • LTI filtering of random processes (the key link between signal processing and probability).

These tools are prerequisites for everything that follows: noise analysis, matched-filter detection, Wiener filtering, and the capacity of band-limited channels.

Definition:

Stochastic Process

A stochastic (random) process is a family of random variables

\{X(t),\; t \in T\}

defined on a common probability space (\Omega, \mathcal{F}, P) and indexed by a parameter t belonging to an index set T.

Interpretations and terminology:

  1. Fixed t, varying \omega: For a fixed time t_0 \in T, X(t_0) = X(t_0, \omega) is an ordinary random variable on (\Omega, \mathcal{F}, P).

  2. Fixed \omega, varying t: For a fixed outcome \omega_0 \in \Omega, the function t \mapsto X(t, \omega_0) is a deterministic function of time called a sample path (or realisation) of the process.

  3. Both varying: The full object X(t, \omega) is a function of two variables: time and randomness.

Classification by index set:

  • T = \mathbb{R} (or an interval): continuous-time process, written X(t).
  • T = \mathbb{Z} (or \mathbb{N}_0): discrete-time process, written X[n].

Classification by state space:

  • Continuous-valued: X(t) \in \mathbb{R} or \mathbb{C} (e.g., thermal noise voltage).
  • Discrete-valued: X(t) \in \{s_1, s_2, \ldots\} (e.g., the state of a Markov chain modelling a fading channel).

In this text, unless stated otherwise, X(t) denotes a continuous-time, complex-valued stochastic process.

In telecommunications, the most common stochastic processes are: (i) additive white Gaussian noise (AWGN), (ii) the time-varying channel gain h(t) in a fading environment, and (iii) the information-bearing signal x(t) itself, which is modelled as random to apply information-theoretic results. Each of these is a family of random variables indexed by continuous time.


Definition:

Mean, Autocorrelation, and Autocovariance Functions

Let \{X(t),\; t \in T\} be a stochastic process.

Mean function: \mu_X(t) = E[X(t)].

Autocorrelation function: R_X(t_1, t_2) = E[X(t_1)\,X^*(t_2)], where * denotes complex conjugation.

Autocovariance function: C_X(t_1, t_2) = E\bigl[(X(t_1) - \mu_X(t_1))(X(t_2) - \mu_X(t_2))^*\bigr] = R_X(t_1, t_2) - \mu_X(t_1)\,\mu_X^*(t_2).

Cross-correlation between two processes X(t) and Y(t): R_{XY}(t_1, t_2) = E[X(t_1)\,Y^*(t_2)].

The autocorrelation function is the fundamental second-order descriptor of a stochastic process. It captures how the process at time t_1 is statistically related to the process at time t_2.

The complex conjugation in E[X(t_1)\,X^*(t_2)] ensures that R_X(t,t) = E[|X(t)|^2] \geq 0, which we interpret as the instantaneous power of the process at time t. For real-valued processes, the conjugation has no effect and may be dropped.


Definition:

Strict-Sense Stationarity (SSS)

A stochastic process \{X(t),\; t \in T\} is strict-sense stationary (SSS) if its complete statistical description is invariant under time shifts. Formally, for every positive integer n, every set of time instants t_1, t_2, \ldots, t_n \in T, and every time shift \tau such that t_1+\tau, \ldots, t_n+\tau \in T, the joint distribution of (X(t_1+\tau), X(t_2+\tau), \ldots, X(t_n+\tau)) is identical to that of (X(t_1), X(t_2), \ldots, X(t_n)):

F_{X(t_1+\tau),\ldots,X(t_n+\tau)}(x_1,\ldots,x_n) = F_{X(t_1),\ldots,X(t_n)}(x_1,\ldots,x_n)

for all (x_1, \ldots, x_n) \in \mathbb{R}^n (or \mathbb{C}^n) and all valid \tau.

Consequences of SSS:

  • Setting n = 1: the first-order distribution F_{X(t)} does not depend on t. In particular, \mu_X(t) = \mu (constant mean) and \mathrm{Var}(X(t)) is constant.

  • Setting n = 2: the joint distribution of (X(t_1), X(t_2)) depends only on the difference t_1 - t_2, not on the absolute times. Hence R_X(t_1, t_2) = R_X(t_1 - t_2).

SSS is a very strong condition: it requires all finite-dimensional distributions to be time-invariant. In practice it is rarely verified directly, and the weaker notion of wide-sense stationarity is used instead.

Strict-sense stationarity implies wide-sense stationarity (WSS) but the converse is false in general. The one important exception is the Gaussian process: because a Gaussian process is completely determined by its mean and autocorrelation, a Gaussian WSS process is automatically SSS.


Definition:

Wide-Sense Stationarity (WSS)

A stochastic process \{X(t)\} is wide-sense stationary (WSS) if it satisfies two conditions:

  1. Constant mean: \mu_X(t) = E[X(t)] = \mu for all t.

  2. Autocorrelation depends only on the time difference (lag): R_X(t_1, t_2) = R_X(t_1 - t_2). Writing \tau = t_1 - t_2, this becomes R_X(\tau) = E[X(t + \tau)\,X^*(t)] for all t.

Key properties of the WSS autocorrelation function R_X(\tau):

  • R_X(0) \geq 0: R_X(0) = E[|X(t)|^2] is the average power of the process.

  • Maximum at the origin: |R_X(\tau)| \leq R_X(0) for all \tau.

  • Hermitian symmetry: R_X(-\tau) = R_X^*(\tau). For real-valued processes this simplifies to R_X(-\tau) = R_X(\tau) (even symmetry).

  • Positive semidefiniteness: For any times t_1, \ldots, t_n and complex coefficients a_1, \ldots, a_n: \sum_{i=1}^{n}\sum_{k=1}^{n} a_i\,a_k^*\,R_X(t_i - t_k) \geq 0.

Wide-sense stationarity is the "working assumption" throughout signal processing and communications. It is much easier to verify than SSS (only the first two moments must be checked), and it suffices for all linear processing operations: matched filtering, Wiener filtering, linear MMSE estimation, and spectral analysis all require only the mean and autocorrelation function.
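The two WSS conditions can be checked by simulation. The sketch below uses a toy example of our own (not from the text), the classic random-phase sinusoid X(t) = \cos(2\pi f_0 t + \Theta) with \Theta uniform on [0, 2\pi): Monte Carlo estimates of the mean and of the autocorrelation at the same lag but two different absolute times should agree, and the lag-\tau autocorrelation should match the closed form \frac{1}{2}\cos(2\pi f_0 \tau).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy example (our own, not from the text): the random-phase sinusoid
# X(t) = cos(2*pi*f0*t + Theta), Theta uniform on [0, 2*pi), is WSS.
# We estimate the mean and the autocorrelation at two different absolute
# times and check that they agree, as conditions 1 and 2 require.
f0 = 5.0                     # Hz
n_trials = 200_000
theta = rng.uniform(0.0, 2.0 * np.pi, n_trials)

def X(t):
    # One sample of X(t) per realisation of the random phase.
    return np.cos(2.0 * np.pi * f0 * t + theta)

tau = 0.03
m1, m2 = X(0.1).mean(), X(0.7).mean()          # ensemble means, both ~ 0
r1 = (X(0.1 + tau) * X(0.1)).mean()            # R_X at lag tau, around t = 0.1
r2 = (X(0.7 + tau) * X(0.7)).mean()            # same lag, around t = 0.7

r_theory = 0.5 * np.cos(2.0 * np.pi * f0 * tau)   # R_X(tau) = cos(2*pi*f0*tau)/2
print(m1, m2, r1, r2, r_theory)
```

The estimates at t = 0.1 and t = 0.7 coincide (up to Monte Carlo error) even though the absolute times differ, which is exactly what lag-only dependence means.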


Theorem: Properties of the WSS Autocorrelation Function

Let \{X(t)\} be a WSS process with autocorrelation R_X(\tau). Then:

  1. R_X(0) \geq 0.
  2. R_X(-\tau) = R_X^*(\tau) for all \tau (Hermitian symmetry).
  3. |R_X(\tau)| \leq R_X(0) for all \tau (maximum at the origin).

Property 1 says average power is nonnegative. Property 2 reflects the conjugate symmetry inherent in the inner product E[X(t+\tau)X^*(t)]. Property 3 states that a process is most correlated with itself at zero lag: intuitively, the best predictor of X(t) is X(t) itself.


Definition:

Power Spectral Density (PSD)

Let \{X(t)\} be a WSS process with autocorrelation function R_X(\tau). The power spectral density (PSD) of X(t) is the Fourier transform of R_X(\tau):

S_X(f) = \int_{-\infty}^{\infty} R_X(\tau)\,e^{-j2\pi f\tau}\,d\tau.

Conversely, the autocorrelation function is recovered by the inverse Fourier transform:

R_X(\tau) = \int_{-\infty}^{\infty} S_X(f)\,e^{j2\pi f\tau}\,df.

Units: If X(t) has units of volts (V), then R_X(\tau) has units of \mathrm{V}^2 and S_X(f) has units of \mathrm{V}^2/\mathrm{Hz} (watts per hertz across a 1-ohm load).

Key properties of the PSD:

  • S_X(f) \geq 0 for all f (nonnegative).
  • For real-valued processes: S_X(-f) = S_X(f) (even symmetry).
  • For complex-valued processes: S_X(f) is still real and nonnegative, because R_X(\tau) is Hermitian-symmetric, but it need not be even in f; the spectrum of a complex baseband process can be asymmetric about f = 0.

The PSD tells us how the average power of the process is distributed across frequency. In communications, the PSD of the transmitted signal determines the occupied bandwidth and hence the spectral efficiency, while the PSD of the noise determines the noise power in any given frequency band.


Theorem: Wiener--Khinchin Theorem

Let \{X(t)\} be a WSS process with autocorrelation function R_X(\tau). Then the power spectral density S_X(f) and the autocorrelation R_X(\tau) form a Fourier transform pair:

S_X(f) = \mathcal{F}\{R_X\}(f) = \int_{-\infty}^{\infty} R_X(\tau)\,e^{-j2\pi f\tau}\,d\tau,

R_X(\tau) = \mathcal{F}^{-1}\{S_X\}(\tau) = \int_{-\infty}^{\infty} S_X(f)\,e^{j2\pi f\tau}\,df.

Moreover:

  1. Nonnegativity: S_X(f) \geq 0 for all f.
  2. Average power: Setting \tau = 0 in the inverse relation gives P_X = R_X(0) = E[|X(t)|^2] = \int_{-\infty}^{\infty} S_X(f)\,df. The total area under the PSD equals the average power of the process.

The Wiener--Khinchin theorem is the stochastic analog of Parseval's theorem for deterministic signals. Just as the energy of a deterministic signal can be computed by integrating its energy spectral density |X(f)|^2, the average power of a WSS random process can be computed by integrating its PSD S_X(f). The PSD replaces the (non-existent) Fourier transform of a random signal with a well-defined, deterministic spectral description.
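The theorem has an exact finite-data shadow in discrete time that is easy to verify numerically. The sketch below (a toy colored sequence of our own) checks that the periodogram of a finite record equals the DFT of its biased sample autocorrelation, which is the discrete-time counterpart of the S_X(f) \leftrightarrow R_X(\tau) pairing.

```python
import numpy as np

rng = np.random.default_rng(1)

# Discrete-time illustration (our own toy process): for a finite record,
# the periodogram equals the DFT of the biased sample autocorrelation --
# the finite-data analogue of the Wiener-Khinchin pairing.
N = 256
w = rng.standard_normal(N + 2)
x = w[2:] + 0.5 * w[1:-1] + 0.25 * w[:-2]    # a colored (MA(2)) sequence

# Periodogram on a 2N-point grid (zero-padding makes the correlation linear).
periodogram = np.abs(np.fft.fft(x, 2 * N))**2 / N

# Biased sample autocorrelation at lags -(N-1)..N-1, arranged circularly:
# index k holds lag k for k < N, and index 2N-k holds lag -k.
r = np.correlate(x, x, mode="full") / N      # lags -(N-1)..N-1
circ = np.concatenate([r[N - 1:], [0.0], r[:N - 1]])

S_from_R = np.real(np.fft.fft(circ))         # "Fourier transform of R"
print(np.max(np.abs(periodogram - S_from_R)))   # agree to machine precision
```

The two spectra coincide realization by realization; averaging many such periodograms is then the standard way to estimate the true S_X(f).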


Definition:

Ergodicity

A WSS stochastic process \{X(t)\} is ergodic if time averages computed from a single, infinitely long sample path converge to the corresponding ensemble (statistical) averages.

Mean-ergodic: The process is mean-ergodic if the time-averaged mean converges (in mean square) to the ensemble mean:

\langle X(t) \rangle_T \triangleq \frac{1}{2T}\int_{-T}^{T} X(t)\,dt \xrightarrow{T \to \infty} E[X(t)] = \mu.

Autocorrelation-ergodic: The process is autocorrelation-ergodic if

\frac{1}{2T}\int_{-T}^{T} X(t+\tau)\,X^*(t)\,dt \xrightarrow{T \to \infty} R_X(\tau) \quad \text{for each } \tau.

A process that is both mean-ergodic and autocorrelation-ergodic is simply called ergodic (in the wide sense).

Sufficient condition for mean-ergodicity:

\lim_{T \to \infty} \frac{1}{2T}\int_{-2T}^{2T} \left(1 - \frac{|\tau|}{2T}\right) C_X(\tau)\,d\tau = 0,

where C_X(\tau) = R_X(\tau) - |\mu|^2 is the autocovariance. In particular, if C_X(\tau) \to 0 as |\tau| \to \infty (i.e., the process "forgets" its past), the process is mean-ergodic.

Ergodicity is the bridge between theory and measurement. In practice, we observe one realisation of a stochastic process (e.g., one received signal waveform) and estimate its statistics (mean, power, PSD) by time-averaging. This procedure is justified only if the process is ergodic. Most WSS processes encountered in communications (stationary noise, stationary fading channels observed over time scales much longer than the coherence time) are assumed ergodic, but the assumption must be verified --- see the Pitfall below.


Theorem: WSS Process Through a Linear Time-Invariant (LTI) System

Let \{X(t)\} be a WSS process with autocorrelation R_X(\tau) and PSD S_X(f). Let h(t) be the impulse response of a stable LTI system with frequency response H(f) = \mathcal{F}\{h\}(f). The output process

Y(t) = \int_{-\infty}^{\infty} h(\alpha)\,X(t - \alpha)\,d\alpha = (h * X)(t)

is also WSS, and its statistics are:

  1. Mean: \mu_Y = \mu_X \cdot H(0).

  2. Autocorrelation: R_Y(\tau) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\alpha)\,h^*(\beta)\,R_X(\tau - \alpha + \beta)\,d\alpha\,d\beta.

  3. Power spectral density: \boxed{S_Y(f) = |H(f)|^2\,S_X(f).}

  4. Cross-PSD (input--output): S_{YX}(f) = H(f)\,S_X(f).

  5. Output power: P_Y = R_Y(0) = \int_{-\infty}^{\infty} |H(f)|^2\,S_X(f)\,df.

The PSD filtering rule S_Y(f) = |H(f)|^2 S_X(f) is arguably the single most important result in stochastic signal processing. It says that an LTI filter shapes the power spectrum of a random process by multiplying by the squared magnitude of its frequency response --- exactly as one would expect from the deterministic relation Y(f) = H(f)X(f), but now applied to power densities rather than amplitude spectra. Phase information in H(f) does not affect the output PSD.
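The output-power consequence of the filtering rule can be checked directly. The following discrete-time sketch (numpy only; the filter taps are our own arbitrary choice) passes white noise through an FIR filter and compares the time-domain power of the output against the frequency-domain integral \int |H(f)|^2 S_X(f)\,df.

```python
import numpy as np

rng = np.random.default_rng(2)

# Discrete-time sketch (our own filter taps): pass white noise of power
# sigma2 through an FIR filter h and compare the two routes to the output
# power that the theorem provides:
#   time domain:      P_Y = average of Y^2 over the filtered samples,
#   frequency domain: P_Y = mean over f of |H(f)|^2 * S_X(f), with S_X = sigma2.
sigma2 = 2.0
h = np.array([0.5, 1.0, 0.5, 0.25])           # arbitrary stable FIR filter

x = np.sqrt(sigma2) * rng.standard_normal(1_000_000)
y = np.convolve(x, h, mode="valid")           # LTI filtering of the noise
P_time = np.mean(y**2)

H = np.fft.fft(h, 4096)                       # frequency response on 4096 bins
P_freq = np.mean(np.abs(H)**2) * sigma2       # integral of |H|^2 S_X over f

print(P_time, P_freq)                         # both ~ sigma2 * sum(h**2)
```

For white input the frequency-domain integral collapses (by Parseval) to \sigma^2 \sum_k h[k]^2, which the simulated output power matches to within Monte Carlo error.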


PSD Filtering of a WSS Process Through an LTI System

Interactive figure: visualise the key result S_Y(f) = |H(f)|^2\,S_X(f). Choose an input PSD shape (white noise, band-limited, or colored/shaped), a filter type (lowpass, bandpass, or raised cosine), and adjust bandwidth and roll-off. The plot shows three panels: (1) the input PSD S_X(f), (2) the filter power response |H(f)|^2, and (3) the output PSD S_Y(f). The shaded area under S_Y(f) equals the output power P_Y.


Example: Bandpass Filtering of White Noise

White noise X(t) with two-sided PSD S_X(f) = N_0/2 (watts/Hz) is passed through an ideal bandpass filter centred at f_c with one-sided bandwidth W (i.e., the filter passes |f - f_c| \leq W/2 and |f + f_c| \leq W/2, and rejects everything else).

(a) Find the output PSD S_Y(f).

(b) Compute the output noise power P_Y.

(c) Define and compute the noise equivalent bandwidth of the filter.
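Parts (a) and (b) can be sanity-checked numerically. The sketch below uses arbitrary sample values of our own for N_0, f_c, and W, builds the ideal bandpass response on a frequency grid, and integrates the output PSD; the result should match (N_0/2)\cdot(2W) = N_0 W.

```python
import numpy as np

# Numerical check of the example above, with arbitrary sample values for
# N0, fc and W (our choices, not from the text). Part (a): the output PSD
# is N0/2 on the two passbands and zero elsewhere; part (b): integrating
# it gives P_Y = (N0/2) * (2W) = N0 * W.
N0, fc, W = 4e-21, 2.4e9, 20e6          # W/Hz, Hz, Hz

f = np.linspace(-3e9, 3e9, 600_001)     # 10 kHz frequency grid
df = f[1] - f[0]
passband = (np.abs(f - fc) <= W / 2) | (np.abs(f + fc) <= W / 2)
S_Y = np.where(passband, N0 / 2, 0.0)   # part (a): ideal bandpass output PSD

P_Y = np.sum(S_Y) * df                  # part (b), numerically
print(P_Y, N0 * W)                      # the two agree to grid resolution
```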


Strict-Sense Stationarity vs. Wide-Sense Stationarity vs. Ergodicity

  • Definition. SSS: all finite-dimensional distributions are shift-invariant. WSS: constant mean and autocorrelation depending only on the lag \tau. Ergodic: time averages converge to ensemble averages.
  • What it constrains. SSS: all joint distributions of (X(t_1), \ldots, X(t_n)). WSS: only the first two moments (mean and autocorrelation). Ergodic: the relationship between time and ensemble statistics.
  • Implication hierarchy. SSS \Rightarrow WSS (always); WSS \not\Rightarrow SSS in general; Ergodic \Rightarrow WSS (by assumption); WSS \not\Rightarrow Ergodic.
  • Gaussian exception. For Gaussian processes WSS \Leftrightarrow SSS, because second-order statistics fully determine all distributions; a Gaussian WSS process with C_X(\tau) \to 0 is also ergodic.
  • Practical verification. SSS: extremely difficult (requires all joint distributions). WSS: moderate (check constant mean and lag-dependent autocorrelation). Ergodic: requires checking the decay of the autocovariance.
  • Wireless example. SSS: AWGN (Gaussian i.i.d. samples). WSS: a stationary fading channel observed over many coherence times. Ergodic: noise in a long observation window, where the time average \approx the ensemble average.
  • Failure example. SSS: non-stationary interference (e.g., bursty traffic). WSS: a birth-death process with time-varying rate. Ergodic: X(t) = A for all t with random A, which is WSS but not mean-ergodic because A is constant per realisation.
  • Used for. SSS: theoretical completeness and Gaussian process analysis. WSS: PSD, the Wiener--Khinchin theorem, LTI filter analysis. Ergodic: justifying estimation of \mu, R_X(\tau), and S_X(f) from a single realisation.

The Wiener-Khinchin Theorem: From Autocorrelation to Power Spectrum

A split-screen animation showing a WSS process realisation, its autocorrelation function R_X(\tau), and the resulting power spectral density S_X(f) connected by the Fourier transform.
The Wiener--Khinchin theorem establishes a Fourier-transform pair between the autocorrelation function R_X(\tau) and the power spectral density S_X(f). A narrow autocorrelation (fast decorrelation) corresponds to a wide spectrum, and vice versa.

Why This Matters: Power Spectral Density of Digitally Modulated Signals

In a linearly modulated digital communication system, the transmitted baseband signal is

x(t) = \sum_{n=-\infty}^{\infty} a_n\,p(t - nT_s),

where \{a_n\} are the (random) data symbols with E[a_n] = 0, E[|a_n|^2] = \sigma_a^2, and E[a_n a_m^*] = 0 for n \neq m (uncorrelated symbols), p(t) is the pulse-shaping filter, and T_s is the symbol period.

Treating \{a_n\} as a WSS discrete-time process, the PSD of x(t) is given by

S_x(f) = \frac{\sigma_a^2}{T_s}\,|P(f)|^2,

where P(f) = \mathcal{F}\{p(t)\} is the Fourier transform of the pulse shape.

Key implications for system design:

  • The occupied bandwidth of x(t) is determined entirely by |P(f)|^2. A rectangular pulse gives a \mathrm{sinc}^2 PSD with slow spectral roll-off; a raised-cosine pulse confines the spectrum to bandwidth (1+\beta)/(2T_s) with roll-off factor \beta \in [0,1].

  • Spectral efficiency (bits/s/Hz) is \eta = R_b / W, where R_b is the bit rate and W is the occupied bandwidth. For M-QAM with a raised-cosine pulse: \eta = \frac{\log_2 M}{1 + \beta} bits/s/Hz.

  • When the modulated signal passes through a channel with transfer function H_c(f), the received PSD is S_r(f) = |H_c(f)|^2\,S_x(f), directly applying the LTI filtering theorem of this section.

  • The noise power in the receiver bandwidth W is P_n = N_0 W, so the received SNR is \mathrm{SNR} = P_s / (N_0 W), linking the PSD framework to link-budget analysis.
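The formula S_x(f) = (\sigma_a^2/T_s)|P(f)|^2 can be illustrated by simulation. The sketch below is a toy setup of our own: BPSK symbols (a_n = \pm 1, so \sigma_a^2 = 1) with a rectangular pulse of duration T_s, for which the theory predicts S_x(f) = T_s\,\mathrm{sinc}^2(f T_s); averaged periodograms should approach that shape.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy illustration (our own setup) of S_x(f) = (sigma_a^2/Ts)|P(f)|^2:
# BPSK symbols with a rectangular pulse of duration Ts, so the theory
# gives S_x(f) = Ts * sinc^2(f*Ts).
Ts, L = 1.0, 8                 # symbol period; samples per symbol
dt = Ts / L
K = 512                        # symbols per record
N = K * L                      # samples per record
trials = 2000

S_est = np.zeros(N)
for _ in range(trials):
    a = rng.choice([-1.0, 1.0], size=K)          # i.i.d. BPSK symbols
    x = np.repeat(a, L)                          # rectangular pulse shaping
    S_est += dt * np.abs(np.fft.fft(x))**2 / N   # periodogram PSD estimate
S_est /= trials

f = np.fft.fftfreq(N, d=dt)
S_theory = Ts * np.sinc(f * Ts)**2               # sigma_a^2 = 1
print(S_est[0], S_theory[0])                     # both close to Ts at f = 0
```

The integral of the estimated PSD also recovers the total power E[|x(t)|^2] = \sigma_a^2 = 1, matching the Wiener--Khinchin power relation.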

Quick Check

Let \{X(t)\} be a WSS process with autocorrelation R_X(\tau). Which of the following is guaranteed to be true?

R_X(\tau) \geq 0 for all \tau

|R_X(\tau)| \leq R_X(0) for all \tau

R_X(\tau) is always a monotonically decreasing function of |\tau|

S_X(f) can take negative values for some frequencies

Quick Check

White noise with PSD S_X(f) = N_0/2 is passed through a filter with |H(f)|^2 = \frac{1}{1 + (f/B)^2} (a Lorentzian/first-order RC filter). The total output noise power is:

N_0 B

\pi N_0 B

\frac{\pi N_0 B}{2}

N_0 B / \pi

Quick Check

Consider the process X(t) = A for all t, where A is a random variable with E[A] = 0 and E[A^2] = 1. This process is:

WSS and ergodic

WSS but not mean-ergodic

Not WSS

SSS but not WSS

Stochastic Process

A family \{X(t), t \in T\} of random variables defined on a common probability space, indexed by a parameter t (usually time). Each fixed t yields a random variable; each fixed outcome \omega yields a deterministic sample path.

Related: Stochastic Process, Wide-Sense Stationarity (WSS)

Wide-Sense Stationarity (WSS)

A stochastic process is WSS if its mean is constant and its autocorrelation function depends only on the time lag \tau = t_1 - t_2: E[X(t)] = \mu for all t and R_X(t_1,t_2) = R_X(t_1 - t_2). WSS is the standard assumption for spectral analysis and LTI filter theory applied to random signals.

Related: Wide-Sense Stationarity (WSS), Strict-Sense Stationarity (SSS), Properties of the WSS Autocorrelation Function

Autocorrelation Function

For a WSS process, R_X(\tau) = E[X(t+\tau)X^*(t)]. It measures the statistical similarity between the process at two time instants separated by lag \tau. The Fourier transform of R_X(\tau) is the power spectral density S_X(f).

Related: Mean, Autocorrelation, and Autocovariance Functions, Power Spectral Density (PSD), Wiener--Khinchin Theorem

Power Spectral Density (PSD)

The Fourier transform of the autocorrelation function of a WSS process: S_X(f) = \int R_X(\tau) e^{-j2\pi f\tau}\,d\tau. It describes the distribution of average power across frequency, is always nonnegative, and satisfies \int S_X(f)\,df = R_X(0) = E[|X(t)|^2].

Related: Power Spectral Density (PSD), Wiener--Khinchin Theorem, WSS Process Through a Linear Time-Invariant (LTI) System

Wiener--Khinchin Theorem

The theorem establishing that the PSD and autocorrelation function of a WSS process form a Fourier transform pair: S_X(f) = \mathcal{F}\{R_X(\tau)\}. It guarantees S_X(f) \geq 0 and provides the average-power relation P_X = \int S_X(f)\,df = R_X(0).

Related: Wiener--Khinchin Theorem, Power Spectral Density (PSD), Mean, Autocorrelation, and Autocovariance Functions

Ergodicity

The property that time averages from a single sample path converge to ensemble averages. A mean-ergodic process satisfies \frac{1}{2T}\int_{-T}^{T} X(t)\,dt \to E[X(t)] as T \to \infty. Ergodicity justifies estimating statistical quantities (mean, PSD) from a single long observation of the process.

Related: Ergodicity, Wide-Sense Stationarity (WSS)

Common Mistake: Assuming Ergodicity Without Checking

Mistake:

"The process is WSS, so we can estimate its mean and PSD from a single time record."

Correction:

WSS does not imply ergodicity. A classic counterexample is the constant-amplitude process X(t) = A for all t, where A is a zero-mean unit-variance random variable. This process is WSS:

  • E[X(t)] = E[A] = 0 (constant mean).
  • R_X(\tau) = E[A^2] = 1 (depends only on \tau, trivially).

Yet the time average of every sample path is \langle X(t)\rangle_T = A, which is a random variable --- it does not converge to E[A] = 0. The process is WSS but not ergodic.

When is the assumption safe? A sufficient condition for mean-ergodicity is that the autocovariance C_X(\tau) decays to zero as |\tau| \to \infty. Physically, this means the process "forgets" its past --- future samples are asymptotically uncorrelated with past samples. Most stationary noise and interference processes in communications satisfy this condition. However:

  • Processes with a nonzero DC component (e.g., a random but time-invariant mean) may fail mean-ergodicity.
  • Processes with persistent periodic components (e.g., X(t) = A\cos(2\pi f_0 t + \Theta) with random A and \Theta) require separate analysis for each component.

Always verify the decay of C_X(\tau) before replacing ensemble averages with time averages in a measurement or simulation.
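The counterexample above can be reproduced in a few lines. This discrete-time sketch (toy sizes of our own) draws one amplitude A per realisation, so every time average equals A itself and never concentrates around E[A] = 0; an i.i.d.-noise path is shown alongside as the ergodic contrast.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulation of the counterexample above (discrete-time toy version): every
# sample path of X(t) = A is constant, so its time average equals A itself
# and does not converge to E[A] = 0, no matter how long the record.
trials, N = 10_000, 100_000
A = rng.standard_normal(trials)     # one amplitude per realisation

time_avgs = A                       # time average of a constant path is A
ensemble_avg = A.mean()             # ensemble average over realisations ~ 0

# Ergodic contrast: for i.i.d. noise, one long time average does converge.
W_time_avg = rng.standard_normal(N).mean()

print(np.std(time_avgs), ensemble_avg, W_time_avg)
```

The spread of the time averages stays at 1 (the standard deviation of A) regardless of record length, while the single i.i.d.-noise time average is already near zero.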

Key Takeaway

Two results form the bedrock of noise and signal analysis in communication systems:

  1. Wiener--Khinchin theorem: The PSD and autocorrelation of a WSS process are a Fourier transform pair, and S_X(f) \geq 0. This lets us move freely between the time domain (autocorrelation, correlation times, fading memory) and the frequency domain (bandwidth, spectral shape, noise floors).

  2. LTI filtering rule: S_Y(f) = |H(f)|^2\,S_X(f). This single equation answers the core question of receiver design: how much noise power appears at the output of a filter? It directly yields:

    • Output noise power: P_n = \int |H(f)|^2 S_n(f)\,df = N_0 B_{\text{eq}} for white noise through a unity-gain filter.
    • Noise equivalent bandwidth: B_{\text{eq}} = \frac{1}{2|H(f_{\max})|^2}\int |H(f)|^2\,df.
    • SNR at the filter output, and hence bit-error-rate performance.

Together, these results transform the abstract notion of a "random waveform" into concrete, computable quantities --- bandwidth, power, and signal-to-noise ratio --- that drive every design decision in a communication link.
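The noise-equivalent-bandwidth recipe above is easy to exercise numerically. This sketch (our own toy values for B and N_0) applies it to a first-order lowpass |H(f)|^2 = 1/(1 + (f/B)^2), whose closed forms are B_{\text{eq}} = \pi B / 2 and hence P_n = N_0 B_{\text{eq}} = \pi N_0 B / 2.

```python
import numpy as np

# Numerical check (our own toy numbers) of the noise-equivalent-bandwidth
# definition above for a first-order lowpass |H(f)|^2 = 1/(1 + (f/B)^2).
# Closed forms: B_eq = pi*B/2, hence P_n = N0 * B_eq = pi*N0*B/2.
B, N0 = 1e6, 4e-21                           # Hz, W/Hz

f = np.linspace(-1e9, 1e9, 2_000_001)        # wide grid to capture the tails
df = f[1] - f[0]
H2 = 1.0 / (1.0 + (f / B)**2)                # Lorentzian power response

B_eq = np.sum(H2) * df / (2 * np.max(H2))    # definition from the text
P_n = (N0 / 2) * np.sum(H2) * df             # white noise through the filter

print(B_eq / (np.pi * B / 2), P_n / (np.pi * N0 * B / 2))   # both ~ 1
```

The same two-line computation works for any measured or simulated |H(f)|^2, which is how B_{\text{eq}} is obtained for filters without a closed-form integral.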

The bridge to later chapters:

  • Matched filtering (Chapter 3): the filter H(f) that maximises output SNR is derived by applying the LTI rule to signal-plus-noise.
  • Wiener filtering: the filter that minimises mean-square error between the desired and actual output is expressed via S_X(f) and the cross-PSD S_{XY}(f).
  • Channel capacity (Chapter 7): the water-filling power allocation across frequency sub-bands is an optimisation over the channel gain-to-noise ratio |H_c(f)|^2 / S_n(f).

Historical Note: Norbert Wiener and Aleksandr Khinchin

The relationship between autocorrelation and power spectrum was established independently by two mathematicians in the early 1930s.

Aleksandr Yakovlevich Khinchin (1894--1959), a Soviet mathematician, proved in 1934 that the autocorrelation function of a stationary process is the Fourier transform of a nonnegative measure (the spectral measure), as a consequence of Bochner's theorem on positive-definite functions.

Norbert Wiener (1894--1964), an American mathematician at MIT, arrived at the same result from a different direction: his 1930 work on "generalised harmonic analysis" defined the power spectrum for functions that are not square-integrable (and hence do not have ordinary Fourier transforms) by using time-averaged periodograms. Wiener later applied this theory to the optimal filtering problem during World War II, resulting in the Wiener filter --- the foundation of statistical signal processing.

The theorem bearing both their names is one of the most widely applied results in engineering. Every spectrum analyser, every noise-figure measurement, and every link-budget calculation implicitly relies on the Wiener--Khinchin correspondence between time-domain correlations and frequency-domain power distributions.
