Ferkans — Interactive Telecom Tutor

Real Noise Is Colored

The AWGN assumption is an idealisation. Amplifier noise is flat over receiver bandwidth but coloured by filter frequency responses; antenna arrays see co-channel interference with strong spatial structure; radar targets sit in clutter with long correlation times. Whenever the noise covariance is not a scaled identity, the correlator of Section 2.1 ceases, in general, to be optimal. A single linear change of coordinates, pre-whitening, restores the AWGN machinery.

Definition: Detection in Colored Gaussian Noise

Let $\mathbf{C}_w \in \mathbb{R}^{n\times n}$ be a positive-definite covariance matrix. Consider

$\mathcal{H}_0: \mathbf{y} = \mathbf{w}, \qquad \mathcal{H}_1: \mathbf{y} = \mathbf{s} + \mathbf{w},$

with $\mathbf{w}\sim\mathcal{N}(\mathbf{0}, \mathbf{C}_w)$ . This is the known-signal detection problem in colored Gaussian noise.

Definition: Whitening Transform

Let $\mathbf{C}_w = \mathbf{L}\mathbf{L}^{\mathsf{T}}$ be the Cholesky factorisation. The linear map

$\mathbf{y} \mapsto \widetilde{\mathbf{y}} = \mathbf{L}^{-1}\mathbf{y}$

is called the whitening transform. Applied to the colored-noise vector $\mathbf{w}$ , it yields $\mathbf{L}^{-1}\mathbf{w}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$ .

Any square factor $\mathbf{M}$ with $\mathbf{M}\mathbf{M}^{\mathsf{T}} = \mathbf{C}_w$ (e.g., $\mathbf{M} = \mathbf{U}\boldsymbol{\Lambda}^{1/2}$ from the eigendecomposition) whitens equally well. Cholesky is the canonical triangular choice.
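The remark above can be checked numerically. The following is a minimal sketch, assuming a 2×2 covariance with unit variances and an illustrative correlation $\rho = 0.6$ (a value chosen here, not taken from the text); the helper names `matmul`, `transpose`, `inv2` and `whitened_cov` are hypothetical. It builds the Cholesky factor and the eigendecomposition factor in closed form and confirms that the inverse of either one maps the noise covariance to the identity.

```python
import math

rho = 0.6  # illustrative correlation (assumption, not from the text)
C = [[1.0, rho], [rho, 1.0]]

def matmul(A, B):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inv2(A):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Cholesky factor of [[1, rho], [rho, 1]] (lower triangular).
L = [[1.0, 0.0], [rho, math.sqrt(1.0 - rho**2)]]

# Eigendecomposition factor M = U Lambda^{1/2}: eigenvectors (1, 1)/sqrt(2)
# and (1, -1)/sqrt(2) with eigenvalues 1 + rho and 1 - rho.
r = 1.0 / math.sqrt(2.0)
U = [[r, r], [r, -r]]
sqrt_lam = [[math.sqrt(1.0 + rho), 0.0], [0.0, math.sqrt(1.0 - rho)]]
M = matmul(U, sqrt_lam)

def whitened_cov(F):
    """Covariance of F^{-1} w when Cov(w) = C, i.e. F^{-1} C F^{-T}."""
    Finv = inv2(F)
    return matmul(matmul(Finv, C), transpose(Finv))

cov_chol = whitened_cov(L)  # whitening with the Cholesky factor
cov_eig = whitened_cov(M)   # whitening with the eigen factor
print(cov_chol)
print(cov_eig)
```

Both printed matrices should be the identity up to rounding. The two whitened vectors differ by an orthogonal rotation, which leaves inner products, and hence the detector, unchanged.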

Definition: Mahalanobis Norm

For positive-definite $\mathbf{C}$ , the Mahalanobis norm of a vector $\mathbf{v}$ is

$\|\mathbf{v}\|_{\mathbf{C}} \;=\; \sqrt{\mathbf{v}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v}},$

and the Mahalanobis inner product of $\mathbf{u},\mathbf{v}$ is $\langle\mathbf{u},\mathbf{v}\rangle_{\mathbf{C}} = \mathbf{u}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v}$ .

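The Mahalanobis norm is the Euclidean norm in the whitened coordinate system: with $\mathbf{C} = \mathbf{L}\mathbf{L}^{\mathsf{T}}$, $\|\mathbf{v}\|_{\mathbf{C}}^2 = \mathbf{v}^{\mathsf{T}}\mathbf{L}^{-\mathsf{T}}\mathbf{L}^{-1}\mathbf{v} = \|\mathbf{L}^{-1}\mathbf{v}\|_2^2$. Geometrically, the locus $\|\mathbf{v}-\boldsymbol{\mu}\|_{\mathbf{C}} = r$ is the ellipsoid centred at $\boldsymbol{\mu}$ whose principal axes are aligned with the eigenvectors of $\mathbf{C}$ and whose semi-axis lengths are $r\sqrt{\lambda_i}$, with $\lambda_i$ the eigenvalues of $\mathbf{C}$.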

Theorem: LRT in Colored Gaussian Noise

For the detection problem of the definition "Detection in Colored Gaussian Noise", the log-likelihood ratio is

$\ell(\mathbf{y}) \;=\; \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y} \;-\; \tfrac{1}{2}\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}.$

The Neyman--Pearson test is equivalent to thresholding the pre-whitened correlator

$T(\mathbf{y}) \;=\; \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y},$

and its deflection coefficient is the squared Mahalanobis norm

$d^2 \;=\; \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s} \;=\; \|\mathbf{s}\|_{\mathbf{C}_w}^2.$

The inverse covariance $\mathbf{C}_w^{-1}$ assigns low weight to directions where the noise is strong and high weight to directions where the noise is weak --- a coloured-noise generalisation of inverse-variance weighting.
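To make the inverse-variance reading concrete, here is a minimal sketch for the special case of a diagonal covariance, where $\mathbf{C}_w^{-1}$ simply divides each sample by its own noise variance. The function names and the numerical values are illustrative assumptions, not part of the original text.

```python
def correlator_diag(s, y, variances):
    """T(y) = s^T C_w^{-1} y when C_w = diag(variances)."""
    return sum(si * yi / vi for si, yi, vi in zip(s, y, variances))

def deflection_diag(s, variances):
    """d^2 = s^T C_w^{-1} s when C_w = diag(variances)."""
    return sum(si * si / vi for si, vi in zip(s, variances))

s = [1.0, 1.0, 1.0]             # flat template (illustrative)
variances = [0.1, 1.0, 10.0]    # quiet, medium and noisy samples (illustrative)
y = [0.9, 1.4, -2.0]            # one hypothetical observation

# Per-sample weights applied to y: the quiet sample dominates the statistic.
weights = [si / vi for si, vi in zip(s, variances)]
T = correlator_diag(s, y, variances)
d2 = deflection_diag(s, variances)
print(weights, T, d2)
```

The noisiest sample contributes one hundred times less weight than the quietest one, which is the behaviour the theorem generalises to correlated noise.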

Proof

Write the two multivariate Gaussian densities.

$p_j(\mathbf{y}) = \frac{1}{(2\pi)^{n/2}\det(\mathbf{C}_w)^{1/2}} \exp\!\Bigl(-\tfrac{1}{2}(\mathbf{y}-\boldsymbol{\mu}_j)^{\mathsf{T}}\mathbf{C}_w^{-1}(\mathbf{y}-\boldsymbol{\mu}_j)\Bigr), \qquad j \in \{0, 1\},$

with $\boldsymbol{\mu}_0 = \mathbf{0}$ and $\boldsymbol{\mu}_1 = \mathbf{s}$.

Log-ratio.

The normalising constants cancel:

$\ell(\mathbf{y}) = -\tfrac{1}{2}\bigl[(\mathbf{y}-\mathbf{s})^{\mathsf{T}}\mathbf{C}_w^{-1}(\mathbf{y}-\mathbf{s}) - \mathbf{y}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y}\bigr].$

Expand and simplify.

$(\mathbf{y}-\mathbf{s})^{\mathsf{T}}\mathbf{C}_w^{-1}(\mathbf{y}-\mathbf{s}) = \mathbf{y}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y} - 2\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y} + \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}.$

The $\mathbf{y}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y}$ terms cancel in the log-ratio, leaving

$\ell(\mathbf{y}) = \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y} - \tfrac{1}{2}\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}.$

Deflection coefficient.

Under $\mathcal{H}_0$ , $T(\mathbf{y})=\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{w}$ is Gaussian with mean $0$ and variance $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{C}_w\mathbf{C}_w^{-1}\mathbf{s} = \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}$ . Under $\mathcal{H}_1$ , $T$ has the same variance but mean $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}$ . Therefore

$d^2 = \frac{(\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s})^2}{\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}} = \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}. \quad\blacksquare$

Theorem: Whitening Yields an Equivalent AWGN Problem

Let $\widetilde{\mathbf{y}} = \mathbf{L}^{-1}\mathbf{y}$ , $\widetilde{\mathbf{s}} = \mathbf{L}^{-1}\mathbf{s}$ where $\mathbf{C}_w = \mathbf{L}\mathbf{L}^{\mathsf{T}}$ . Then the transformed problem

$\widetilde{\mathcal{H}}_0: \widetilde{\mathbf{y}} = \widetilde{\mathbf{w}}, \qquad \widetilde{\mathcal{H}}_1: \widetilde{\mathbf{y}} = \widetilde{\mathbf{s}} + \widetilde{\mathbf{w}},$

has $\widetilde{\mathbf{w}}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$ , and the AWGN correlator on the whitened data coincides with the Mahalanobis correlator on the original data.

Proof

Distribution of $\widetilde{\mathbf{w}}$.

$\widetilde{\mathbf{w}} = \mathbf{L}^{-1}\mathbf{w}$ , so

$\mathrm{Cov}(\widetilde{\mathbf{w}}) = \mathbf{L}^{-1}\mathbf{C}_w\mathbf{L}^{-\mathsf{T}} = \mathbf{L}^{-1}\mathbf{L}\mathbf{L}^{\mathsf{T}}\mathbf{L}^{-\mathsf{T}} = \mathbf{I}.$

AWGN statistic on whitened data.

The unit-noise AWGN correlator is $\widetilde{T}(\widetilde{\mathbf{y}}) = \widetilde{\mathbf{s}}^{\mathsf{T}}\widetilde{\mathbf{y}}$ .

Unwind the transform.

$\widetilde{T} = (\mathbf{L}^{-1}\mathbf{s})^{\mathsf{T}}(\mathbf{L}^{-1}\mathbf{y}) = \mathbf{s}^{\mathsf{T}}\mathbf{L}^{-\mathsf{T}}\mathbf{L}^{-1}\mathbf{y} = \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y} = T(\mathbf{y}).$

Conclusion.

Detection in colored noise and detection in white noise after a pre-whitening filter give numerically identical decisions. The pre-whitened matched filter is therefore the Neyman--Pearson test for the colored-noise case. $\blacksquare$

Example: AR(1) Noise: Deflection Loss

Two observations $y_1, y_2$ are corrupted by noise with covariance $\mathbf{C}_w = \sigma^2\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ and the signal is $\mathbf{s} = (1,1)^{\mathsf{T}}$ . Compute the deflection coefficient of (i) the naive AWGN correlator that ignores correlation and (ii) the optimal Mahalanobis correlator, and compare as a function of $\rho$ .

Solution

Invert $\mathbf{C}_w$.

$\mathbf{C}_w^{-1} = \frac{1}{\sigma^2(1-\rho^2)}\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}.$

Optimal deflection.

$d_{\mathrm{opt}}^2 = \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s} = \frac{(1-\rho)+(1-\rho)}{\sigma^2(1-\rho^2)} = \frac{2(1-\rho)}{\sigma^2(1-\rho^2)} = \frac{2}{\sigma^2(1+\rho)}.$

Naive AWGN correlator deflection.

The mismatched statistic is $T_{\mathrm{naive}} = y_1 + y_2$ . $\mathbb{E}_1[T_{\mathrm{naive}}] - \mathbb{E}_0[T_{\mathrm{naive}}] = 2$ . Its variance under $\mathcal{H}_0$ is $\text{Var}(y_1+y_2) = 2\sigma^2(1+\rho)$ . Hence

$d_{\mathrm{naive}}^2 = \frac{4}{2\sigma^2(1+\rho)} = \frac{2}{\sigma^2(1+\rho)}.$

Observation.

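For $\mathbf{s} = (1,1)^{\mathsf{T}}$ the two coincide: $d_{\mathrm{opt}}^2 = d_{\mathrm{naive}}^2$. The signal lies along an eigenvector of $\mathbf{C}_w$ (eigenvalue $\sigma^2(1+\rho)$, the principal one when $\rho > 0$), so $\mathbf{C}_w^{-1}\mathbf{s}$ is parallel to $\mathbf{s}$ and whitening changes only the magnitude of the template, not its direction. The reader should verify that for $\mathbf{s} = (1,-1)^{\mathsf{T}}$, also an eigenvector, the mismatched detector is again optimal and achieves deflection $2/(\sigma^2(1-\rho))$. For $\rho > 0$ this is an improvement over the $\rho=0$ value $2/\sigma^2$: positively correlated noise largely cancels in the difference $y_1 - y_2$, actively helping detection. A genuine deflection loss appears only when $\mathbf{s}$ is not an eigenvector of $\mathbf{C}_w$.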
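The comparison in this example can be reproduced with a few lines of arithmetic. The sketch below assumes the same 2×2 covariance and uses the standard deflection of a linear statistic, (mean shift)$^2$ / variance; the function name `deflections` and the third test signal $\mathbf{s} = (1,0)^{\mathsf{T}}$ are additions for illustration and do not appear in the original example.

```python
def deflections(s, rho, sigma2=1.0):
    """Return (d_opt^2, d_naive^2) for C_w = sigma2 * [[1, rho], [rho, 1]]."""
    s1, s2 = s
    # Optimal detector: s^T C_w^{-1} s using the closed-form 2x2 inverse.
    d_opt = (s1 * s1 - 2.0 * rho * s1 * s2 + s2 * s2) / (sigma2 * (1.0 - rho * rho))
    # Mismatched correlator T = s^T y: mean shift s^T s, variance s^T C_w s.
    mean_shift = s1 * s1 + s2 * s2
    variance = sigma2 * (s1 * s1 + 2.0 * rho * s1 * s2 + s2 * s2)
    d_naive = mean_shift ** 2 / variance
    return d_opt, d_naive

rho = 0.5  # illustrative correlation
for s in [(1.0, 1.0), (1.0, -1.0), (1.0, 0.0)]:
    d_opt, d_naive = deflections(s, rho)
    print(s, d_opt, d_naive, d_naive / d_opt)
```

For the two signals treated in the example the ratio is 1. For $\mathbf{s} = (1,0)^{\mathsf{T}}$ the naive correlator retains only a fraction $1-\rho^2$ of the optimal deflection, because it ignores the second sample even though that sample carries information about the noise in the first.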

Example: The Mahalanobis Ellipsoid

Let $\mathbf{C} = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}$ . Describe the Mahalanobis unit sphere $\{\mathbf{v}: \|\mathbf{v}\|_{\mathbf{C}} = 1\}$ .

Solution

Compute $\mathbf{C}^{-1}$.

$\det\mathbf{C} = 12 - 4 = 8$ , so

$\mathbf{C}^{-1} = \frac{1}{8}\begin{pmatrix} 3 & -2 \\ -2 & 4 \end{pmatrix}.$

Write the defining equation.

$\mathbf{v}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v} = 1$ expands to

$\tfrac{3}{8}v_1^2 - \tfrac{1}{2}v_1 v_2 + \tfrac{1}{2}v_2^2 = 1.$

Diagonalise.

Eigenvalues of $\mathbf{C}$ are $\lambda_\pm = \tfrac{7}{2} \pm \tfrac{\sqrt{17}}{2}$ (approximately $5.56$ and $1.44$), giving semi-axis lengths $\sqrt{\lambda_+} \approx 2.36$ and $\sqrt{\lambda_-} \approx 1.20$. The Mahalanobis unit sphere is an ellipse with these semi-axis lengths, oriented along the eigenvectors of $\mathbf{C}$.
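The diagonalisation step can be verified with the closed-form eigenvalues of a symmetric 2×2 matrix. This is a minimal sketch; the helper name `sym2_eigenvalues` is hypothetical, and the orientation angle is an extra quantity computed here for illustration.

```python
import math

def sym2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean + radius, mean - radius

# C = [[4, 2], [2, 3]] from the example.
lam_plus, lam_minus = sym2_eigenvalues(4.0, 2.0, 3.0)
semi_axes = (math.sqrt(lam_plus), math.sqrt(lam_minus))

# Major-axis direction: eigenvector of C for lam_plus, from (4 - lam) v1 + 2 v2 = 0.
major_dir = (2.0, lam_plus - 4.0)
angle_deg = math.degrees(math.atan2(major_dir[1], major_dir[0]))

print(semi_axes, angle_deg)
```

Scaling the unit-length major-axis direction by the larger semi-axis length gives a point that satisfies the defining equation $\tfrac{3}{8}v_1^2 - \tfrac{1}{2}v_1 v_2 + \tfrac{1}{2}v_2^2 = 1$, which is a quick consistency check on both the inverse and the eigenvalues.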

Pre-Whitening a Colored Noise Process

Empirical PSD of an AR(1) noise process $w_k = \rho w_{k-1} + e_k$ before and after pre-whitening by $\mathbf{L}^{-1}$ . The whitened process has an approximately flat PSD, matching the ideal white reference.

Parameters: AR(1) correlation $\rho = 0.85$; process length 128; random seed 3.
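The experiment behind this plot can be approximated in a few lines. The sketch below is an assumption-laden stand-in, not the page's own code: it uses Python's standard `random` generator (so the realisation will differ from the plotted one even with the same seed), assumes unit-variance driving noise $e_k$ and a stationary start, and reports the lag-1 sample autocorrelation as a time-domain proxy for the PSD comparison. For this model $\mathbf{L}^{-1}$ is bidiagonal: the first row scales $w_0$ by $\sqrt{1-\rho^2}$ and every later row forms $w_k - \rho w_{k-1}$.

```python
import math
import random

rho = 0.85              # AR(1) correlation
n = 128                 # process length
rng = random.Random(3)  # seed 3; generator choice is an assumption

# Driving white noise e_k with unit variance.
e = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Stationary AR(1): draw w_0 from the stationary distribution, then recurse.
w = [e[0] / math.sqrt(1.0 - rho**2)]
for k in range(1, n):
    w.append(rho * w[k - 1] + e[k])

# Pre-whitening by L^{-1}, written out row by row (bidiagonal structure).
white = [math.sqrt(1.0 - rho**2) * w[0]]
white += [w[k] - rho * w[k - 1] for k in range(1, n)]

def lag1_corr(x):
    """Sample lag-1 autocorrelation coefficient."""
    m = sum(x) / len(x)
    num = sum((x[k] - m) * (x[k + 1] - m) for k in range(len(x) - 1))
    den = sum((xk - m) ** 2 for xk in x)
    return num / den

r_before = lag1_corr(w)
r_after = lag1_corr(white)
print(r_before, r_after)
```

With the true $\rho$ the whitened sequence reproduces the driving noise up to rounding. When $\rho$ is estimated from data instead, the residual correlation is small but not zero, which is why the whitened PSD is only approximately flat.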

Pre-Whitened Matched-Filter Detector

Complexity: $\mathcal{O}(n^3)$ for the Cholesky factorisation, $\mathcal{O}(n^2)$ per detection once $\mathbf{L}$ is precomputed.

INPUT:  y     (observation vector, length n)
        s     (signal template, length n)
        C_w   (noise covariance, n x n, positive definite)
        gamma (threshold)
OUTPUT: decision d in {0, 1}

1. L <- cholesky(C_w)          # C_w = L L^T
2. z <- forward_solve(L, s)    # z = L^{-1} s
3. u <- forward_solve(L, y)    # u = L^{-1} y
4. T <- dot(z, u)              # Mahalanobis correlator
5. if T > gamma: return 1
6. else: return 0
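The pseudocode above translates almost line for line into Python. The following is a dependency-free sketch, assuming dense matrices stored as lists of rows; the function names mirror the pseudocode, while the example covariance, template, observation and threshold are illustrative values chosen here. In production code a linear-algebra library would normally supply the factorisation and triangular solves.

```python
import math

def cholesky(C):
    """Lower-triangular L with C = L L^T (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = C[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(acc)
            else:
                L[i][j] = acc / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L x = b for lower-triangular L by forward substitution."""
    x = []
    for i, row in enumerate(L):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x

def prewhitened_detector(y, s, C_w, gamma):
    """Return (decision, T) with T = s^T C_w^{-1} y computed via whitening."""
    L = cholesky(C_w)                          # step 1: C_w = L L^T
    z = forward_solve(L, s)                    # step 2: z = L^{-1} s
    u = forward_solve(L, y)                    # step 3: u = L^{-1} y
    T = sum(zi * ui for zi, ui in zip(z, u))   # step 4: Mahalanobis correlator
    return (1 if T > gamma else 0), T          # steps 5-6: threshold test

# Illustrative inputs (assumptions, not from the text).
C_w = [[1.0, 0.5], [0.5, 1.0]]
s = [1.0, 1.0]
y = [0.8, 1.1]
decision, T = prewhitened_detector(y, s, C_w, gamma=1.0)
print(decision, T)
```

When many observations share one covariance, `cholesky(C_w)` and the whitened template `z` would be computed once and reused, leaving a single forward solve and a dot product per detection.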

In practice, $\mathbf{C}_w$ is estimated from secondary data and $\mathbf{L}$ is reused across many observations. When $\mathbf{C}_w$ is Toeplitz (stationary noise), Levinson--Durbin recursion replaces Cholesky with $\mathcal{O}(n^2)$ cost.
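As a rough illustration of the estimation step, the sketch below forms the sample covariance of a set of secondary snapshots. It assumes the snapshots are zero-mean and signal-free, which is an assumption made here rather than a statement from the text; the function name and the four hypothetical snapshots are likewise illustrative.

```python
def sample_covariance(snapshots):
    """Sample covariance (1/K) * sum_k x_k x_k^T of zero-mean snapshots."""
    K = len(snapshots)
    n = len(snapshots[0])
    return [[sum(x[i] * x[j] for x in snapshots) / K for j in range(n)]
            for i in range(n)]

# Four hypothetical noise-only snapshots of length 2.
secondary = [[1.0, 0.5], [-1.0, -0.5], [0.5, 1.0], [-0.5, -1.0]]
C_hat = sample_covariance(secondary)
print(C_hat)
```

The estimate is invertible only when the snapshots span all $n$ dimensions, which requires at least $n$ of them; with few snapshots the estimate is noisy and the resulting detector loses deflection relative to the one built from the true covariance.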

Common Mistake: Do Not Whiten the Signal Twice

Mistake:

"To get the whitened correlator, I compute $\widetilde{\mathbf{s}} = \mathbf{L}^{-1}\mathbf{s}$ and $\widetilde{\mathbf{y}} = \mathbf{L}^{-1}\mathbf{y}$ and form $\widetilde{\mathbf{s}}^{\mathsf{T}}\mathbf{C}_w^{-1}\widetilde{\mathbf{y}}$."

Correction:

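Either (a) whiten both vectors and form the ordinary inner product $\widetilde{\mathbf{s}}^{\mathsf{T}}\widetilde{\mathbf{y}}$, or (b) leave both vectors in the original coordinates and compute the Mahalanobis inner product $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y}$. Do not mix the two: applying $\mathbf{C}_w^{-1}$ to already-whitened vectors gives $\mathbf{s}^{\mathsf{T}}\mathbf{L}^{-\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{L}^{-1}\mathbf{y}$ (which reduces to $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-2}\mathbf{y}$ when the symmetric square root $\mathbf{C}_w^{1/2}$ is used as the whitener), a different statistic with reduced deflection.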
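The cost of the mistake can be quantified for a small case. The sketch below is illustrative only: it assumes a 2×2 covariance with unit variances and $\rho = 0.8$, a template $\mathbf{s} = (1,0)^{\mathsf{T}}$, and uses the deflection of a linear statistic $\mathbf{a}^{\mathsf{T}}\mathbf{y}$, namely $(\mathbf{a}^{\mathsf{T}}\mathbf{s})^2 / (\mathbf{a}^{\mathsf{T}}\mathbf{C}_w\mathbf{a})$. All names and values are assumptions introduced for this example.

```python
import math

rho = 0.8  # illustrative correlation
C = [[1.0, rho], [rho, 1.0]]
c = math.sqrt(1.0 - rho**2)
Linv = [[1.0, 0.0], [-rho / c, 1.0 / c]]  # inverse of the Cholesky factor
Cinv = [[1.0 / c**2, -rho / c**2], [-rho / c**2, 1.0 / c**2]]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def deflection(a, s):
    """Deflection of the linear statistic a^T y: (a^T s)^2 / (a^T C a)."""
    return dot(a, s) ** 2 / dot(a, matvec(C, a))

s = [1.0, 0.0]
LinvT = [list(col) for col in zip(*Linv)]

# Weights of the correct statistic s^T C^{-1} y.
a_correct = matvec(Cinv, s)
# Weights of the mixed statistic (L^{-1} s)^T C^{-1} (L^{-1} y).
a_mixed = matvec(LinvT, matvec(Cinv, matvec(Linv, s)))

d_correct = deflection(a_correct, s)
d_mixed = deflection(a_mixed, s)
print(d_correct, d_mixed)
```

The loss in this particular case is modest, but the mixed weights are no longer proportional to $\mathbf{C}_w^{-1}\mathbf{s}$, so the detector is strictly suboptimal and a threshold designed for the correct statistic no longer delivers the intended false-alarm rate.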

Common Mistake: Ill-Conditioned Covariance Matrices

Mistake:

"I inverted $\mathbf{C}_w$ and the detection probability exploded; my optimal detector is clearly broken."

Correction:

A near-singular $\mathbf{C}_w$ indicates that some direction in observation space carries essentially no noise. The optimal detector exploits this (and, on paper, achieves arbitrary deflection) but is catastrophically sensitive to modelling error: tiny perturbations in $\widehat{\mathbf{C}}_w$ move the null space around. Diagonal loading $\mathbf{C}_w \leftarrow \mathbf{C}_w + \epsilon\mathbf{I}$ is the standard remedy; it shrinks the deflection but bounds the sensitivity.
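The trade-off made by diagonal loading can be seen on a 2×2 example. The sketch below assumes the covariance $\sigma^2\begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}$ with $\rho$ close to 1 and the difference template $\mathbf{s} = (1,-1)^{\mathsf{T}}$; it reports the nominal deflection $\mathbf{s}^{\mathsf{T}}(\mathbf{C}_w + \epsilon\mathbf{I})^{-1}\mathbf{s}$ predicted by the loaded model and the condition number of the loaded matrix. Function names and the grid of $\epsilon$ values are illustrative assumptions.

```python
def loaded_deflection(rho, eps, s=(1.0, -1.0), sigma2=1.0):
    """Nominal deflection s^T (C_w + eps*I)^{-1} s for the 2x2 covariance."""
    a = sigma2 + eps   # loaded diagonal entry
    b = sigma2 * rho   # off-diagonal entry
    det = a * a - b * b
    s1, s2 = s
    return (a * s1 * s1 - 2.0 * b * s1 * s2 + a * s2 * s2) / det

def condition_number(rho, eps, sigma2=1.0):
    """Largest over smallest eigenvalue of C_w + eps*I, for 0 <= rho < 1."""
    return (sigma2 * (1.0 + rho) + eps) / (sigma2 * (1.0 - rho) + eps)

rho = 0.999  # nearly singular covariance (illustrative)
results = []
for eps in (0.0, 1e-3, 1e-2, 1e-1):
    results.append((eps, loaded_deflection(rho, eps), condition_number(rho, eps)))
    print(results[-1])
```

For this template the deflection equals $2/(\sigma^2(1-\rho)+\epsilon)$, so even a small $\epsilon$ removes most of the on-paper gain while reducing the condition number by a similar factor. Choosing $\epsilon$ is therefore a judgment about how far the estimated covariance can be trusted.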

Quick Check

The Mahalanobis squared distance $\|\mathbf{v}\|_{\mathbf{C}}^2$ equals:

$\mathbf{v}^{\mathsf{T}}\mathbf{v}$

$\mathbf{v}^{\mathsf{T}}\mathbf{C}\mathbf{v}$

$\mathbf{v}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v}$

$\|\mathbf{v}\|^2 / \det(\mathbf{C})$

Answer: $\mathbf{v}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v}$. This is the squared Euclidean norm of the whitened vector $\mathbf{L}^{-1}\mathbf{v}$.

White vs. Colored Gaussian Detection

Quantity	White ( $\mathbf{C}_w=\sigma^2\mathbf{I}$ )	Colored ( $\mathbf{C}_w$ )
Test statistic	$\mathbf{s}^{\mathsf{T}}\mathbf{y}$	$\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y}$
Deflection $d^2$	$E_s/\sigma^2$	$\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}$
Geometry	Sphere	Ellipsoid
Pre-processing	None	Cholesky $\mathbf{L}^{-1}$
Complexity (per detection)	$\mathcal{O}(n)$	$\mathcal{O}(n^2)$ after $\mathcal{O}(n^3)$ setup

Historical Note: Mahalanobis and the Whitening Transform

1930s--1950s

Prasanta Chandra Mahalanobis introduced the distance that bears his name in a 1936 paper in the Proceedings of the National Institute of Sciences of India, motivated by the anthropometric classification of Indian tribal groups. His insight, that one should measure distance relative to the covariance of the background population, anticipated the engineering whitening transform by two decades. The connection to Gaussian detection was made rigorous in the statistical signal processing literature of the 1950s and was later recast in filtering terms by Kailath's work on innovations representations.

⚠️Engineering Note

Space-Time Adaptive Processing (STAP)

Airborne radars observe ground clutter whose Doppler frequency is a linear function of spatial frequency (the sine of the arrival angle, for a side-looking array), so clutter lives on a ridge in the angle-Doppler plane. A STAP processor estimates the clutter-plus-noise covariance $\widehat{\mathbf{C}}_w$ from secondary range cells and then applies the pre-whitened matched filter weights $\widehat{\mathbf{C}}_w^{-1}\mathbf{s}(f_d,\theta)$ for each Doppler--angle hypothesis. The approach can yield 30+ dB of clutter suppression relative to a per-channel matched filter. The price: if the covariance matrix has dimension $N_pN_a$ (pulses $\times$ antennas), the required secondary-data support scales as $\mathcal{O}(N_pN_a)$; the Reed--Mallett--Brennan rule calls for roughly $2N_pN_a$ homogeneous snapshots to keep the average SNR loss near 3 dB.

Practical Constraints

• Secondary-data homogeneity assumption
• Clutter-Doppler aliasing when PRF < clutter Doppler bandwidth
• Computational budget: $\mathcal{O}((N_pN_a)^3)$ per CPI

📋 Ref: MIT Lincoln Laboratory STAP technical reports (Ward, 1994)

Why This Matters: Interference Whitening in MIMO Receivers

A 5G NR MIMO receiver sees a colored noise-plus-interference vector at its $N_r$ antennas. The LMMSE-IRC (interference rejection combining) receiver estimates $\mathbf{C}_{w+i}$ from demodulation reference signals and applies exactly the Mahalanobis detector of the theorem "LRT in Colored Gaussian Noise" to each resource element. The spatial covariance structure makes inter-cell interference predictable and defeats it with the same mathematics introduced in 1950s radar.

Whitening transform

A linear map that turns a coloured Gaussian vector with covariance $\mathbf{C}$ into an uncorrelated Gaussian vector with identity covariance. Canonically $\mathbf{L}^{-1}$ where $\mathbf{C}=\mathbf{L}\mathbf{L}^{\mathsf{T}}$ .

Mahalanobis distance

The distance $\|\mathbf{u}-\mathbf{v}\|_{\mathbf{C}}$ between two points measured in the metric induced by $\mathbf{C}^{-1}$ . Equal to the Euclidean distance between their whitened images.
