Colored Gaussian Noise and Pre-Whitening

Real Noise Is Colored

The AWGN assumption is an idealisation. Amplifier noise is flat over receiver bandwidth but coloured by filter frequency responses; antenna arrays see co-channel interference with strong spatial structure; radar targets sit in clutter with long correlation times. Whenever the noise covariance is not a scaled identity, the correlator of Section 2.1 ceases to be optimal. A single linear change of coordinates --- pre-whitening --- restores the AWGN machinery.

Definition:

Detection in Colored Gaussian Noise

Let $\mathbf{C}_w \in \mathbb{R}^{n\times n}$ be a positive-definite covariance matrix. Consider

$$\mathcal{H}_0: \mathbf{y} = \mathbf{w}, \qquad \mathcal{H}_1: \mathbf{y} = \mathbf{s} + \mathbf{w},$$

with $\mathbf{w}\sim\mathcal{N}(\mathbf{0}, \mathbf{C}_w)$. This is the known-signal detection problem in colored Gaussian noise.

Definition:

Whitening Transform

Let $\mathbf{C}_w = \mathbf{L}\mathbf{L}^{\mathsf{T}}$ be the Cholesky factorisation. The linear map

$$\mathbf{y} \mapsto \widetilde{\mathbf{y}} = \mathbf{L}^{-1}\mathbf{y}$$

is called the whitening transform. Applied to the colored-noise vector $\mathbf{w}$, it yields $\mathbf{L}^{-1}\mathbf{w}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$.

Any square factor $\mathbf{M}$ with $\mathbf{M}\mathbf{M}^{\mathsf{T}} = \mathbf{C}_w$ (e.g., $\mathbf{M} = \mathbf{U}\boldsymbol{\Lambda}^{1/2}$ from the eigendecomposition) whitens equally well through $\mathbf{M}^{-1}$. Cholesky is the canonical triangular choice.
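As a numerical illustration of the two factorisations, the following minimal sketch whitens simulated colored noise with both the Cholesky factor and the eigendecomposition factor. The 3x3 covariance, the sample count, and all variable names are assumptions chosen for the example, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3x3 positive-definite covariance, chosen only for illustration.
C_w = np.array([[4.0, 2.0, 0.6],
                [2.0, 3.0, 1.0],
                [0.6, 1.0, 2.0]])

# Lower-triangular Cholesky factor: C_w = L @ L.T
L = np.linalg.cholesky(C_w)

# Alternative square factor from the eigendecomposition: M = U Lambda^{1/2}
eigvals, U = np.linalg.eigh(C_w)
M = U * np.sqrt(eigvals)          # scales column i of U by sqrt(lambda_i)

# Draw colored noise samples (one per column), then whiten them.
w = L @ rng.standard_normal((3, 200_000))
w_chol = np.linalg.solve(L, w)    # L^{-1} w without forming the inverse
w_eig = np.linalg.solve(M, w)     # M^{-1} w

# Both sample covariances should be close to the identity.
cov_chol = np.cov(w_chol)
cov_eig = np.cov(w_eig)
print(np.round(cov_chol, 2))
print(np.round(cov_eig, 2))
```

Solving the linear system rather than forming $\mathbf{L}^{-1}$ explicitly is a common numerical habit; either approach gives the same whitened samples up to rounding.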

Definition:

Mahalanobis Norm

For positive-definite $\mathbf{C}$, the Mahalanobis norm of a vector $\mathbf{v}$ is

$$\|\mathbf{v}\|_{\mathbf{C}} \;=\; \sqrt{\mathbf{v}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v}},$$

and the Mahalanobis inner product of $\mathbf{u},\mathbf{v}$ is $\langle\mathbf{u},\mathbf{v}\rangle_{\mathbf{C}} = \mathbf{u}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v}$.

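The Mahalanobis norm is the Euclidean norm in the whitened coordinate system: $\|\mathbf{v}\|_{\mathbf{C}} = \|\mathbf{L}^{-1}\mathbf{v}\|_2$ for any factor $\mathbf{C} = \mathbf{L}\mathbf{L}^{\mathsf{T}}$. Geometrically, the locus $\|\mathbf{v}-\boldsymbol{\mu}\|_{\mathbf{C}} = r$ is the ellipsoid centred at $\boldsymbol{\mu}$ whose principal axes are aligned with the eigenvectors of $\mathbf{C}$, with semi-axis lengths $r\sqrt{\lambda_i}$ for the eigenvalues $\lambda_i$ of $\mathbf{C}$.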

Theorem: LRT in Colored Gaussian Noise

For the detection problem of the definition "Detection in Colored Gaussian Noise", the log-likelihood ratio is

$$\ell(\mathbf{y}) \;=\; \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y} \;-\; \tfrac{1}{2}\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}.$$

The Neyman--Pearson test is equivalent to thresholding the pre-whitened correlator

$$T(\mathbf{y}) \;=\; \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y},$$

and its deflection coefficient is the squared Mahalanobis norm

$$d^2 \;=\; \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s} \;=\; \|\mathbf{s}\|_{\mathbf{C}_w}^2.$$

The inverse covariance $\mathbf{C}_w^{-1}$ assigns low weight to directions where the noise is strong and high weight to directions where the noise is weak: a coloured-noise generalisation of inverse-variance weighting.
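The deflection formula can be checked numerically. The sketch below assumes the usual definition of deflection for a linear statistic $T = \mathbf{h}^{\mathsf{T}}\mathbf{y}$, namely $(\mathrm{E}[T\mid\mathcal{H}_1]-\mathrm{E}[T\mid\mathcal{H}_0])^2/\mathrm{Var}[T\mid\mathcal{H}_0] = (\mathbf{h}^{\mathsf{T}}\mathbf{s})^2/(\mathbf{h}^{\mathsf{T}}\mathbf{C}_w\mathbf{h})$. The helper `deflection`, the 2x2 covariance, and the template are illustrative assumptions.

```python
import numpy as np

def deflection(h, s, C_w):
    """Deflection of the linear statistic T(y) = h^T y:
    (E[T|H1] - E[T|H0])^2 / Var[T|H0] = (h^T s)^2 / (h^T C_w h)."""
    return float((h @ s) ** 2 / (h @ C_w @ h))

# Hypothetical covariance and signal template, for illustration only.
C_w = np.array([[2.0, 0.9],
                [0.9, 1.0]])
s = np.array([1.0, 0.5])

h_opt = np.linalg.solve(C_w, s)      # weight vector C_w^{-1} s
d2_opt = deflection(h_opt, s, C_w)   # deflection of T(y) = s^T C_w^{-1} y
d2_formula = float(s @ h_opt)        # closed form s^T C_w^{-1} s

# Monte Carlo check: the same noise draws are used under both hypotheses,
# so only the variance in the denominator is estimated from samples.
rng = np.random.default_rng(1)
L = np.linalg.cholesky(C_w)
w = L @ rng.standard_normal((2, 200_000))
T0 = h_opt @ w                       # statistic under H0
T1 = h_opt @ (s[:, None] + w)        # statistic under H1
d2_mc = (T1.mean() - T0.mean()) ** 2 / T0.var()

print(d2_opt, d2_formula, d2_mc)
```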

Theorem: Whitening Yields an Equivalent AWGN Problem

Let $\widetilde{\mathbf{y}} = \mathbf{L}^{-1}\mathbf{y}$, $\widetilde{\mathbf{s}} = \mathbf{L}^{-1}\mathbf{s}$ where $\mathbf{C}_w = \mathbf{L}\mathbf{L}^{\mathsf{T}}$. Then the transformed problem

$$\widetilde{\mathcal{H}}_0: \widetilde{\mathbf{y}} = \widetilde{\mathbf{w}}, \qquad \widetilde{\mathcal{H}}_1: \widetilde{\mathbf{y}} = \widetilde{\mathbf{s}} + \widetilde{\mathbf{w}},$$

has $\widetilde{\mathbf{w}}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$, and the AWGN correlator on the whitened data coincides with the Mahalanobis correlator on the original data.
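The equivalence follows from $\widetilde{\mathbf{s}}^{\mathsf{T}}\widetilde{\mathbf{y}} = \mathbf{s}^{\mathsf{T}}\mathbf{L}^{-\mathsf{T}}\mathbf{L}^{-1}\mathbf{y} = \mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y}$. A short numerical sketch, with a randomly generated covariance and template that are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Hypothetical positive-definite covariance and signal template.
A = rng.standard_normal((n, n))
C_w = A @ A.T + n * np.eye(n)
s = rng.standard_normal(n)

L = np.linalg.cholesky(C_w)                # C_w = L L^T
y = s + L @ rng.standard_normal(n)         # one observation drawn under H1

s_tilde = np.linalg.solve(L, s)            # whitened template
y_tilde = np.linalg.solve(L, y)            # whitened observation

T_whitened = s_tilde @ y_tilde             # AWGN correlator on whitened data
T_mahalanobis = s @ np.linalg.solve(C_w, y)  # s^T C_w^{-1} y on original data

print(T_whitened, T_mahalanobis)
```

The two numbers agree up to floating-point rounding, and the same identity with $\mathbf{y}$ replaced by $\mathbf{s}$ gives $\|\widetilde{\mathbf{s}}\|^2 = \|\mathbf{s}\|_{\mathbf{C}_w}^2$.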

Example: AR(1) Noise: Deflection Loss

Two observations $y_1, y_2$ are corrupted by noise with covariance $\mathbf{C}_w = \sigma^2\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ and the signal is $\mathbf{s} = (1,1)^{\mathsf{T}}$. Compute the deflection coefficient of (i) the naive AWGN correlator that ignores correlation and (ii) the optimal Mahalanobis correlator, and compare as a function of $\rho$.
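A numerical sketch of this comparison follows, assuming the deflection of a linear statistic $\mathbf{h}^{\mathsf{T}}\mathbf{y}$ is $(\mathbf{h}^{\mathsf{T}}\mathbf{s})^2/(\mathbf{h}^{\mathsf{T}}\mathbf{C}_w\mathbf{h})$. For the stated template $\mathbf{s} = (1,1)^{\mathsf{T}}$, which is an eigenvector of $\mathbf{C}_w$ with eigenvalue $\sigma^2(1+\rho)$, both correlators give $d^2 = 2/(\sigma^2(1+\rho))$, so there is no loss for this particular signal. The second template $(1,0)^{\mathsf{T}}$ is an assumption added here only to show a case where the naive correlator does lose deflection ($1/\sigma^2$ against $1/(\sigma^2(1-\rho^2))$). The helper names are hypothetical.

```python
import numpy as np

def deflection(h, s, C_w):
    """Deflection (h^T s)^2 / (h^T C_w h) of the linear statistic h^T y."""
    return float((h @ s) ** 2 / (h @ C_w @ h))

def compare(s, rho, sigma2=1.0):
    """Return (naive, optimal) deflection for the 2x2 covariance of the example."""
    C_w = sigma2 * np.array([[1.0, rho],
                             [rho, 1.0]])
    d2_naive = deflection(s, s, C_w)              # correlator s^T y, ignores correlation
    d2_opt = float(s @ np.linalg.solve(C_w, s))   # s^T C_w^{-1} s
    return d2_naive, d2_opt

s_example = np.array([1.0, 1.0])    # template from the example
s_contrast = np.array([1.0, 0.0])   # assumed template, added for contrast

for rho in (0.0, 0.5, 0.9):
    print(rho, compare(s_example, rho), compare(s_contrast, rho))
```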

Example: The Mahalanobis Ellipsoid

Let $\mathbf{C} = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}$. Describe the Mahalanobis unit sphere $\{\mathbf{v}: \|\mathbf{v}\|_{\mathbf{C}} = 1\}$.
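One way to work this example is through the eigendecomposition. The eigenvalues of $\mathbf{C}$ are $(7\pm\sqrt{17})/2 \approx 5.56$ and $1.44$, so the unit sphere is an ellipse with semi-axes $\sqrt{\lambda_i} \approx 2.36$ and $1.20$ along the corresponding eigenvectors. The sketch below computes these quantities numerically; the variable names are illustrative.

```python
import numpy as np

C = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(C)
semi_axes = np.sqrt(eigvals)       # semi-axis lengths of {v : ||v||_C = 1}

for lam, a, u in zip(eigvals, semi_axes, eigvecs.T):
    print(f"eigenvalue {lam:.4f}, semi-axis {a:.4f}, direction {np.round(u, 4)}")

# Each semi-axis endpoint a_i * u_i should have unit Mahalanobis norm.
endpoint_norms = [float(np.sqrt((a * u) @ np.linalg.solve(C, a * u)))
                  for a, u in zip(semi_axes, eigvecs.T)]
print(endpoint_norms)
```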

Pre-Whitening a Colored Noise Process

Empirical PSD of an AR(1) noise process $w_k = \rho w_{k-1} + e_k$ before and after pre-whitening by $\mathbf{L}^{-1}$. The whitened process has an approximately flat PSD, matching the ideal white reference.
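The experiment described in this caption can be reproduced roughly as follows. This is a sketch under stated assumptions: unit-variance innovations $e_k$, a process started from its stationary distribution, and illustrative values for `rho`, `n`, and the number of realisations `trials`. The averaged periodogram stands in for whatever PSD estimator the original figure used.

```python
import numpy as np

rng = np.random.default_rng(3)
rho, n, trials = 0.85, 128, 2000    # assumed illustrative values

# Stationary AR(1) covariance with unit-variance innovations:
# C_w[i, j] = rho^{|i-j|} / (1 - rho^2)
idx = np.arange(n)
C_w = rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho ** 2)
L = np.linalg.cholesky(C_w)

# Simulate w_k = rho * w_{k-1} + e_k, one realisation per column.
e = rng.standard_normal((n, trials))
w = np.empty((n, trials))
w[0] = e[0] / np.sqrt(1.0 - rho ** 2)   # stationary start
for k in range(1, n):
    w[k] = rho * w[k - 1] + e[k]

w_white = np.linalg.solve(L, w)         # apply L^{-1} to every realisation

def avg_periodogram(x):
    """Periodogram |FFT|^2 / n, averaged over realisations (columns)."""
    return (np.abs(np.fft.rfft(x, axis=0)) ** 2).mean(axis=1) / x.shape[0]

psd_colored = avg_periodogram(w)
psd_white = avg_periodogram(w_white)
print("colored PSD max/min: ", psd_colored.max() / psd_colored.min())
print("whitened PSD max/min:", psd_white.max() / psd_white.min())
```

With the exact AR(1) covariance, the Cholesky whitener recovers the driving innovations $e_k$ up to rounding, which is why the whitened spectrum is flat.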


Pre-Whitened Matched-Filter Detector

Complexity: $\mathcal{O}(n^3)$ for the Cholesky factorisation, $\mathcal{O}(n^2)$ per detection once $\mathbf{L}$ is precomputed.

INPUT:  y      (observation vector, length n)
        s      (signal template, length n)
        C_w    (noise covariance, n x n, positive definite)
        gamma  (threshold)
OUTPUT: decision d in {0, 1}

  1. L <- cholesky(C_w)          # C_w = L L^T
  2. z <- forward_solve(L, s)    # z = L^{-1} s
  3. u <- forward_solve(L, y)    # u = L^{-1} y
  4. T <- dot(z, u)              # Mahalanobis correlator
  5. if T > gamma: return 1
  6. else: return 0
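The listing above can be sketched in Python as follows. This is an illustrative implementation, not a reference one: `forward_solve` is written out as an explicit loop to make the $\mathcal{O}(n^2)$ cost visible, the function also returns the statistic for inspection, and the covariance, template, and threshold in the usage lines are assumptions.

```python
import numpy as np

def forward_solve(L, b):
    """Solve L x = b for lower-triangular L by forward substitution, O(n^2)."""
    n = L.shape[0]
    x = np.zeros(n)
    for i in range(n):
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

def prewhitened_detector(y, s, C_w, gamma):
    """Return (decision, T), where decision is 1 if T > gamma and 0 otherwise."""
    L = np.linalg.cholesky(C_w)     # step 1: C_w = L L^T
    z = forward_solve(L, s)         # step 2: z = L^{-1} s
    u = forward_solve(L, y)         # step 3: u = L^{-1} y
    T = float(z @ u)                # step 4: equals s^T C_w^{-1} y
    return int(T > gamma), T        # steps 5-6

# Hypothetical usage: threshold at half the deflection (zero log-likelihood ratio).
C_w = np.array([[2.0, 0.9],
                [0.9, 1.0]])
s = np.array([1.0, 0.5])
gamma = 0.5 * float(s @ np.linalg.solve(C_w, s))

d_signal, T_signal = prewhitened_detector(s, s, C_w, gamma)           # noiseless signal
d_noise, T_noise = prewhitened_detector(np.zeros(2), s, C_w, gamma)   # all-zero input
print(d_signal, T_signal, d_noise, T_noise)
```

When the template is fixed, one possible refinement is to precompute the weight vector $\mathbf{h} = \mathbf{C}_w^{-1}\mathbf{s}$ once, after which each detection is a single inner product $\mathbf{h}^{\mathsf{T}}\mathbf{y}$.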

In practice, $\mathbf{C}_w$ is estimated from secondary data and $\mathbf{L}$ is reused across many observations. When $\mathbf{C}_w$ is Toeplitz (stationary noise), Levinson--Durbin recursion replaces Cholesky with $\mathcal{O}(n^2)$ cost.

Common Mistake: Do Not Whiten the Signal Twice

Mistake:

"To get the whitened correlator, I compute $\widetilde{\mathbf{s}} = \mathbf{L}^{-1}\mathbf{s}$ and $\widetilde{\mathbf{y}} = \mathbf{L}^{-1}\mathbf{y}$ and form $\widetilde{\mathbf{s}}^{\mathsf{T}}\mathbf{C}_w^{-1}\widetilde{\mathbf{y}}$."

Correction:

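Either (a) whiten both vectors and form the ordinary inner product $\widetilde{\mathbf{s}}^{\mathsf{T}}\widetilde{\mathbf{y}}$, or (b) leave both vectors in the original coordinates and compute the Mahalanobis inner product $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y}$. Do not mix the two: applying $\mathbf{C}_w^{-1}$ to already-whitened vectors gives $\mathbf{s}^{\mathsf{T}}\mathbf{L}^{-\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{L}^{-1}\mathbf{y}$, which reduces to $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-2}\mathbf{y}$ when the symmetric square root $\mathbf{L} = \mathbf{C}_w^{1/2}$ is used. Either way it is a different statistic with, in general, reduced deflection.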
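The loss from double whitening can be seen numerically. The sketch below writes the mistaken statistic as a linear statistic $\mathbf{h}^{\mathsf{T}}\mathbf{y}$ with $\mathbf{h} = \mathbf{L}^{-\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{L}^{-1}\mathbf{s}$ and compares its deflection $(\mathbf{h}^{\mathsf{T}}\mathbf{s})^2/(\mathbf{h}^{\mathsf{T}}\mathbf{C}_w\mathbf{h})$ with the optimal value. The covariance and template are assumptions chosen for illustration.

```python
import numpy as np

def deflection(h, s, C_w):
    """Deflection (h^T s)^2 / (h^T C_w h) of the linear statistic h^T y."""
    return float((h @ s) ** 2 / (h @ C_w @ h))

# Hypothetical covariance and template.
C_w = np.array([[4.0, 2.0],
                [2.0, 3.0]])
s = np.array([1.0, 0.0])

L = np.linalg.cholesky(C_w)
s_tilde = np.linalg.solve(L, s)          # whitened template L^{-1} s

# Correct statistic, options (a) and (b): weight vector h = C_w^{-1} s.
h_correct = np.linalg.solve(C_w, s)

# Mistaken statistic: s_tilde^T C_w^{-1} y_tilde = (L^{-T} C_w^{-1} L^{-1} s)^T y.
h_double = np.linalg.solve(L.T, np.linalg.solve(C_w, s_tilde))

d2_correct = deflection(h_correct, s, C_w)
d2_double = deflection(h_double, s, C_w)
print(d2_correct, d2_double)
```

By the Cauchy--Schwarz inequality in the whitened coordinates, no linear statistic can exceed $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}$, so the double-whitened value can only be equal or smaller.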

Common Mistake: Ill-Conditioned Covariance Matrices

Mistake:

"I inverted $\mathbf{C}_w$ and the detection probability exploded, so my optimal detector is clearly broken."

Correction:

A near-singular $\mathbf{C}_w$ indicates that some direction in observation space carries essentially no noise. The optimal detector exploits this (and, on paper, achieves arbitrary deflection) but is catastrophically sensitive to modelling error: tiny perturbations in $\widehat{\mathbf{C}}_w$ move the null space around. Diagonal loading $\mathbf{C}_w \leftarrow \mathbf{C}_w + \epsilon\mathbf{I}$ is the standard remedy; it shrinks the deflection but bounds the sensitivity.
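A small sketch of the trade-off follows, assuming a hypothetical 2x2 covariance whose $(1,-1)$ direction carries almost no noise and a loading level `eps` picked arbitrarily. The "nominal" deflection is the value $\mathbf{s}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{s}$ computed as if the matrix were known exactly.

```python
import numpy as np

def nominal_deflection(C, s):
    """s^T C^{-1} s, the deflection if C were the exact noise covariance."""
    return float(s @ np.linalg.solve(C, s))

# Hypothetical nearly singular covariance: almost no noise along (1, -1).
delta = 1e-8
C_w = np.array([[1.0, 1.0 - delta],
                [1.0 - delta, 1.0]])
s = np.array([1.0, -0.9])

eps = 1e-2                               # loading level, a tuning assumption
C_loaded = C_w + eps * np.eye(2)         # diagonal loading

cond_raw, d2_raw = np.linalg.cond(C_w), nominal_deflection(C_w, s)
cond_loaded, d2_loaded = np.linalg.cond(C_loaded), nominal_deflection(C_loaded, s)

print("unloaded: cond =", cond_raw, " d^2 =", d2_raw)
print("loaded:   cond =", cond_loaded, " d^2 =", d2_loaded)
```

Loading caps the smallest eigenvalue at roughly `eps`, which lowers both the condition number and the nominal deflection; how to pick `eps` in practice depends on the expected estimation error and is not addressed here.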

Quick Check

The Mahalanobis squared distance $\|\mathbf{v}\|_{\mathbf{C}}^2$ equals:

(a) $\mathbf{v}^{\mathsf{T}}\mathbf{v}$

(b) $\mathbf{v}^{\mathsf{T}}\mathbf{C}\mathbf{v}$

(c) $\mathbf{v}^{\mathsf{T}}\mathbf{C}^{-1}\mathbf{v}$

(d) $\|\mathbf{v}\|^2 / \det(\mathbf{C})$

White vs. Colored Gaussian Detection

| Quantity | White ($\mathbf{C}_w=\sigma^2\mathbf{I}$) | Colored ($\mathbf{C}_w$) |
|---|---|---|
| Test statistic | $\mathbf{s}^{\mathsf{T}}\mathbf{y}$ | $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{y}$ |
| Deflection $d^2$ | $E_s/\sigma^2$ | $\mathbf{s}^{\mathsf{T}}\mathbf{C}_w^{-1}\mathbf{s}$ |
| Geometry | Sphere | Ellipsoid |
| Pre-processing | None | Cholesky $\mathbf{L}^{-1}$ |
| Complexity (per detection) | $\mathcal{O}(n)$ | $\mathcal{O}(n^2)$ after $\mathcal{O}(n^3)$ setup |

Historical Note: Mahalanobis and the Whitening Transform

1930s--1960s

Prasanta Chandra Mahalanobis introduced the distance that bears his name in a 1936 paper in the Proceedings of the National Institute of Sciences of India, motivated by the anthropometric classification of Indian tribal groups. His insight, that one should measure distance relative to the covariance of the background population, anticipated the engineering whitening transform by two decades. The connection to Gaussian detection was made rigorous in the statistical signal processing literature of the 1950s and 1960s, notably by Kailath's work on innovations representations.

⚠️Engineering Note

Space-Time Adaptive Processing (STAP)

Airborne radars observe ground clutter whose Doppler spectrum is a linear function of antenna angle, so the clutter lives on a ridge in the angle-Doppler plane. A STAP processor estimates the clutter-plus-noise covariance $\widehat{\mathbf{C}}_w$ from secondary range cells and then applies the pre-whitened matched filter $\widehat{\mathbf{C}}_w^{-1}\mathbf{s}(f_d,\theta)$ for each Doppler--angle hypothesis. The approach can yield 30+ dB of clutter suppression relative to a per-channel matched filter. The price: if the covariance matrix has dimension $N_pN_a$ (pulses $\times$ antennas), the required secondary-data support scales as $\mathcal{O}(N_pN_a)$; the Reed--Mallett--Brennan rule calls for roughly $2N_pN_a$ homogeneous samples to keep the average SNR loss near 3 dB.

Practical Constraints
  • Secondary-data homogeneity assumption

  • Clutter-Doppler aliasing when PRF < clutter bandwidth

  • Computational budget: $\mathcal{O}((N_pN_a)^3)$ per CPI

📋 Ref: MIT Lincoln Laboratory STAP technical reports (Ward, 1994)

Why This Matters: Interference Whitening in MIMO Receivers

A 5G NR MIMO receiver sees a colored noise-plus-interference vector at its $N_r$ antennas. The LMMSE-IRC (interference rejection combining) receiver estimates $\mathbf{C}_{w+i}$ from demodulation reference signals and applies exactly the Mahalanobis detector of the theorem "LRT in Colored Gaussian Noise" to each resource element. The spatial covariance structure makes inter-cell interference predictable and defeats it with the same mathematics introduced in 1950s radar.

Whitening transform

A linear map that turns a coloured Gaussian vector with covariance $\mathbf{C}$ into an uncorrelated Gaussian vector with identity covariance. Canonically $\mathbf{L}^{-1}$ where $\mathbf{C}=\mathbf{L}\mathbf{L}^{\mathsf{T}}$.

Related: Cholesky factorisation, Mahalanobis distance

Mahalanobis distance

The distance $\|\mathbf{u}-\mathbf{v}\|_{\mathbf{C}}$ between two points measured in the metric induced by $\mathbf{C}^{-1}$. Equal to the Euclidean distance between their whitened images.

Related: Whitening Transform, Covariance matrix