The Wiener-Hopf Equation and the Orthogonality Principle
Why the Wiener Filter Still Matters
The Wiener filter was designed in the early 1940s as a solution to the anti-aircraft fire-control problem: given a noisy radar track of an incoming aircraft, produce the best linear estimate of its future position so that the shell and the aircraft arrive at the same point at the same time. Eighty years later, every channel equalizer in a modern wireless modem, every beamformer in a microphone array, every noise-cancelling headphone, and the steady-state limit of every Kalman filter are, at their core, Wiener filters. The filter is a working engineer's most faithful companion.
The operational question is always the same. We observe a signal $y[n]$ that carries information about a desired quantity $x[n]$, corrupted by noise and by the channel. We want to produce the best linear estimate of $x[n]$ from the observation sequence. "Best" means minimum mean-square error (MMSE), and "linear" means we restrict ourselves to convolution with a filter. Two questions then arise: what filter coefficients should we use, and do we allow the filter to look into the future of the observation (non-causal) or only into its past (causal)?
This chapter answers both questions. The answers are classical, but the geometry behind them (orthogonality of the error to the observations, spectral factorization as a Cholesky decomposition in the frequency domain, innovations as a whitened version of the observation) is what you will carry into every later chapter of this book.
Definition: Jointly WSS Processes
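Two zero-mean discrete-time random processes $x[n]$ and $y[n]$ are jointly wide-sense stationary (jointly WSS) if (i) each is individually WSS, with autocorrelations $r_{xx}[k] = E\big[x[n+k]\,x^*[n]\big]$ and $r_{yy}[k] = E\big[y[n+k]\,y^*[n]\big]$ depending only on the lag $k$, and (ii) the cross-correlation $r_{xy}[k] = E\big[x[n+k]\,y^*[n]\big]$ depends only on $k$ and not on $n$.
The power spectral densities and cross-PSD are the discrete-time Fourier transforms of the correlation sequences: $$S_{xx}(\omega) = \sum_{k=-\infty}^{\infty} r_{xx}[k]\, e^{-j\omega k}, \qquad S_{yy}(\omega) = \sum_{k=-\infty}^{\infty} r_{yy}[k]\, e^{-j\omega k}, \qquad S_{xy}(\omega) = \sum_{k=-\infty}^{\infty} r_{xy}[k]\, e^{-j\omega k}$$ for $\omega \in [-\pi, \pi)$.
By the Wiener-Khinchin theorem, $S_{xx}(\omega)$ and $S_{yy}(\omega)$ are real and non-negative. The cross-PSD is in general complex; it satisfies the Hermitian relation $S_{yx}(\omega) = S_{xy}^*(\omega)$.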
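As a quick numerical illustration of the statement that an auto-PSD is real and non-negative, the sketch below evaluates a truncated DTFT of an assumed autocorrelation sequence $r[k] = a^{|k|}$ and compares it with the known closed form $(1-a^2)/(1 - 2a\cos\omega + a^2)$. The sequence, the value `a = 0.8`, and the truncation lag `K` are illustrative assumptions, not quantities taken from the text.

```python
import numpy as np

a = 0.8                      # assumed correlation coefficient (illustrative)
K = 200                      # truncation lag; a**K is negligible
k = np.arange(-K, K + 1)
r = a ** np.abs(k)           # assumed autocorrelation r[k] = a^{|k|}

omega = np.linspace(-np.pi, np.pi, 512, endpoint=False)

# Truncated DTFT: S(omega) = sum_k r[k] exp(-j omega k)
S = np.exp(-1j * np.outer(omega, k)) @ r

# Closed-form PSD of the same sequence
S_closed = (1 - a**2) / (1 - 2 * a * np.cos(omega) + a**2)

print("max |imag part|  :", np.max(np.abs(S.imag)))
print("min of real part :", S.real.min())
print("max deviation    :", np.max(np.abs(S.real - S_closed)))
```

The imaginary part vanishes up to rounding because $r[k]$ is real and even, and the real part stays strictly positive over the whole frequency grid.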
Definition: The Wiener Filtering Problem
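Let $(x[n], y[n])$ be jointly WSS with known second-order statistics $(r_{xx}, r_{yy}, r_{xy})$. Produce a linear estimate $\hat{x}[n] = \sum_{k \in \mathcal{S}} h[k]\, y[n-k]$ that minimizes the mean-square error $E\big[|x[n] - \hat{x}[n]|^2\big]$ over all choices of the filter coefficients $\{h[k]\}_{k \in \mathcal{S}}$. Three canonical choices for the index set $\mathcal{S}$ are: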
| Index set $\mathcal{S}$ | Name | Uses |
|---|---|---|
| $\mathbb{Z}$ (all lags) | Non-causal Wiener filter | Smoothing, off-line processing |
| $\{0, 1, 2, \dots\}$ (non-negative lags) | Causal Wiener filter | Real-time filtering |
| $\{1, 2, 3, \dots\}$ (positive lags) | Wiener predictor | Forecasting the future |
The three problems share a single geometric structure: project $x[n]$ onto the closed linear span of $\{y[n-k] : k \in \mathcal{S}\}$ in the Hilbert space of zero-mean finite-variance random variables with inner product $\langle u, v \rangle = E[u\, v^*]$.
Theorem: Orthogonality Principle (Wiener-Hopf Equations)
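Let $\hat{x}[n] = \sum_{k \in \mathcal{S}} h[k]\, y[n-k]$ be a linear estimator of $x[n]$ from $\{y[n-k] : k \in \mathcal{S}\}$. Then $\hat{x}[n]$ is the MMSE estimator over the class of filters with support $\mathcal{S}$ if and only if the error $e[n] = x[n] - \hat{x}[n]$ is orthogonal to every observation used in the estimate: $$E\big[e[n]\, y^*[n-m]\big] = 0 \quad \text{for all } m \in \mathcal{S}.$$ Equivalently, the filter coefficients satisfy the Wiener-Hopf normal equations: $$\sum_{k \in \mathcal{S}} h[k]\, r_{yy}[m-k] = r_{xy}[m], \quad m \in \mathcal{S}.$$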
Think of $\hat{x}[n]$ as the orthogonal projection of $x[n]$ onto the subspace spanned by the observations $\{y[n-k] : k \in \mathcal{S}\}$. The projection is the point in the subspace closest to $x[n]$, and the residual (the error $e[n]$) is orthogonal to the subspace. This is the same picture as the least-squares normal equation in linear algebra, translated to the infinite-dimensional setting of random processes.
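The projection picture can be checked on simulated data. The sketch below is a minimal illustration under assumed conditions: a real-valued AR(1) desired signal `x` observed in white noise as `y`, and a 3-tap causal filter fitted by sample least squares, which is the empirical counterpart of the projection. The model, its parameters, and all variable names are hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 50_000

# Assumed model (illustrative only): AR(1) signal observed in white noise.
a, sigma_w = 0.8, 1.0
u = rng.standard_normal(n_samples)
x = np.zeros(n_samples)
for n in range(1, n_samples):
    x[n] = a * x[n - 1] + u[n]
y = x + sigma_w * rng.standard_normal(n_samples)

# Column k holds y[n-k] for k = 0, 1, 2, aligned with the target x[n].
taps = 3
Y = np.column_stack([y[taps - 1 - k : n_samples - k] for k in range(taps)])
target = x[taps - 1 :]

# Sample least squares: projection of the target onto the span of the columns.
h, *_ = np.linalg.lstsq(Y, target, rcond=None)
err = target - Y @ h

# Sample correlation of the error with each observation used by the filter.
corr_used = Y.T @ err / len(err)

# Sample correlation with y[n-3], which the filter does not use.
corr_unused = np.mean(err[1:] * Y[:-1, taps - 1])

print("error vs used observations :", corr_used)
print("error vs unused y[n-3]     :", corr_unused)
```

The correlations with the three used observations are zero to machine precision, because least squares enforces orthogonality in the sample inner product. Nothing forces the correlation with the unused observation to vanish.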
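Sufficiency (orthogonality $\Rightarrow$ optimality)
Suppose $h$ satisfies the orthogonality condition. Let $g$ be any other filter with support $\mathcal{S}$ and write the two estimators as $\hat{x}_h[n]$ and $\hat{x}_g[n]$. The error of $g$ is $e_g[n] = x[n] - \hat{x}_g[n] = e_h[n] + \Delta[n]$, where $\Delta[n] = \hat{x}_h[n] - \hat{x}_g[n]$ is a linear combination of observations in $\{y[n-k] : k \in \mathcal{S}\}$. By the orthogonality assumption, $E\big[e_h[n]\, \Delta^*[n]\big] = 0$. Hence $E\big[|e_g[n]|^2\big] = E\big[|e_h[n]|^2\big] + E\big[|\Delta[n]|^2\big] \ge E\big[|e_h[n]|^2\big]$, with equality iff $\Delta[n] = 0$ almost surely, i.e., $\hat{x}_g[n] = \hat{x}_h[n]$. So $h$ achieves the minimum.
Necessity (optimality $\Rightarrow$ orthogonality)
Conversely, if $h$ is MMSE-optimal, then for every $m \in \mathcal{S}$ and every $\epsilon \in \mathbb{C}$ the perturbed filter $h[k] + \epsilon\, \delta[k-m]$ produces the error $e[n] - \epsilon\, y[n-m]$, whose MSE is larger than or equal to that of $e[n]$. Writing the MSE as a function of $\epsilon$, $$E\big[|e[n] - \epsilon\, y[n-m]|^2\big] = E\big[|e[n]|^2\big] - 2\,\mathrm{Re}\big(\epsilon^*\, E[e[n]\, y^*[n-m]]\big) + |\epsilon|^2\, r_{yy}[0],$$ and setting the derivative at $\epsilon = 0$ to zero yields $E\big[e[n]\, y^*[n-m]\big] = 0$.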
Wiener-Hopf equations
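Substituting $e[n] = x[n] - \sum_{k \in \mathcal{S}} h[k]\, y[n-k]$ into $E\big[e[n]\, y^*[n-m]\big] = 0$ and using joint WSS gives $$r_{xy}[m] - \sum_{k \in \mathcal{S}} h[k]\, r_{yy}[m-k] = 0, \quad m \in \mathcal{S},$$ which is the stated normal equation.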
The Wiener-Hopf Equation as a Toeplitz System
When $\mathcal{S} = \{0, 1, \dots, N-1\}$ (finite-length FIR Wiener filter), the Wiener-Hopf equations become a finite linear system: $\mathbf{R}_{yy}\, \mathbf{h} = \mathbf{r}_{xy}$, where $\mathbf{R}_{yy}$ is an $N \times N$ Hermitian Toeplitz matrix with entries $[\mathbf{R}_{yy}]_{m,k} = r_{yy}[m-k]$. Solving this system in the finite case is a standard linear algebra exercise. The point is that as $N \to \infty$, Toeplitz matrices are asymptotically diagonalized by the Fourier basis (this is Szegő's theorem), and the infinite Wiener-Hopf equation admits a closed-form frequency-domain solution. The whole architecture of Section 9.2 rests on this passage from finite Toeplitz systems to their frequency-domain limit.
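The finite Toeplitz system can be sketched in a few lines of NumPy. The example assumes real-valued statistics, so the Hermitian Toeplitz matrix is symmetric, and uses a hypothetical model in which the desired signal has autocorrelation $0.8^{|k|}$ and is observed in independent white noise of variance $0.5$. The helper names `fir_wiener` and `stats` are invented for this illustration.

```python
import numpy as np

def fir_wiener(r_yy, r_xy):
    """Solve the N-tap normal equations R h = r for real-valued statistics.

    r_yy: observation autocorrelation at lags 0..N-1
    r_xy: cross-correlation between desired signal and observation, lags 0..N-1
    """
    r_yy = np.asarray(r_yy, dtype=float)
    r_xy = np.asarray(r_xy, dtype=float)
    n = len(r_xy)
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    R = r_yy[idx]                    # R[m, k] = r_yy[|m - k|], symmetric Toeplitz
    return np.linalg.solve(R, r_xy)

def stats(n, a=0.8, noise_var=0.5):
    """Assumed model: signal autocorrelation a^|k| plus independent white noise."""
    r_xx = a ** np.arange(n)
    r_yy = r_xx.copy()
    r_yy[0] += noise_var             # white noise only adds to lag 0
    return r_xx, r_yy                # here the cross-correlation equals r_xx

# MSE of the optimal filter as the number of taps grows.
for n in (1, 2, 4, 8, 16):
    r_xx, r_yy = stats(n)
    h = fir_wiener(r_yy, r_xx)
    print(n, r_xx[0] - h @ r_xx)
```

The printed error never increases with the number of taps, since each larger filter projects onto a larger subspace. The dense solve costs $O(N^3)$; a structure-exploiting routine such as SciPy's `scipy.linalg.solve_toeplitz` could replace it for long filters.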
Theorem: MMSE of the Wiener Estimator
If $h$ satisfies the Wiener-Hopf equations, then the MMSE is $$E\big[|e[n]|^2\big] = r_{xx}[0] - \sum_{k \in \mathcal{S}} h[k]\, r_{xy}^*[k].$$
The MMSE is the variance of $x[n]$ minus the variance of the estimate $\hat{x}[n]$. The cross term $\sum_{k \in \mathcal{S}} h[k]\, r_{xy}^*[k]$ is the inner product of the filter with the cross-correlation: a measure of how much information the observations carry about the signal.
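Expand the MSE
$E\big[|e[n]|^2\big] = E\big[e[n]\,(x[n] - \hat{x}[n])^*\big] = E\big[e[n]\, x^*[n]\big] - E\big[e[n]\, \hat{x}^*[n]\big]$.
The second term vanishes by orthogonality
$E\big[e[n]\, \hat{x}^*[n]\big] = \sum_{k \in \mathcal{S}} h^*[k]\, E\big[e[n]\, y^*[n-k]\big] = 0$, since each inner term is zero by the orthogonality principle.
Simplify the first term
$E\big[e[n]\, x^*[n]\big] = E\big[|x[n]|^2\big] - \sum_{k \in \mathcal{S}} h[k]\, E\big[y[n-k]\, x^*[n]\big] = r_{xx}[0] - \sum_{k \in \mathcal{S}} h[k]\, r_{xy}^*[k]$.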
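The identity can be checked numerically by writing the MSE of an arbitrary real-valued filter `g` as a quadratic form in the second-order statistics and evaluating it at the solution of the normal equations. The statistics below (signal autocorrelation $0.8^{|k|}$, independent white noise of variance $0.5$, four taps) are assumptions chosen only for illustration.

```python
import numpy as np

# Assumed real-valued statistics (illustrative): observation = signal + white noise.
N = 4
lags = np.arange(N)
r_xx = 0.8 ** lags                    # signal autocorrelation at lags 0..N-1
r_yy = r_xx.copy()
r_yy[0] += 0.5                        # noise variance adds at lag 0 only
r_xy = r_xx.copy()                    # noise independent of the signal
R_yy = r_yy[np.abs(np.subtract.outer(lags, lags))]

def mse(g):
    """Mean-square error of filter g, expanded in second-order statistics."""
    return r_xx[0] - 2.0 * g @ r_xy + g @ R_yy @ g

h = np.linalg.solve(R_yy, r_xy)       # solution of the normal equations
mmse_formula = r_xx[0] - h @ r_xy     # variance minus cross term

# Any perturbation of h should give a strictly larger MSE.
rng = np.random.default_rng(1)
worse = [mse(h + 0.1 * rng.standard_normal(N)) for _ in range(5)]

print("quadratic form at h :", mse(h))
print("closed-form MMSE    :", mmse_formula)
print("perturbed filters   :", worse)
```

The first two printed values agree to rounding error, and every perturbed filter yields a larger error, as the projection argument predicts.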
Example: A 2-Tap FIR Wiener Filter
Let be a zero-mean WSS process with , , and let where is white noise with variance independent of . Design the optimal 2-tap causal FIR Wiener filter .
Compute the required correlations
Since and are independent and is white: , . Also , so and .
Form the Wiener-Hopf system
The normal equations for read
Solve
The determinant is . Inverting gives , .
Compute the MMSE
By the theorem "MMSE of the Wiener Estimator": For comparison, using just (a 1-tap filter) gives and MSE . Adding the second tap reduces the MSE by about 5%.
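The same three steps (correlations, normal equations, MMSE) can be traced in code for a generic signal-plus-white-noise model with real-valued statistics. The numbers used below, `r0 = 1.0`, `r1 = 0.5`, and `noise_var = 0.5`, are hypothetical values chosen for illustration and are not claimed to match the example above, so the printed improvement differs from the figure quoted there. The function name `two_tap_wiener` is likewise invented for this sketch.

```python
def two_tap_wiener(r0, r1, noise_var):
    """2-tap Wiener filter when the observation is signal plus independent white noise.

    r0, r1: signal autocorrelation at lags 0 and 1 (real-valued)
    noise_var: variance of the white noise
    Returns (h0, h1, mmse_two_tap, mmse_one_tap).
    """
    # Observation autocorrelation: the noise only contributes at lag 0.
    ryy0, ryy1 = r0 + noise_var, r1
    # Cross-correlation equals the signal autocorrelation.
    rxy0, rxy1 = r0, r1

    # Solve [[ryy0, ryy1], [ryy1, ryy0]] [h0, h1]^T = [rxy0, rxy1]^T by Cramer's rule.
    det = ryy0 * ryy0 - ryy1 * ryy1
    h0 = (ryy0 * rxy0 - ryy1 * rxy1) / det
    h1 = (ryy0 * rxy1 - ryy1 * rxy0) / det

    mmse_two = r0 - h0 * rxy0 - h1 * rxy1
    mmse_one = r0 - (rxy0 / ryy0) * rxy0   # single tap: h0 = rxy0 / ryy0
    return h0, h1, mmse_two, mmse_one

# Hypothetical numbers, not taken from the example above.
h0, h1, mmse_two, mmse_one = two_tap_wiener(r0=1.0, r1=0.5, noise_var=0.5)
print(h0, h1)                      # 0.625 0.125
print(mmse_two, mmse_one)          # 0.3125 and 1/3
print(1.0 - mmse_two / mmse_one)   # relative reduction, about 0.0625
```

For these assumed values the second tap lowers the MSE from $1/3$ to $0.3125$, a relative reduction of about 6%.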
Common Mistake: Orthogonality Is to the Observations, Not to the Signal
Mistake:
Students sometimes write $E\big[e[n]\, x^*[n]\big] = 0$, confusing the orthogonality principle with a statement about the signal.
Correction:
The error $e[n]$ is orthogonal to every observation $y[n-m]$, $m \in \mathcal{S}$, used in the estimate, never to the target $x[n]$. In fact $E\big[e[n]\, x^*[n]\big] \neq 0$ in general; this quantity is precisely the MMSE (see the proof of the theorem "MMSE of the Wiener Estimator").
Wiener-Hopf Equation
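The linear system $\sum_{k \in \mathcal{S}} h[k]\, r_{yy}[m-k] = r_{xy}[m]$, $m \in \mathcal{S}$, whose solution is the MMSE Wiener filter. When $\mathcal{S} = \mathbb{Z}$ it becomes a convolution equation solvable in the frequency domain; when $\mathcal{S} = \{0, 1, 2, \dots\}$ it requires spectral factorization.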
Related: Orthogonality Principle, Innovations