Spectral Factorization and Innovations

Why Spectral Factorization?

The non-causal Wiener filter was easy because convolution over all of $\mathbb{Z}$ turned into multiplication in the frequency domain. The causal Wiener filter is harder because the Wiener-Hopf equation now holds only for $\ell \geq 0$, and a half-axis convolution equation has no one-line Fourier solution.

The trick, due to Wiener and Hopf themselves and refined by Kolmogorov, is to whiten the observation first. If we can write $Y_n$ as a causal, causally invertible LTI image of a white process $J_n$, then causal estimation from $Y$ becomes causal estimation from $J$, and because $J$ is uncorrelated across time, causal estimation from $J$ is trivial: just project on each $J_k$ independently. Spectral factorization is the tool that performs this whitening.

Definition:

Paley-Wiener Condition

A WSS process $\{Y_n\}$ with PSD $P_y(f)$ satisfies the Paley-Wiener condition if
$$\int_{-1/2}^{1/2} \log P_y(f)\, df > -\infty.$$
This is the integrability condition that $\log P_y$ must satisfy for the causal factor to exist. It fails, for example, if $P_y(f)$ vanishes on an interval of positive length.

Paley-Wiener is the condition that separates processes that can be generated as causal LTI outputs of a white noise driver (and hence have a strictly positive one-step prediction error) from those that cannot, which are perfectly predictable from their own past. Band-limited processes, for instance, do not satisfy the Paley-Wiener condition.
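To make the failure mode concrete, here is a minimal numerical sketch. It assumes a hypothetical piecewise-constant PSD (equal to 1 for $|f| < 1/4$ and to a small `floor` elsewhere) and a midpoint rule for the integral; the function names are illustrative and not from the text. As the floor shrinks toward a band-limited PSD, the log-integral heads to $-\infty$.

```python
import math


def log_integral(psd, n_grid=4096):
    """Midpoint-rule estimate of the integral of log P_y(f) over [-1/2, 1/2]."""
    step = 1.0 / n_grid
    total = sum(math.log(psd(-0.5 + (i + 0.5) * step)) for i in range(n_grid))
    return total * step


def band_psd(floor):
    """Hypothetical PSD: 1 inside the band |f| < 1/4, `floor` outside it."""
    return lambda f: 1.0 if abs(f) < 0.25 else floor


# The smaller the out-of-band floor, the more negative the log-integral.
for floor in (1e-2, 1e-8, 1e-32):
    print(f"floor = {floor:g}, log-integral = {log_integral(band_psd(floor)):.3f}")
```

For this toy PSD the integral is exactly half the logarithm of the floor, so it diverges logarithmically as the floor goes to zero.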

Theorem: Spectral Factorization

Let $\{Y_n\}$ be WSS with PSD $P_y(f)$ satisfying the Paley-Wiener condition. Then there exist functions $P_y^+(f)$ and $P_y^-(f)$ such that
$$P_y(f) = P_y^+(f)\, P_y^-(f), \qquad P_y^-(f) = \big(P_y^+(f)\big)^*,$$
where (i) $P_y^+(f)$ is causal: its inverse DTFT $p^+[n]$ is supported on $n \geq 0$; (ii) $P_y^-(f)$ is anti-causal: its inverse DTFT is supported on $n \leq 0$; (iii) $1/P_y^+(f)$ is also causal, and $1/P_y^-(f)$ is also anti-causal. We call $P_y^+(f)$ the minimum-phase (or causal, or spectral) factor.

Think of $P_y^+(f)$ as the frequency response of a stable, causal, minimum-phase filter whose squared magnitude is $P_y(f)$. Because both the filter and its inverse are causal and stable, passing the observation through $1/P_y^+(f)$ whitens it without losing any causal information: the filtering is invertible in real time.
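One way to compute such a factor numerically is the cepstral method (often called the Kolmogorov method), which the text itself does not describe: take the log of the PSD, keep only the non-negative-lag part of its Fourier coefficients, and exponentiate. The sketch below is an illustration under stated assumptions: a strictly positive PSD sampled on an even-length uniform grid $f_k = k/N$, a plain $O(N^2)$ DFT, and a hypothetical MA(1) test spectrum $|1 - 0.5\,e^{-j2\pi f}|^2$ whose minimum-phase factor is known.

```python
import cmath
import math


def cepstral_factor(psd_samples):
    """Samples of a minimum-phase factor on the grid f_k = k/N (N even)."""
    n = len(psd_samples)
    log_p = [math.log(p) for p in psd_samples]
    # Cepstrum: inverse DFT of the log-PSD (real, since the PSD is real and even).
    ceps = [
        sum(log_p[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)).real / n
        for m in range(n // 2 + 1)
    ]
    factor = []
    for k in range(n):
        # Keep half of the lag-0 and Nyquist terms and every lag in between.
        acc = 0.5 * ceps[0] + 0.5 * ceps[n // 2] * (-1) ** k
        for m in range(1, n // 2):
            acc += ceps[m] * cmath.exp(-2j * math.pi * k * m / n)
        factor.append(cmath.exp(acc))
    return factor


n_grid = 128
psd = [abs(1.0 - 0.5 * cmath.exp(-2j * math.pi * k / n_grid)) ** 2 for k in range(n_grid)]
factor = cepstral_factor(psd)
known = [1.0 - 0.5 * cmath.exp(-2j * math.pi * k / n_grid) for k in range(n_grid)]
print(max(abs(g - h) for g, h in zip(factor, known)))
```

On the grid the squared magnitude of the result reproduces the PSD by construction; agreement with the known factor depends on the cepstrum decaying well within $N/2$ lags.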


Example: Spectral Factorization of the AR(1)+Noise PSD

For the observation $Y_n = X_n + Z_n$ with $X$ an AR(1) process of coefficient $a$ and innovation variance $\sigma_u^2$, and $Z$ white of variance $\sigma_z^2$, find the spectral factor $P_y^+(f)$ explicitly.
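A sketch of one route to the answer, under assumptions not spelled out above: $X_n = a X_{n-1} + U_n$ with $0 < |a| < 1$, and $Z$ uncorrelated with $X$. Then $P_y(z) = \sigma_u^2 / \big((1 - a z^{-1})(1 - a z)\big) + \sigma_z^2$, and matching the numerator to $c^2 (1 - b z^{-1})(1 - b z)$ gives $c^2 b = \sigma_z^2 a$ and $b + 1/b = 2\kappa$ with $\kappa = \big(\sigma_u^2 + \sigma_z^2 (1 + a^2)\big) / (2 a \sigma_z^2)$. Choosing the root with $|b| < 1$ yields $P_y^+(z) = c\,(1 - b z^{-1}) / (1 - a z^{-1})$. The Python check below uses illustrative parameter values and hypothetical function names.

```python
import cmath
import math


def ar1_noise_factor(a, var_u, var_z):
    """Return (b, c) such that P_y^+(z) = c (1 - b z^-1) / (1 - a z^-1)."""
    # Coefficient matching: c^2 b = var_z a and b + 1/b = 2 kappa.
    kappa = (var_u + var_z * (1.0 + a * a)) / (2.0 * a * var_z)
    # Pick the root of b^2 - 2 kappa b + 1 = 0 that lies inside the unit circle.
    b = kappa - math.copysign(math.sqrt(kappa * kappa - 1.0), kappa)
    c = math.sqrt(var_z * a / b)
    return b, c


def psd_y(f, a, var_u, var_z):
    """PSD of AR(1) signal plus white noise at normalized frequency f."""
    e = cmath.exp(-2j * math.pi * f)
    return var_u / abs(1.0 - a * e) ** 2 + var_z


def factor_response(f, a, b, c):
    """Frequency response of the candidate minimum-phase factor."""
    e = cmath.exp(-2j * math.pi * f)
    return c * (1.0 - b * e) / (1.0 - a * e)


a, var_u, var_z = 0.8, 1.0, 1.0  # illustrative values, not taken from the text
b, c = ar1_noise_factor(a, var_u, var_z)
grid = [k / 200.0 - 0.5 for k in range(201)]
max_err = max(
    abs(abs(factor_response(f, a, b, c)) ** 2 - psd_y(f, a, var_u, var_z)) for f in grid
)
print(f"zero b = {b:.4f}, gain c = {c:.4f}, max error = {max_err:.2e}")
```

The pole $a$ and zero $b$ both lie inside the unit circle, so the factor and its inverse are causal and stable.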

Definition:

Innovations Process

Let $\{Y_n\}$ be WSS with a PSD satisfying the Paley-Wiener condition and let $P_y^+(f)$ be its minimum-phase spectral factor. The innovations process $\{J_n\}$ is the output of passing $Y_n$ through the whitening filter $1/P_y^+(f)$:
$$J_n = \sum_{k \geq 0} w[k]\, Y_{n-k}, \qquad \text{where } \sum_{k \geq 0} w[k]\, e^{-j 2\pi f k} = \frac{1}{P_y^+(f)}.$$
The innovations satisfy (i) $\mathbb{E}[J_n] = 0$, (ii) $\mathbb{E}[J_n J_m^*] = \delta[n-m]$ (white, unit variance), and (iii) $\text{span}\{J_k : k \leq n\} = \text{span}\{Y_k : k \leq n\}$ (causal information equivalence).

Because $1/P_y^+$ is causal and causally invertible, $J_n$ can be computed from $Y_n$ in real time, and vice versa. The innovations are the "fresh news" contained in each new observation: the part that could not be predicted from the past. This is the innovations representation behind Wold's decomposition (1938), which was itself formulated for discrete-time stationary processes.
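The whitening step can be checked by simulation. The sketch below assumes the AR(1)+noise model $Y_n = X_n + Z_n$ with $X_n = a X_{n-1} + U_n$, Gaussian drivers, and a factor of the form $c\,(1 - b z^{-1})/(1 - a z^{-1})$ whose zero $b$ comes from coefficient matching; all names and parameter values are illustrative. The whitening filter $1/P_y^+$ then runs as the recursion $J_n = b J_{n-1} + (Y_n - a Y_{n-1})/c$.

```python
import math
import random


def spectral_zero_and_gain(a, var_u, var_z):
    """Zero b (inside the unit circle) and gain c of the assumed factor."""
    kappa = (var_u + var_z * (1.0 + a * a)) / (2.0 * a * var_z)
    b = kappa - math.copysign(math.sqrt(kappa * kappa - 1.0), kappa)
    return b, math.sqrt(var_z * a / b)


def simulate_observation(n, a, var_u, var_z, rng):
    """Draw Y_n = X_n + Z_n with X an AR(1) process started at zero."""
    x, ys = 0.0, []
    std_u, std_z = math.sqrt(var_u), math.sqrt(var_z)
    for _ in range(n):
        x = a * x + rng.gauss(0.0, std_u)
        ys.append(x + rng.gauss(0.0, std_z))
    return ys


def whiten(ys, a, b, c):
    """Causal recursion for 1/P_y^+(z) = (1 - a z^-1) / (c (1 - b z^-1))."""
    js, j_prev, y_prev = [], 0.0, 0.0
    for y in ys:
        j = b * j_prev + (y - a * y_prev) / c
        js.append(j)
        j_prev, y_prev = j, y
    return js


def sample_autocorr(xs, lag):
    """Sample autocorrelation at a non-negative lag (zero mean assumed)."""
    n = len(xs) - lag
    return sum(xs[i] * xs[i + lag] for i in range(n)) / n


rng = random.Random(0)
a, var_u, var_z = 0.8, 1.0, 1.0
b, c = spectral_zero_and_gain(a, var_u, var_z)
ys = simulate_observation(100_000, a, var_u, var_z, rng)
js = whiten(ys, a, b, c)[1000:]  # drop the start-up transient
acf = [sample_autocorr(js, lag) for lag in range(4)]
print([round(r, 3) for r in acf])
```

The lag-0 value should sit near 1 and the other lags near 0, up to sampling noise of order $1/\sqrt{N}$.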

Pole-Zero Map of the Spectral Factors

For AR(1)+noise, plot the poles and zeros of $P_y^+(z)$ (inside the unit circle: minimum phase) and $P_y^-(z)$ (outside: reciprocal partners). The reciprocal symmetry is the signature of spectral factorization. Vary $a$ and SNR to see the zero $b$ move.

Parameters: $a = 0.8$, SNR $= 10$

Spectral Factorization and the Causal Cone

Geometric unfolding of $P_y(f) = |P_y^+(f)|^2$ via pole-zero placement. The minimum-phase factor $P_y^+(f)$ collects all poles and zeros inside the unit circle; its inverse is the whitening filter that produces the innovations $J_n$. This whitening is the key step that converts a colored-noise estimation problem into a white-noise one.
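The pole-zero locations described above can be tabulated directly. This sketch assumes the AR(1)+noise factor $P_y^+(z) = c\,(1 - b z^{-1})/(1 - a z^{-1})$ and an SNR convention that the text does not state: $\mathrm{SNR} = \mathrm{Var}(X)/\sigma_z^2$ in dB, with $\mathrm{Var}(X) = \sigma_u^2/(1 - a^2)$. The function name and dictionary layout are illustrative.

```python
import math


def pole_zero_map(a, snr_db, var_z=1.0):
    """Poles and zeros of P_y^+(z) and P_y^-(z) for AR(1) plus white noise."""
    # Assumed SNR convention: Var(X) / Var(Z) in dB, Var(X) = var_u / (1 - a^2).
    var_u = 10.0 ** (snr_db / 10.0) * var_z * (1.0 - a * a)
    kappa = (var_u + var_z * (1.0 + a * a)) / (2.0 * a * var_z)
    b = kappa - math.copysign(math.sqrt(kappa * kappa - 1.0), kappa)
    return {
        "plus": {"pole": a, "zero": b},  # inside the unit circle
        "minus": {"pole": 1.0 / a, "zero": 1.0 / b},  # reciprocal partners
    }


for snr_db in (-10.0, 0.0, 10.0, 20.0):
    pz = pole_zero_map(0.8, snr_db)
    print(f"SNR {snr_db:5.1f} dB: zero b = {pz['plus']['zero']:.4f}, "
          f"partner 1/b = {pz['minus']['zero']:.4f}")
```

Under this convention the zero $b$ slides from near the pole $a$ at low SNR (where $Y$ is almost white) toward the origin at high SNR.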

Historical Note: Wiener at MIT During the War

1940-1945

Norbert Wiener (1894-1964) derived what we now call the Wiener filter in 1942 as part of a classified report to the National Defense Research Committee, titled "Extrapolation, Interpolation, and Smoothing of Stationary Time Series." The motivating problem was anti-aircraft fire control: given a noisy radar track, predict where the aircraft would be when the shell arrived. Wiener's report circulated only among insiders (engineers called it "the yellow peril" because of its difficulty and its yellow binding) and was not openly published until 1949.

The filter was ahead of its time in several ways. It was one of the first systematic uses of second-order statistics in signal processing, and it introduced the spectral factorization machinery that would later become a central tool in control theory, filter design, and spectral estimation. Wiener, already famous for his work on Brownian motion and harmonic analysis, characteristically wrote the report in a dense style that required engineer Julian Bigelow to produce an accessible companion document for practitioners.

Historical Note: Kolmogorov's Independent Discovery

1941

Independently of Wiener and a year earlier, Andrey Kolmogorov (1903-1987) published "Stationary Sequences in Hilbert Space" in the Bulletin of Moscow State University in 1941. Kolmogorov worked in the discrete-time setting (which is where we are in this chapter) and approached the problem from pure Hilbert-space geometry: the optimal linear predictor of $Y_n$ from its past is the orthogonal projection onto the closed subspace generated by $\{Y_m : m < n\}$. He derived what we now call the Kolmogorov-Szegő formula for the one-step prediction variance, $\sigma_p^2 = \exp\left(\int_{-1/2}^{1/2} \log P_y(f)\, df\right)$, a result of remarkable elegance.

Kolmogorov's work reached the West only after the war. For a period in the 1950s there was a gentle priority dispute, but the dust settled on calling the continuous-time smoothing filter "Wiener" and the discrete-time predictor "Kolmogorov" or "Wiener-Kolmogorov." The fact that the same structure was discovered twice, on opposite sides of a world war, from a radar problem and from pure probability, is characteristic of deep mathematical ideas.
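The Kolmogorov-Szegő formula quoted in this note can be checked numerically against a spectral factor. The sketch below assumes the AR(1)+noise model with factor $c\,(1 - b z^{-1})/(1 - a z^{-1})$, for which the one-step prediction variance should equal $c^2 = \sigma_z^2 a / b$; the parameter values and function names are illustrative, and the integral uses a plain midpoint rule.

```python
import cmath
import math


def one_step_prediction_variance(psd, n_grid=4096):
    """exp of the midpoint-rule integral of log P_y(f) over [-1/2, 1/2]."""
    step = 1.0 / n_grid
    total = sum(math.log(psd(-0.5 + (i + 0.5) * step)) for i in range(n_grid))
    return math.exp(total * step)


a, var_u, var_z = 0.8, 1.0, 1.0  # illustrative values


def psd_y(f):
    """PSD of an AR(1) signal plus white noise."""
    return var_u / abs(1.0 - a * cmath.exp(-2j * math.pi * f)) ** 2 + var_z


# Closed-form gain of the assumed minimum-phase factor.
kappa = (var_u + var_z * (1.0 + a * a)) / (2.0 * a * var_z)
b = kappa - math.copysign(math.sqrt(kappa * kappa - 1.0), kappa)
c_squared = var_z * a / b

numeric = one_step_prediction_variance(psd_y)
print(numeric, c_squared)
```

The two numbers agree because the log-integral of $|1 - b\,e^{-j2\pi f}|^2$ vanishes whenever $|b| < 1$, and likewise for the pole term, leaving only $\log c^2$.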

Common Mistake: Spectral Factorization Is Not Unique Without Causality

Mistake:

Writing $P_y(f) = |G(f)|^2$ for any $G$ and calling $G$ the spectral factor.

Correction:

There are infinitely many ways to factor $P_y(f)$ as a squared magnitude: multiply any such $G(f)$ by an all-pass filter $A(f)$ with $|A(f)| = 1$ and you get another factor. The canonical choice is the minimum-phase factor $P_y^+(f)$, the one that is causal and stable and whose inverse is also causal and stable. Causality of the factor alone is not enough, since a causal all-pass $A(f)$ keeps $G(f)A(f)$ causal; it is the causal inverse that pins down the factor, up to a constant of unit modulus.
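A small numerical illustration of this point, using hypothetical values for the pole $a$, zero $b$, and gain $c$: multiplying the minimum-phase factor by the all-pass $(z^{-1} - b)/(1 - b z^{-1})$ reflects the zero to $1/b$ without changing the magnitude, but the causal inverse of the new factor blows up.

```python
import cmath
import math

a, b, c = 0.8, 0.3, 1.5  # hypothetical pole, zero, and gain with |a|, |b| < 1


def g_min(f):
    """Minimum-phase factor c (1 - b z^-1) / (1 - a z^-1) on the unit circle."""
    e = cmath.exp(-2j * math.pi * f)
    return c * (1.0 - b * e) / (1.0 - a * e)


def g_alt(f):
    """Same magnitude, zero reflected to 1/b: g_min times an all-pass."""
    e = cmath.exp(-2j * math.pi * f)
    return c * (e - b) / (1.0 - a * e)


def causal_inverse_impulse(num0, num1, n_taps):
    """Impulse response of (1 - a z^-1) / (num0 + num1 z^-1), run causally."""
    h, prev = [], 0.0
    for n in range(n_taps):
        x = 1.0 if n == 0 else (-a if n == 1 else 0.0)
        cur = (x - num1 * prev) / num0
        h.append(cur)
        prev = cur
    return h


grid = [k / 100.0 - 0.5 for k in range(101)]
mag_gap = max(abs(abs(g_min(f)) - abs(g_alt(f))) for f in grid)
h_min = causal_inverse_impulse(c, -c * b, 40)  # inverse pole at b: decays
h_alt = causal_inverse_impulse(-c * b, c, 40)  # inverse pole at 1/b: grows
print(mag_gap, abs(h_min[-1]), abs(h_alt[-1]))
```

Both candidates reproduce the same PSD, yet only the minimum-phase one can be inverted by a stable causal recursion, which is what whitening in real time requires.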

Innovations

The white process $J_n$ obtained by passing an observation $Y_n$ through the causal whitening filter $1/P_y^+(f)$. The innovation $J_n$ represents the new information in $Y_n$ beyond what could be predicted from $Y_{n-1}, Y_{n-2}, \ldots$.

Related: Minimum-Phase Spectral Factor, Whitening Filter

Minimum-Phase Spectral Factor

The function $P_y^+(f)$ in the factorization $P_y(f) = P_y^+(f)\, P_y^-(f)$ whose inverse DTFT is supported on $n \geq 0$ (causal) and whose reciprocal $1/P_y^+(f)$ is also causal. For rational PSDs it corresponds to placing all poles and zeros inside the unit circle.

Related: Innovations, Paley-Wiener Condition

Paley-Wiener Condition

The requirement $\int_{-1/2}^{1/2} \log P_y(f)\, df > -\infty$, ensuring in particular that the PSD does not vanish on a set of positive measure. It is the necessary and sufficient condition for spectral factorization to exist and for the process to admit a causal white-noise representation.

Related: Minimum-Phase Spectral Factor

Whitening Filter

A causal filter whose output has unit-variance white autocorrelation when driven by a given colored input. For WSS $Y_n$ the whitening filter has frequency response $1/P_y^+(f)$.

Related: Innovations