The Jointly Gaussian Case
When the Gaussian Shortcut Works
The LMMSE estimator is a second-order object: it requires only means and covariances, and gives up everything the higher-order structure of the distribution could teach us. One would expect a big gap between LMMSE and the true MMSE. For the Gaussian distribution — and essentially only for it — this gap vanishes. The reason is that the conditional mean of a jointly Gaussian pair is already an affine function of the conditioning variable. No further nonlinearity can possibly help.
Theorem: Conditional Distribution of a Gaussian Pair
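Suppose $\begin{bmatrix} x \\ y \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}, \begin{bmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{bmatrix} \right)$ with $C_{yy} \succ 0$. Then the conditional distribution of $x$ given $y$ is Gaussian with $$\mathbb{E}[x \mid y] = \mu_x + C_{xy} C_{yy}^{-1} (y - \mu_y), \qquad \mathrm{Cov}(x \mid y) = C_{xx} - C_{xy} C_{yy}^{-1} C_{yx}.$$ Notably, the posterior covariance does not depend on the observed value $y$.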
The Gaussian family is closed under affine transformations and under conditioning. The exponent of the joint density is a quadratic form in $(x, y)$; completing the square in $x$ at fixed $y$ gives another Gaussian quadratic form, hence a Gaussian conditional density with affine mean.
Construct an uncorrelated pair
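Let $A = C_{xy} C_{yy}^{-1}$ and define $z = x - A y$. Then $\mathrm{Cov}(z, y) = C_{xy} - A C_{yy} = 0$.
Invoke Gaussian independence
The pair $(z, y)$ is jointly Gaussian (a linear transformation of $(x, y)$) and uncorrelated, hence independent. This is a Gaussian-specific fact: uncorrelated non-Gaussian variables need not be independent.
Read off the conditional
Given $y$, the vector $z$ is still $\mathcal{N}(\mu_x - A \mu_y, C_{zz})$ by independence. Since $x = z + A y$, the conditional distribution of $x$ is obtained by shifting $z$ by the affine term $A y$, yielding the stated mean $\mu_x + A (y - \mu_y)$. Computing $C_{zz} = C_{xx} - C_{xy} C_{yy}^{-1} C_{yx}$ gives the conditional covariance.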
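The three proof steps can be checked numerically. The following is a minimal NumPy sketch, not the text's own code: the symbols `A`, `C_xx`, `C_xy`, `C_yy`, the helper `gaussian_conditional`, and the randomly generated covariance are all assumptions chosen for illustration. It forms $A = C_{xy} C_{yy}^{-1}$, confirms that $z = x - A y$ is uncorrelated with $y$, and evaluates the conditional mean and covariance.

```python
import numpy as np

def gaussian_conditional(mu_x, mu_y, C_xx, C_xy, C_yy, y):
    """Return (A, mean, cov) of x given y for a jointly Gaussian pair.

    Hypothetical helper for illustration; assumes C_yy is symmetric
    positive definite.
    """
    # A = C_xy C_yy^{-1}, computed with a linear solve instead of an inverse.
    A = np.linalg.solve(C_yy, C_xy.T).T
    cond_mean = mu_x + A @ (y - mu_y)
    cond_cov = C_xx - A @ C_xy.T  # C_xx - C_xy C_yy^{-1} C_yx
    return A, cond_mean, cond_cov

# Illustrative joint covariance for x in R^2 and y in R^3 (made-up numbers).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
C = M @ M.T + 0.1 * np.eye(5)
C_xx, C_xy, C_yy = C[:2, :2], C[:2, 2:], C[2:, 2:]
mu_x, mu_y = np.array([1.0, -1.0]), np.zeros(3)
y_obs = np.array([0.5, -0.2, 1.0])

A, cond_mean, cond_cov = gaussian_conditional(mu_x, mu_y, C_xx, C_xy, C_yy, y_obs)

# Step 1 of the proof: Cov(z, y) = C_xy - A C_yy should vanish.
cov_zy = C_xy - A @ C_yy
print("max |Cov(z, y)| =", np.abs(cov_zy).max())
print("conditional mean =", cond_mean)
print("conditional covariance =\n", cond_cov)
```

Note that `cond_cov` is computed without reference to `y_obs`, which mirrors the observation that the posterior covariance does not depend on the observed value.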
Theorem: MMSE = LMMSE for Jointly Gaussian Pairs
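Let $(x, y)$ be jointly Gaussian with $C_{yy} \succ 0$. Then $$\hat{x}_{\mathrm{MMSE}}(y) = \hat{x}_{\mathrm{LMMSE}}(y) = \hat{x}_{\mathrm{MAP}}(y).$$ All three coincide with $\mu_x + C_{xy} C_{yy}^{-1} (y - \mu_y)$.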
The conditional mean is affine, so the MMSE estimator already lies in the class of affine estimators and must equal the best affine one (MMSE = LMMSE). The conditional density is Gaussian and therefore unimodal with mode = mean (so MAP = MMSE). An affine conditional mean is a hallmark of the jointly Gaussian case.
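MMSE is the conditional mean
For any pair with finite second moments, the MMSE estimator is the conditional mean: $\hat{x}_{\mathrm{MMSE}}(y) = \mathbb{E}[x \mid y]$.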
Use the Gaussian conditional formula
By the theorem Conditional Distribution of a Gaussian Pair, $\mathbb{E}[x \mid y] = \mu_x + C_{xy} C_{yy}^{-1} (y - \mu_y)$, which is the LMMSE formula.
Identify MAP
Since the posterior is Gaussian — symmetric and unimodal — its mode equals its mean, so the MAP estimator equals the conditional mean as well.
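For contrast, it can help to see the equality fail once joint Gaussianity is dropped. The sketch below uses an illustrative model that is not taken from the text: an equiprobable $\pm 1$ signal observed in additive Gaussian noise of variance `sigma2`. For that model the conditional mean is $\tanh(y / \sigma^2)$, which is nonlinear, while the LMMSE estimator is $y / (1 + \sigma^2)$. The sample size and noise variance are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 200_000, 0.5  # illustrative sample size and noise variance

# Non-Gaussian signal: x = +1 or -1 with equal probability.
x = rng.choice([-1.0, 1.0], size=n)
y = x + np.sqrt(sigma2) * rng.standard_normal(n)

# LMMSE: (C_xy / C_yy) * y with C_xy = 1 and C_yy = 1 + sigma2.
x_lmmse = y / (1.0 + sigma2)

# MMSE: the conditional mean E[x | y] = tanh(y / sigma2) for this model.
x_mmse = np.tanh(y / sigma2)

mse_lmmse = np.mean((x - x_lmmse) ** 2)
mse_mmse = np.mean((x - x_mmse) ** 2)

print("empirical LMMSE MSE:", mse_lmmse)
print("theoretical LMMSE MSE:", sigma2 / (1.0 + sigma2))
print("empirical MMSE MSE:", mse_mmse)
```

In this setup the nonlinear estimator attains a strictly smaller mean-squared error than the linear one, which is the behavior the theorem rules out for jointly Gaussian pairs.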
Key Takeaway
"Linear estimators are optimal" is a Gaussian statement, not a universal one. Whenever you see a linear receiver claimed as "optimal", look for an explicit or implicit Gaussian assumption. Outside the Gaussian world, the MMSE estimator is in general nonlinear, and the linear version is then strictly suboptimal.
Example: Complex Gaussian Signal in Gaussian Noise
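Let $x \sim \mathcal{CN}(0, C_x)$ and $y = H x + n$ with $n \sim \mathcal{CN}(0, C_n)$ independent of $x$. Compute $\hat{x}_{\mathrm{MMSE}}(y)$ and the error covariance.
Jointly Gaussian
$(x, y)$ is jointly proper complex Gaussian (a linear transformation of the Gaussian pair $(x, n)$). By the theorem MMSE = LMMSE for Jointly Gaussian Pairs (Hermitian version), the MMSE estimator coincides with the LMMSE estimator.
Apply the LMMSE formula
$C_{xy} = C_x H^H$ and $C_{yy} = H C_x H^H + C_n$. Therefore $$\hat{x}_{\mathrm{MMSE}}(y) = C_x H^H \left( H C_x H^H + C_n \right)^{-1} y.$$
Error covariance
$C_e = C_x - C_x H^H \left( H C_x H^H + C_n \right)^{-1} H C_x$. Equivalently, by Woodbury, $C_e = \left( C_x^{-1} + H^H C_n^{-1} H \right)^{-1}$.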
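The two forms of the error covariance can be compared numerically. This is a sketch under stated assumptions: it takes the example to be the linear model $y = H x + n$ with zero-mean proper complex Gaussian $x$ and $n$, and the names `H`, `C_x`, `C_n`, the helper `random_cov`, and the dimensions are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_x, n_y = 3, 5  # illustrative dimensions

def random_cov(dim):
    """Hypothetical helper: a random Hermitian positive definite matrix."""
    B = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return B @ B.conj().T + np.eye(dim)

C_x = random_cov(n_x)  # signal covariance (assumed)
C_n = random_cov(n_y)  # noise covariance (assumed)
H = rng.standard_normal((n_y, n_x)) + 1j * rng.standard_normal((n_y, n_x))

# Second-order statistics of the pair (x, y).
C_xy = C_x @ H.conj().T
C_yy = H @ C_x @ H.conj().T + C_n

# LMMSE matrix W = C_xy C_yy^{-1}, using that C_yy is Hermitian.
W = np.linalg.solve(C_yy, C_xy.conj().T).conj().T

# Error covariance in covariance form and in Woodbury (information) form.
C_e = C_x - W @ H @ C_x
C_e_woodbury = np.linalg.inv(np.linalg.inv(C_x) + H.conj().T @ np.linalg.inv(C_n) @ H)

# The same estimator written in information form.
W_info = C_e_woodbury @ H.conj().T @ np.linalg.inv(C_n)

print("forms of C_e agree:", np.allclose(C_e, C_e_woodbury))
print("forms of W agree:", np.allclose(W, W_info))
```

The covariance form inverts a matrix of the size of $y$, while the Woodbury form inverts matrices of the size of $x$ (plus $C_n$, which is often diagonal), so the cheaper choice depends on the dimensions.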
Joint and Conditional Gaussian Densities
The joint density of $(x, y)$ as a contour plot, with the conditional density overlaid on the vertical slice at the conditioning value. Vary the correlation coefficient $\rho$ between $x$ and $y$ to see how the conditional distribution narrows as $|\rho|$ grows.
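The narrowing can be read off from the scalar special case of the conditional formulas. The symbols $\sigma_x$, $\sigma_y$, and $\rho$ below are notation assumed here for a real scalar pair with $C_{xy} = \rho \sigma_x \sigma_y$; the same expressions hold with the roles of $x$ and $y$ swapped.

```latex
\mathbb{E}[x \mid y] = \mu_x + \rho \, \frac{\sigma_x}{\sigma_y} \, (y - \mu_y),
\qquad
\mathrm{Var}(x \mid y) = \sigma_x^2 \, (1 - \rho^2).
```

The conditional variance shrinks from $\sigma_x^2$ at $\rho = 0$ to zero as $|\rho| \to 1$, and it is the same for every conditioning value.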
Why This Matters: LMMSE Receivers in Wireless Systems
LMMSE receivers are a backbone of modern wireless systems. In MIMO detection, the LMMSE detector trades a small amount of bias for a large reduction in noise enhancement compared with zero-forcing. In OFDM channel estimation, MMSE interpolation across pilot subcarriers using the channel's frequency correlation yields substantial gains over simple least squares. In the massive MIMO uplink, the LMMSE combiner is the linear receiver that maximizes each user's SINR, and hence the achievable rate when the user symbols are Gaussian. In every case the theoretical justification comes from the theorem MMSE = LMMSE for Jointly Gaussian Pairs: under the Gaussian signaling assumptions, the LMMSE estimator is the true MMSE estimator.
See full treatment in Chapter 11
Historical Note: Wiener, Kolmogorov, and the Birth of LMMSE
1941–1950. Norbert Wiener's classified MIT report Extrapolation, Interpolation, and Smoothing of Stationary Time Series (1942, declassified 1949) derived what we now call the LMMSE estimator in the infinite-dimensional setting of wide-sense stationary processes, motivated by the problem of predicting the trajectory of enemy aircraft for anti-aircraft fire control. Andrey Kolmogorov had obtained essentially the same result independently in 1941. The finite-dimensional matrix version presented here is the natural sampled-time specialization of the Wiener–Kolmogorov filter, and it underpins the Kalman filter (Chapter 10) as a recursive computation along the time axis.
Common Mistake: Marginal Gaussian ≠ Jointly Gaussian
Mistake:
"Both $x$ and $y$ are Gaussian, so the pair is jointly Gaussian and I can use $\hat{x}_{\mathrm{MMSE}} = \hat{x}_{\mathrm{LMMSE}}$."
Correction:
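It is perfectly possible to have $x \sim \mathcal{N}(0, 1)$ and $y \sim \mathcal{N}(0, 1)$ without the pair being jointly Gaussian. A classical counterexample: let $x \sim \mathcal{N}(0, 1)$ and $y = s x$ with $s \in \{+1, -1\}$ equiprobable and independent of $x$. Both marginals are standard Gaussian, but $y$ is not jointly Gaussian with $x$: the sum $x + y$ equals zero with probability $1/2$, which no Gaussian variable of nonzero variance can do. Here the MMSE estimator is $\mathbb{E}[x \mid y] = 0$ by symmetry, and the LMMSE estimator is also zero because $\mathrm{Cov}(x, y) = 0$, so the two happen to agree in this example. Without joint Gaussianity, however, such agreement is not guaranteed. Always check joint Gaussianity before identifying MMSE with LMMSE.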
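A short simulation makes the counterexample concrete. This sketch assumes the standard sign-flip construction, $x \sim \mathcal{N}(0, 1)$ and $y = s x$ with an independent equiprobable sign $s$; the variable names and sample size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000  # illustrative sample size

x = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)  # random sign, independent of x
y = s * x

# y has a standard Gaussian marginal and is uncorrelated with x.
var_y = y.var()
corr_xy = np.corrcoef(x, y)[0, 1]

# If (x, y) were jointly Gaussian, x + y would be Gaussian. Instead it has
# a point mass at zero: x + y == 0 exactly whenever s == -1.
frac_zero_sum = np.mean(x + y == 0.0)

# x and y are strongly dependent even though uncorrelated: y**2 equals x**2.
corr_sq = np.corrcoef(x ** 2, y ** 2)[0, 1]

print("Var(y) ~", var_y)
print("corr(x, y) ~", corr_xy)
print("P(x + y = 0) ~", frac_zero_sum)
print("corr(x^2, y^2) ~", corr_sq)
```

Zero correlation together with strong dependence is exactly what cannot happen for a jointly Gaussian pair, where uncorrelated implies independent.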
Quick Check
Under which of the following conditions is the LMMSE estimator necessarily equal to the MMSE estimator?
$x$ and $y$ are each Gaussian
$(x, y)$ is jointly Gaussian
$x$ is Gaussian and the channel is linear
$y$ is a linear function of $x$ plus noise of any distribution
Joint Gaussianity guarantees that the conditional mean is affine, which makes LMMSE = MMSE. None of the other conditions is sufficient on its own.