Chapter Summary

Key Points

1. The Bayesian framework treats $\theta$ as a random variable with a prior $f_\theta(\theta)$. Bayes' rule produces the posterior $f_{\theta|Y}(\theta|y) \propto f_{Y|\theta}(y|\theta)\,f_\theta(\theta)$, which summarizes all information about $\theta$ after observing $Y$.
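As a concrete check, Bayes' rule can be evaluated numerically in a minimal scalar sketch. All model parameters below (the prior and noise variances and the observed value) are illustrative assumptions, not values from the chapter; a Gaussian prior with a Gaussian likelihood is used so that the grid result can be compared against the known conjugate closed form.

```python
import numpy as np

# Assumed scalar model: prior theta ~ N(0, s0^2), likelihood Y|theta ~ N(theta, s^2).
s0, s = 2.0, 1.0          # prior and noise standard deviations (illustrative)
y = 1.5                   # observed value (illustrative)

# Closed-form conjugate posterior: Gaussian with these mean and variance
post_var = 1.0 / (1.0 / s0**2 + 1.0 / s**2)
post_mean = post_var * y / s**2

# Brute-force grid evaluation of the unnormalized posterior f(y|theta) f(theta)
theta = np.linspace(-10, 10, 20001)
unnorm = np.exp(-(y - theta)**2 / (2 * s**2)) * np.exp(-theta**2 / (2 * s0**2))
w = unnorm / unnorm.sum()          # normalize on the grid
grid_mean = np.sum(w * theta)      # posterior mean from the grid

print(post_mean, grid_mean)        # the two means should agree closely
```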

2. The MAP estimator returns the posterior mode; the MMSE estimator returns the posterior mean. Under a flat prior, MAP reduces to the MLE. Under a Gaussian posterior, the mode and the mean coincide, so MAP and MMSE agree.

3. The MMSE estimator is the conditional mean: $\hat\theta_{\text{MMSE}}(\mathbf{Y}) = \mathbb{E}[\boldsymbol\theta \mid \mathbf{Y}]$. This is proved via the orthogonality principle, which states that the residual $\boldsymbol\theta - \hat\theta^\star(\mathbf{Y})$ is uncorrelated with every function of the observation.

4. The LMMSE estimator, restricted to affine functions of $\mathbf{Y}$, has the closed form $\hat\theta_{\text{LMMSE}} = \mathbf{m}_\theta + \boldsymbol\Sigma_{\theta y}\boldsymbol\Sigma_y^{-1}(\mathbf{Y} - \mathbf{m}_y)$, requiring only second-order statistics. Its error covariance is $\boldsymbol\Sigma_{\theta|y} = \boldsymbol\Sigma_\theta - \boldsymbol\Sigma_{\theta y}\boldsymbol\Sigma_y^{-1}\boldsymbol\Sigma_{y\theta}$.
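The closed form can be exercised directly. The sketch below assumes a zero-mean linear model, Y = H theta + noise, purely for illustration, so that the required second-order statistics are easy to write down; the dimensions and covariances are arbitrary choices.

```python
import numpy as np

# Assumed illustrative model: Y = H theta + n, everything zero-mean,
# so m_theta = m_y = 0 and only covariances are needed.
rng = np.random.default_rng(0)
n, m = 3, 5                        # dim(theta), dim(Y) (illustrative)
H = rng.standard_normal((m, n))
Sigma_theta = np.eye(n)            # prior covariance of theta
Sigma_noise = 0.1 * np.eye(m)      # noise covariance

Sigma_y = H @ Sigma_theta @ H.T + Sigma_noise   # Cov(Y)
Sigma_ty = Sigma_theta @ H.T                    # Cov(theta, Y)

# LMMSE gain: theta_hat = Sigma_ty Sigma_y^{-1} Y  (means are zero here)
W = Sigma_ty @ np.linalg.inv(Sigma_y)
# Error covariance: Sigma_theta - Sigma_ty Sigma_y^{-1} Sigma_yt
Sigma_err = Sigma_theta - W @ Sigma_ty.T

theta = rng.standard_normal(n)
y = H @ theta + rng.multivariate_normal(np.zeros(m), Sigma_noise)
theta_hat = W @ y                  # LMMSE estimate from one realization
```

Note that only second-order statistics (the covariances) enter the gain and the error covariance; no full distributions are required, which is the point of the restriction to affine estimators.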

5. For jointly Gaussian $(\boldsymbol\theta, \mathbf{Y})$, the conditional mean is affine, so MMSE = LMMSE = MAP. This fact is responsible for the ubiquity of linear receivers in wireless communications.

6. Pilot-based channel estimation illustrates the Bayesian advantage: the LS estimator is the BLUE when no prior is available, while the MMSE estimator strictly outperforms it whenever an informative channel covariance is known, with the largest gains at low SNR.
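A small Monte Carlo comparison makes the point numerically. The pilot matrix, dimensions, and noise level below are illustrative assumptions; the noise power is set comparable to the signal power to represent the low-SNR regime where the MMSE gain is largest.

```python
import numpy as np

# Assumed illustrative pilot model: y = X h + n, with known pilot matrix X,
# channel h ~ N(0, Sigma_h), noise n ~ N(0, sigma2 I).
rng = np.random.default_rng(1)
n_h, n_p = 4, 8                     # channel taps, pilot symbols (illustrative)
X = rng.standard_normal((n_p, n_h))
Sigma_h = np.eye(n_h)
sigma2 = 2.0                        # low SNR: noise power comparable to signal

ls_gain = np.linalg.inv(X.T @ X) @ X.T   # LS estimator (BLUE, no prior used)
mmse_gain = Sigma_h @ X.T @ np.linalg.inv(X @ Sigma_h @ X.T + sigma2 * np.eye(n_p))

err_ls = err_mmse = 0.0
trials = 2000
for _ in range(trials):
    h = rng.standard_normal(n_h)
    y = X @ h + np.sqrt(sigma2) * rng.standard_normal(n_p)
    err_ls += np.sum((ls_gain @ y - h) ** 2)
    err_mmse += np.sum((mmse_gain @ y - h) ** 2)

print(err_ls / trials, err_mmse / trials)   # MMSE error should be smaller
```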

7. Orthogonality is the unifying principle: whether proving that the MMSE estimator is the conditional mean, deriving the LMMSE normal equations, or verifying that the Kalman filter innovation is white, one repeatedly invokes "the residual is uncorrelated with what we've already used".
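The orthogonality statement is easy to verify numerically. The scalar model below is an assumption chosen for simplicity: for the LMMSE estimate, the sample correlation between the residual and the observation should shrink toward zero as the number of trials grows.

```python
import numpy as np

# Assumed scalar model for illustration: Y = a*theta + noise, all zero-mean.
rng = np.random.default_rng(2)
trials = 200_000
a, sigma = 0.8, 0.5
theta = rng.standard_normal(trials)
y = a * theta + sigma * rng.standard_normal(trials)

# Scalar LMMSE gain: Cov(theta, Y) / Var(Y) = a / (a^2 + sigma^2)
w = a / (a**2 + sigma**2)
residual = theta - w * y

print(np.mean(residual * y))   # sample E[(theta - theta_hat) Y], approx. 0
```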

Looking Ahead

The next chapter develops the EM algorithm, which extends Bayesian estimation to problems where the posterior contains latent variables or mixture components and therefore no closed form exists. From Chapter 9 onward the LMMSE viewpoint is carried into the frequency domain (Wiener filters) and into recursive estimation (Kalman filters) — all of them orthogonality-principle calculations in disguise.