Chapter Summary

Key Points

1. The conditional expectation $\mathbb{E}[X|Y]$ is a random variable, a function of $Y$, not a number. Its key properties are linearity, the tower property ($\mathbb{E}[\mathbb{E}[X|Y]] = \mathbb{E}[X]$), pulling out what is known, and the fact that if $X$ and $Y$ are independent, then $\mathbb{E}[X|Y] = \mathbb{E}[X]$.

2. The MMSE estimator of $X$ given $Y$ is the conditional expectation: $\hat{X}_{\text{MMSE}} = \mathbb{E}[X|Y]$. It minimizes the mean square error over all measurable functions of $Y$.

3. The orthogonality principle states that the MMSE estimation error $X - \mathbb{E}[X|Y]$ is orthogonal to every function of $Y$. Geometrically, $\mathbb{E}[X|Y]$ is the projection of $X$ onto the subspace of functions of $Y$.

4. The LMMSE estimator restricts to affine functions and requires only first and second moments: $\hat{X}_{\text{LMMSE}} = \mu_X + \mathbf{C}_{XY}\mathbf{C}_{YY}^{-1}(\mathbf{Y} - \boldsymbol{\mu}_Y)$. For jointly Gaussian data, the LMMSE estimator equals the MMSE estimator.

5. The law of total variance, $\text{Var}(X) = \mathbb{E}[\text{Var}(X|Y)] + \text{Var}(\mathbb{E}[X|Y])$, decomposes the variance of $X$ into unexplained and explained components. The MMSE equals the average conditional variance $\mathbb{E}[\text{Var}(X|Y)]$.
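
The key points above can be checked numerically. Below is a minimal NumPy sketch, not from the chapter: the variable names, means, and variances are illustrative choices. It builds a jointly Gaussian pair, forms the LMMSE estimator from sample moments (which equals the MMSE estimator here, per point 4), and verifies the tower property, orthogonality, optimality, and the law of total variance:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Jointly Gaussian pair: X = 2Y + noise, illustrative parameters
y = rng.normal(1.0, 2.0, n)
x = 2.0 * y + rng.normal(0.0, 1.5, n)

# First and second sample moments (all the LMMSE estimator needs)
mu_x, mu_y = x.mean(), y.mean()
c_xy = np.cov(x, y)[0, 1]
c_yy = y.var(ddof=1)

# LMMSE estimate and estimation error
x_hat = mu_x + (c_xy / c_yy) * (y - mu_y)
err = x - x_hat

# Point 1 (tower property): E[E[X|Y]] = E[X]
assert abs(x_hat.mean() - mu_x) < 1e-6

# Point 3 (orthogonality): error is uncorrelated with Y
assert abs(np.mean(err * y)) < 1e-6

# Point 2 (optimality): any other affine estimator has larger MSE
mse_opt = np.mean(err**2)
mse_alt = np.mean((x - (mu_x + 1.5 * (y - mu_y)))**2)
assert mse_opt < mse_alt

# Point 5 (total variance): Var(X) = E[Var(X|Y)] + Var(E[X|Y])
assert abs(x.var() - (err.var() + x_hat.var())) < 1e-6
```

Note that the orthogonality and total-variance identities hold exactly (up to floating point) once the coefficients are computed from the same sample moments; with population moments they hold in expectation.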

Looking Ahead

Chapter 13 introduces stochastic processes — random functions of time. The conditional expectation and LMMSE tools from this chapter become the foundation for Wiener filtering (optimal linear prediction of stationary processes) and Kalman filtering (recursive state estimation for dynamical systems), treated in the FSI book.
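
As a small preview of why the LMMSE machinery scales to the filtering problems mentioned above, here is a hedged sketch, with illustrative parameters of my own choosing, of the vector form $\hat{X} = \mu_X + \mathbf{C}_{XY}\mathbf{C}_{YY}^{-1}(\mathbf{Y} - \boldsymbol{\mu}_Y)$: a scalar $X$ observed through two sensors of different quality, fused by the LMMSE weights.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Scalar X observed through two noisy sensors: Y = [X + n1, X + n2]
x = rng.normal(0.0, 1.0, n)
y = np.stack([x + rng.normal(0.0, 0.5, n),    # good sensor
              x + rng.normal(0.0, 1.0, n)])   # worse sensor

mu_x = x.mean()
mu_y = y.mean(axis=1, keepdims=True)
c_yy = np.cov(y)                   # 2x2 covariance of Y
c_xy = np.cov(x, y)[0, 1:]         # Cov(X, Y_1), Cov(X, Y_2)

# Vector LMMSE: weights C_XY C_YY^{-1}, applied to centered observations
w = c_xy @ np.linalg.inv(c_yy)
x_hat = mu_x + w @ (y - mu_y)

# The less noisy sensor receives the larger weight
assert w[0] > w[1]

# Fusing both sensors beats trusting the good sensor alone
mse = np.mean((x - x_hat)**2)
assert mse < np.mean((x - y[0])**2)
```

The Wiener and Kalman filters apply exactly this covariance-weighted projection, with $\mathbf{Y}$ being past observations of a process rather than two static sensors.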