Chapter Summary

Key Points

1. Bias-variance identity. For any estimator of a scalar $\theta$, $\mathrm{MSE}_\theta(\hat{\theta}) = b(\theta)^2 + \mathrm{Var}_\theta(\hat{\theta})$. Tuning an estimator is a trade between these two terms; biased estimators can beat unbiased ones on MSE.
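
   The trade-off is easy to see in a minimal Monte Carlo sketch (illustrative values; the 0.8 shrinkage factor is arbitrary, chosen only to make the biased estimator win):

   ```python
   import numpy as np

   rng = np.random.default_rng(0)
   theta, sigma, n, trials = 1.0, 2.0, 10, 200_000

   # n i.i.d. N(theta, sigma^2) observations per trial
   Y = rng.normal(theta, sigma, size=(trials, n))
   xbar = Y.mean(axis=1)        # unbiased: MSE = sigma^2 / n
   shrunk = 0.8 * xbar          # biased: b = -0.2*theta, Var = 0.64*sigma^2/n

   def mse(est):
       return np.mean((est - theta) ** 2)

   print(f"unbiased MSE: {mse(xbar):.4f}  (theory {sigma**2 / n:.4f})")
   print(f"shrunk   MSE: {mse(shrunk):.4f}  (theory {(0.2*theta)**2 + 0.64*sigma**2/n:.4f})")
   ```

   With these values the shrunk estimator's bias$^2$ (0.04) is more than paid for by its smaller variance, so its MSE beats the unbiased sample mean.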

2. Fisher information. Under regularity conditions, $J(\theta) = \mathrm{Var}_\theta(\partial_\theta \log f_\theta(\mathbf{Y})) = -\mathbb{E}_\theta[\partial^2_\theta \log f_\theta(\mathbf{Y})]$. For independent observations it is additive; for i.i.d. samples, $J(\theta) = n\,J_1(\theta)$.
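
   A quick numerical check that the two forms agree, for a single $\mathcal{N}(\theta, \sigma^2)$ observation (illustrative values):

   ```python
   import numpy as np

   rng = np.random.default_rng(1)
   theta, sigma = 0.5, 1.5
   y = rng.normal(theta, sigma, size=1_000_000)

   # Score of one N(theta, sigma^2) observation: d/dtheta log f = (y - theta)/sigma^2
   score = (y - theta) / sigma**2
   J1_var = score.var()          # variance form of Fisher information (Monte Carlo)
   J1_curv = 1.0 / sigma**2      # -E[second derivative] = 1/sigma^2, exact here
   print(J1_var, J1_curv)        # both ~ 1/sigma^2; n i.i.d. samples carry n*J1
   ```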

3. CRB (scalar). Any unbiased estimator satisfies $\mathrm{Var}_\theta(\hat{\theta}(\mathbf{Y})) \geq 1/J(\theta)$. The proof is Cauchy-Schwarz applied to the centered estimator and the score; equality (efficiency) holds iff the score is affine in $\hat{\theta}$.
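
   As a sketch, the sample mean of Gaussian data attains this bound, since $\mathrm{Var}(\bar{x}) = \sigma^2/n = 1/(n J_1)$ (illustrative parameters):

   ```python
   import numpy as np

   rng = np.random.default_rng(2)
   theta, sigma, n, trials = 0.0, 1.0, 25, 100_000
   xbar = rng.normal(theta, sigma, size=(trials, n)).mean(axis=1)

   crb = sigma**2 / n            # 1 / (n * J1), with J1 = 1/sigma^2
   print(xbar.var(), crb)        # sample mean is efficient: variance matches the CRB
   ```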

4. CRB (vector). $\mathrm{Cov}_{\boldsymbol{\theta}}(\hat{\boldsymbol{\theta}}) \succeq \mathbf{J}(\boldsymbol{\theta})^{-1}$. The componentwise bound $[\mathbf{J}^{-1}]_{ii}$ is generally larger than $1/[\mathbf{J}]_{ii}$; the gap measures the price of joint estimation.
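
   The componentwise comparison, for an illustrative $2 \times 2$ Fisher matrix with cross-information between the parameters:

   ```python
   import numpy as np

   # Hypothetical Fisher matrix; the off-diagonal term couples the two parameters
   J = np.array([[2.0, 1.0],
                 [1.0, 2.0]])
   Jinv = np.linalg.inv(J)

   for i in range(2):
       print(f"[J^-1]_{i}{i} = {Jinv[i, i]:.3f}  >=  1/J_{i}{i} = {1/J[i, i]:.3f}")
   # 0.667 vs 0.500 here: the gap is the price of not knowing the other parameter
   ```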

5. Fisher-Neyman factorization. $T(\mathbf{Y})$ is sufficient iff $f_\theta(\mathbf{y}) = g_\theta(T(\mathbf{y}))\, h(\mathbf{y})$. In practice you identify the $\theta$-dependence in the likelihood and read off $T$. The exponential family $f_\theta(\mathbf{y}) = h(\mathbf{y}) \exp(\eta(\theta)^T T(\mathbf{y}) - A(\theta))$ makes $T(\mathbf{Y})$ automatically sufficient; when the natural parameter's image has full dimension, it is also complete.
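
   Sufficiency can also be checked empirically: conditioned on $T$, the data's distribution no longer depends on $\theta$. A sketch for Bernoulli($p$) samples with $T = \sum_i Y_i$ (illustrative values):

   ```python
   import numpy as np

   rng = np.random.default_rng(6)
   n, trials = 4, 400_000

   def p_first_given_T(p, t=2):
       # Simulate n Bernoulli(p) bits per trial, then condition on T = sum(Y) = t
       Y = (rng.random((trials, n)) < p).astype(int)
       sel = Y.sum(axis=1) == t
       return Y[sel, 0].mean()       # Monte Carlo estimate of P(Y_1 = 1 | T = t)

   # Given T, the conditional law is free of p: both values come out ~ t/n = 0.5
   print(p_first_given_T(0.3), p_first_given_T(0.7))
   ```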

6. Rao-Blackwell. Conditioning any unbiased estimator on a sufficient statistic produces an unbiased estimator with no larger variance. It is a statistical $L^2$ projection: equality holds iff the original estimator was already a function of the sufficient statistic.
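
   A sketch for the Gaussian mean: start from the wasteful unbiased estimator $Y_1$ and condition on the sufficient statistic $\bar{x}$, where $\mathbb{E}[Y_1 \mid \bar{x}] = \bar{x}$ by symmetry (illustrative values):

   ```python
   import numpy as np

   rng = np.random.default_rng(3)
   theta, sigma, n, trials = 1.0, 1.0, 8, 200_000
   Y = rng.normal(theta, sigma, size=(trials, n))

   crude = Y[:, 0]                  # unbiased but wasteful: Var = sigma^2
   # Rao-Blackwellization: E[Y_1 | xbar] = xbar, so the improved estimator is xbar
   rb = Y.mean(axis=1)              # Var = sigma^2 / n

   print(crude.mean(), rb.mean())   # both ~ theta: unbiasedness is preserved
   print(crude.var(), rb.var())     # variance drops by a factor of n
   ```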

7. Lehmann-Scheffé. When $T$ is a complete sufficient statistic, any unbiased function of $T$ is the unique MVUE. This gives a constructive MVUE recipe: find a complete sufficient $T$, find any unbiased function of $T$, done. Efficiency implies MVUE, but not conversely (e.g., the Bessel-corrected sample variance is MVUE but not efficient).
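
   The gap in the parenthetical example can be checked numerically: for Gaussian data with unknown mean, $S^2$ has variance $2\sigma^4/(n-1)$, strictly above the CRB $2\sigma^4/n$ (illustrative values):

   ```python
   import numpy as np

   rng = np.random.default_rng(4)
   sigma2, n, trials = 2.0, 10, 400_000
   Y = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))

   S2 = Y.var(axis=1, ddof=1)        # Bessel-corrected sample variance (mean unknown)
   print(S2.var())                   # ~ 2*sigma2^2/(n-1): the MVUE's variance
   print(2 * sigma2**2 / n)          # CRB: strictly smaller, so S2 is MVUE yet not efficient
   ```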

8. Engineering relevance. The matched filter output is a sufficient statistic. Pilot SNR directly controls the CRB on channel estimates. In ISAC, the CRB on target parameters defines one side of the sensing-communication Pareto frontier.
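
   A sketch of the pilot-SNR point, for a real Gaussian linear model $\mathbf{y} = \theta\,\mathbf{s} + \mathbf{w}$ with a hypothetical pilot $\mathbf{s}$: the matched-filter estimate attains the CRB $\sigma^2/\|\mathbf{s}\|^2$, so doubling pilot energy halves the bound.

   ```python
   import numpy as np

   rng = np.random.default_rng(5)
   sigma, trials = 1.0, 200_000
   s = np.array([1.0, -1.0, 1.0, 1.0])   # known pilot waveform (illustrative)
   theta = 0.7                           # unknown channel gain to estimate

   W = rng.normal(0.0, sigma, size=(trials, s.size))
   Y = theta * s + W
   # Matched filter: the sufficient statistic s^T y, normalized to be unbiased
   theta_hat = Y @ s / (s @ s)

   crb = sigma**2 / (s @ s)              # CRB for this Gaussian linear model
   print(theta_hat.var(), crb)           # matched-filter estimate attains the CRB
   ```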

Looking Ahead

Chapter 6 turns our attention from the benchmark to a general-purpose procedure: maximum likelihood. We will prove that the MLE is asymptotically unbiased, consistent, and efficient, so it reaches the CRB in the limit of large data, and work through its closed-form solutions (the Gaussian linear model) and iterative ones (Newton-Raphson, Fisher scoring). The CRB we built here is the yardstick against which every MLE derivation in Chapter 6 will be measured.