Chapter 6 Summary: Maximum Likelihood Estimation
Key Points
1. The MLE maximizes the likelihood (or, equivalently, the log-likelihood) of the observed data: $\hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta} \ln p(\mathbf{x};\theta)$. At interior maxima it solves the score equation $\partial \ln p(\mathbf{x};\theta)/\partial\theta = 0$. (Each key point in this list is illustrated by a short code sketch after the list.)
2. Under regularity conditions, the MLE is consistent ($\hat{\theta}_{\mathrm{ML}} \xrightarrow{p} \theta$), asymptotically normal ($\sqrt{n}\,(\hat{\theta}_{\mathrm{ML}} - \theta) \xrightarrow{d} \mathcal{N}(0, I_1(\theta)^{-1})$, with $I_1$ the per-sample Fisher information), and asymptotically efficient (it achieves the CRLB as $n \to \infty$).
3. The invariance property: for any function $g$, the MLE of $g(\theta)$ is $g(\hat{\theta}_{\mathrm{ML}})$. This lets us transport MLEs across parameterizations without re-deriving; for example, the MLE of $\sigma$ is the square root of the MLE of $\sigma^2$.
4. Support-dependent models (uniform on $[0,\theta]$ and similar) break regularity: the MLE converges at rate $n$ instead of $\sqrt{n}$, the limiting distribution is non-Gaussian, and the CRLB does not apply.
5. For Gaussian linear models $\mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w}$ with $\mathbf{w} \sim \mathcal{N}(\mathbf{0}, \mathbf{C})$, the MLE is the closed-form weighted least-squares estimate $\hat{\boldsymbol{\theta}} = (\mathbf{H}^{T}\mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^{T}\mathbf{C}^{-1}\mathbf{x}$ and achieves the CRLB exactly.
6. Newton-Raphson uses the observed Hessian and converges quadratically near the solution but can diverge from poor starting points; Fisher scoring replaces the Hessian with its expectation, the FIM, which is always positive semidefinite. The two coincide for exponential families in canonical form.
7. The periodogram peak is the MLE of frequency for a single complex sinusoid in AWGN; the matched-filter peak is the MLE of delay; and the multi-source DOA MLE reduces to MUSIC/ESPRIT in the high-SNR regime.
8. In practice, iterative ML needs multi-start initialization for non-concave likelihoods, Armijo damping to safeguard Newton steps, and log-domain arithmetic for numerical stability.
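Point 1 in code: a minimal sketch assuming i.i.d. exponential data with an illustrative rate of 2 (a model chosen here only because its MLE is closed-form). Setting the score to zero gives $\hat{\lambda} = 1/\bar{x}$, and the score evaluates to (numerically) zero at the maximizer:

```python
import numpy as np

rng = np.random.default_rng(0)

# i.i.d. exponential data with true rate lambda = 2.0 (illustrative choice)
lam_true = 2.0
x = rng.exponential(scale=1.0 / lam_true, size=10_000)

# Log-likelihood: l(lam) = n*log(lam) - lam*sum(x)
# Score: dl/dlam = n/lam - sum(x); setting it to zero gives lam_hat = 1/mean(x)
lam_hat = 1.0 / x.mean()

score_at_mle = x.size / lam_hat - x.sum()  # ~0 at the interior maximum
print(f"lam_hat = {lam_hat:.4f}, score at MLE = {score_at_mle:.2e}")
```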
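Point 2 in code: a Monte Carlo check of asymptotic normality for a hypothetical Bernoulli($p$) sample, where the MLE is the sample mean and the per-sample Fisher information is $1/(p(1-p))$; the empirical variance of $\sqrt{n}(\hat{p}-p)$ should match $I_1(p)^{-1} = p(1-p)$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Bernoulli(p): MLE is the sample mean; I_1(p) = 1/(p(1-p))
p, n = 0.3, 400
z = [np.sqrt(n) * (rng.binomial(1, p, n).mean() - p) for _ in range(5_000)]

print(f"empirical var of sqrt(n)(p_hat - p): {np.var(z):.4f}")
print(f"inverse Fisher info p(1-p):          {p * (1 - p):.4f}")
```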
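Point 3 in code: a sketch of invariance for a hypothetical zero-mean Gaussian sample. The MLE of $\sigma$ follows from the MLE of $\sigma^2$ by applying the square root, with no re-maximization in the new parameterization:

```python
import numpy as np

rng = np.random.default_rng(2)

# Zero-mean Gaussian data: the MLE of sigma^2 is the mean of squares
x = rng.normal(0.0, 1.5, size=5_000)
sigma2_hat = np.mean(x ** 2)

# Invariance: the MLE of sigma = sqrt(sigma^2) is sqrt(sigma2_hat)
print(f"sigma2_hat = {sigma2_hat:.4f}, sigma_hat = {np.sqrt(sigma2_hat):.4f}")
```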
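Point 4 in code: a simulation of the non-regular uniform-$[0,\theta]$ case (with an illustrative $\theta = 3$). The MLE is the sample maximum, and its mean error tracks $\theta/n$ rather than the $1/\sqrt{n}$ scaling of regular models:

```python
import numpy as np

rng = np.random.default_rng(3)
theta = 3.0  # true endpoint (illustrative)

for n in (100, 1_000, 10_000):
    # MLE for Uniform(0, theta) is the sample maximum
    errs = [theta - rng.uniform(0.0, theta, size=n).max() for _ in range(2_000)]
    # mean error scales like theta/n (rate n), not like 1/sqrt(n)
    print(f"n={n:>6d}: mean(theta - theta_hat) = {np.mean(errs):.5f}  "
          f"(theta/n = {theta / n:.5f})")
```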
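Point 5 in code: the closed-form weighted least-squares MLE for a small synthetic Gaussian linear model (dimensions and noise covariance are illustrative). The CRLB for this model is $(\mathbf{H}^{T}\mathbf{C}^{-1}\mathbf{H})^{-1}$, which the estimator attains exactly:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic Gaussian linear model x = H @ theta + w, w ~ N(0, C)
n, p = 50, 3
H = rng.standard_normal((n, p))
theta_true = np.array([1.0, -2.0, 0.5])
C = np.diag(rng.uniform(0.5, 2.0, size=n))        # known noise covariance
x = H @ theta_true + rng.multivariate_normal(np.zeros(n), C)

# Closed-form MLE: theta_hat = (H^T C^{-1} H)^{-1} H^T C^{-1} x
Ci = np.linalg.inv(C)
theta_hat = np.linalg.solve(H.T @ Ci @ H, H.T @ Ci @ x)
print("theta_hat =", np.round(theta_hat, 3))

# CRLB covariance, attained exactly by theta_hat
crlb = np.linalg.inv(H.T @ Ci @ H)
print("CRLB diagonal =", np.round(np.diag(crlb), 4))
```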
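Point 6 in code: Newton-Raphson versus Fisher scoring for a Cauchy location model (scale 1), an illustrative non-exponential-family case where the observed Hessian and the FIM genuinely differ; the FIM per sample is $1/2$ here:

```python
import numpy as np

rng = np.random.default_rng(5)

# Cauchy location model, scale 1: observed Hessian != expected Hessian (FIM)
x = rng.standard_cauchy(1_000) + 0.7  # true location 0.7

def score(mu):
    d = x - mu
    return np.sum(2.0 * d / (1.0 + d * d))

def hessian(mu):
    d2 = (x - mu) ** 2
    return np.sum(2.0 * (d2 - 1.0) / (1.0 + d2) ** 2)

fim = x.size / 2.0  # Fisher information: n * 1/2 for Cauchy location, scale 1

mu_nr = mu_fs = np.median(x)  # robust initializer
for _ in range(10):
    mu_nr -= score(mu_nr) / hessian(mu_nr)  # Newton-Raphson (observed Hessian)
    mu_fs += score(mu_fs) / fim             # Fisher scoring (expected Hessian)
print(f"Newton-Raphson: {mu_nr:.4f}, Fisher scoring: {mu_fs:.4f}")
```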
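Point 7 in code: the periodogram peak as the frequency MLE for a single complex sinusoid in AWGN, with illustrative length, frequency, and SNR; zero-padding the FFT refines the grid search over candidate frequencies:

```python
import numpy as np

rng = np.random.default_rng(6)

# Single complex sinusoid in complex AWGN (illustrative parameters)
n = 256
f_true = 0.2031          # cycles/sample
t = np.arange(n)
x = np.exp(2j * np.pi * f_true * t) + 0.5 * (rng.standard_normal(n)
                                             + 1j * rng.standard_normal(n))

# Periodogram peak = frequency MLE; zero-padding gives a finer frequency grid
nfft = 8 * n
spec = np.abs(np.fft.fft(x, nfft)) ** 2
f_hat = np.argmax(spec) / nfft
print(f"f_true = {f_true:.4f}, f_hat = {f_hat:.4f}")
```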
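Point 8 in code: a sketch of why log-domain arithmetic matters, using 2000 hypothetical likelihood factors of $e^{-50}$ each. The direct product underflows to zero, while log-domain sums (including the standard log-sum-exp shift) stay representable:

```python
import numpy as np

# 2000 likelihood factors, each exp(-50): the direct product underflows,
# but products and sums remain representable in the log domain.
log_terms = np.full(2_000, -50.0)

def logsumexp(a):
    m = a.max()                               # shift out the largest term
    return m + np.log(np.sum(np.exp(a - m)))  # stable log of a sum

print("naive product :", np.prod(np.exp(log_terms)))  # underflows to 0.0
print("log of product:", log_terms.sum())             # exact: -100000.0
print("log of sum    :", logsumexp(log_terms))        # -50 + log(2000)
```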
Looking Ahead
Chapter 7 develops Bayesian estimation: when a prior is available, the MAP estimator replaces the MLE as the natural point estimate, and the MMSE/LMMSE estimators minimize expected squared error. Chapter 8 turns to the EM algorithm for ML with latent variables, converting the intractable marginal likelihood of hidden-variable models into a sequence of tractable complete-data M-steps. The signal-processing MLEs introduced here reappear throughout Part III (linear estimation, Kalman filtering, channel estimation) and Part IV (sparse recovery, massive random access).