Prerequisites & Notation

Before You Begin

This chapter sits at the intersection of estimation theory, random matrix theory, and empirical Bayes. The classical CRLB-centred picture that dominates Chapters 4–7 is not wrong; it is simply incomplete. In modern applications (massive MIMO, compressed sensing, covariance estimation, high-dimensional inference) the dimension $N$ of the parameter vector grows in lockstep with the number of observations $M$. Classical consistency arguments, which hold $N$ fixed and let $M \to \infty$, collapse in this regime, and the behaviour of the MLE can swing from "efficient" to "spectacularly wrong". The reader should be comfortable with the following before continuing.
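
To see the collapse concretely, consider ordinary least squares in the Gaussian linear model. A minimal simulation sketch (the model, the noise level $\sigma = 1$, and $M = 400$ are illustrative assumptions, not values from the chapter): the total squared error does not vanish as $M$ grows with $\gamma = N/M$ fixed, and it diverges as $\gamma \to 1$.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, M = 1.0, 400

# Total squared error E||x_hat - x||^2 of least squares with i.i.d.
# Gaussian A: in the proportional regime it converges to
# sigma^2 * gamma / (1 - gamma) instead of going to zero,
# and it blows up as gamma -> 1.
for gamma in (0.1, 0.5, 0.9):
    N = int(gamma * M)
    errs = []
    for _ in range(100):
        A = rng.standard_normal((M, N))
        x = rng.standard_normal(N)
        y = A @ x + sigma * rng.standard_normal(M)
        x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
        errs.append(np.sum((x_hat - x) ** 2))
    print(f"gamma={gamma:.1f}  empirical={np.mean(errs):6.3f}  "
          f"theory={sigma**2 * gamma / (1 - gamma):6.3f}")
```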

  • Maximum likelihood estimation and the Cramér–Rao lower bound (Review ch04)

    Self-check: Can you state the CRLB for a vector parameter, write down the Fisher information matrix, and explain when the MLE achieves the bound? (A worked Gaussian-linear-model example follows this list.)

  • Linear MMSE estimation and the Wiener filter (Review ch05)

    Self-check: Can you derive the LMMSE estimator for the Gaussian linear model $\mathbf{y}=\mathbf{A}\mathbf{x}+\mathbf{w}$ and compute its MSE? (The closed forms are restated after this list.)

  • Compressed sensing and LASSO at a conceptual level (Review ch17)

    Self-check: Can you state the $\ell_1$-minimisation programme and recognise why an $\ell_1$ penalty promotes sparsity? (The proximal-gradient sketch after this list solves exactly this programme.)

  • Random matrix theory essentials

    Self-check: Do you know what the Marchenko–Pastur law says about the eigenvalues of $\frac{1}{M}\mathbf{A}^H\mathbf{A}$ when $\mathbf{A}$ has i.i.d. entries? (A quick simulation follows this list.)

  • Convex optimisation (unconstrained and penalised)

    Self-check: Can you recognise a convex problem, write the KKT conditions for a quadratic-plus-$\ell_1$ objective, and describe proximal-gradient iteration? (See the ISTA sketch after this list.)

  • Bayesian decision theory (Review ch06)

    Self-check: Can you compute a Bayes risk, define admissibility, and explain the relationship between a minimax estimator and a least-favourable prior? (A one-line Gaussian example closes this list.)
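
For the first self-check, the Gaussian linear model from Chapter 4 makes everything concrete. A minimal worked example, assuming $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{w}$ with $\mathbf{w} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I})$ and $\mathbf{A}^T\mathbf{A}$ invertible:

```latex
% Fisher information and CRLB for y = A x + w,  w ~ N(0, sigma^2 I):
\[
  \mathbf{I}(\mathbf{x}) = \frac{1}{\sigma^{2}}\,\mathbf{A}^{T}\mathbf{A},
  \qquad
  \operatorname{Cov}(\hat{\mathbf{x}}) \succeq \mathbf{I}(\mathbf{x})^{-1}
    = \sigma^{2}\,(\mathbf{A}^{T}\mathbf{A})^{-1} .
\]
```

The MLE $\hat{\mathbf{x}} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{y}$ is unbiased with covariance exactly $\sigma^2(\mathbf{A}^T\mathbf{A})^{-1}$, so it attains the bound in this model; the chapter examines when that classical guarantee stops being the right notion of optimality.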
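
For the LMMSE self-check, the closed forms are worth having at hand. They are stated here under the assumptions $\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \mathbf{C}_x)$ and $\mathbf{w} \sim \mathcal{N}(\mathbf{0}, \mathbf{C}_w)$, independent; the covariance symbols $\mathbf{C}_x, \mathbf{C}_w$ are notation chosen here for illustration.

```latex
% LMMSE estimator and its MSE for y = A x + w:
\[
  \hat{\mathbf{x}}_{\text{LMMSE}}
    = \mathbf{C}_{x}\mathbf{A}^{T}
      \bigl(\mathbf{A}\mathbf{C}_{x}\mathbf{A}^{T}+\mathbf{C}_{w}\bigr)^{-1}\mathbf{y},
  \qquad
  \mathrm{MSE}
    = \operatorname{tr}\bigl[\bigl(\mathbf{C}_{x}^{-1}
      + \mathbf{A}^{T}\mathbf{C}_{w}^{-1}\mathbf{A}\bigr)^{-1}\bigr] .
\]
```

The second expression is the trace of the posterior covariance; reconciling it with the first form is a standard matrix-inversion-lemma exercise.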
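
The LASSO and proximal-gradient self-checks meet in a single algorithm: ISTA (iterative soft-thresholding) is proximal-gradient descent applied to the quadratic-plus-$\ell_1$ objective. A minimal sketch, assuming the step size $1/\|\mathbf{A}\|_2^2$ and a fixed iteration count, both illustrative choices:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1: shrink every entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    """Minimise 0.5*||y - A x||^2 + lam*||x||_1 by proximal gradient."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)       # gradient of 0.5*||y - A x||^2
        x = soft_threshold(x - grad / L, lam / L)
    return x
```

A fixed point of the update satisfies $\mathbf{A}^T(\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}) = \lambda\,\mathbf{g}$ with $\mathbf{g} \in \partial\|\hat{\mathbf{x}}\|_1$, which is exactly the KKT condition the self-check asks for.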
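
For the random-matrix self-check, a direct simulation is the fastest reality check. A minimal sketch (real Gaussian entries, so $\mathbf{A}^H = \mathbf{A}^T$, and $\gamma = 1/2$; all illustrative choices): the eigenvalues of $\frac{1}{M}\mathbf{A}^T\mathbf{A}$ spread over the Marchenko–Pastur support $[(1-\sqrt{\gamma})^2, (1+\sqrt{\gamma})^2]$ instead of concentrating at 1.

```python
import numpy as np

M, N = 2000, 1000                      # aspect ratio gamma = N / M = 0.5
gamma = N / M
A = np.random.default_rng(1).standard_normal((M, N))
eigs = np.linalg.eigvalsh(A.T @ A / M)

# Marchenko-Pastur support edges for unit-variance i.i.d. entries:
lo, hi = (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2
print(f"empirical eigenvalue range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"MP support:                 [{lo:.3f}, {hi:.3f}]")
```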
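
For the decision-theory self-check, the scalar Gaussian location problem is the classic worked example (standard material, not specific to this chapter): observe $y \sim \mathcal{N}(\theta, \sigma^2)$ with prior $\theta \sim \mathcal{N}(0, \tau^2)$.

```latex
% Bayes estimator and Bayes risk under squared error:
\[
  \hat\theta_{\text{Bayes}} = \frac{\tau^{2}}{\tau^{2}+\sigma^{2}}\, y,
  \qquad
  r(\pi) = \frac{\tau^{2}\sigma^{2}}{\tau^{2}+\sigma^{2}}
  \;\xrightarrow[\;\tau^{2}\to\infty\;]{}\; \sigma^{2} .
\]
```

The Bayes risk increases monotonically to $\sigma^2$, which is the minimax risk for an unrestricted mean; the flat limit plays the role of the least-favourable prior, and the limiting estimator $\hat\theta = y$ is minimax.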

Notation for This Chapter

Symbols used throughout Chapter 22. The ratio $\gamma = N/M$ is the single most important parameter; the entire chapter can be read as a study of how estimation behaves as this ratio departs from zero.

Symbol | Meaning | Introduced
------ | ------- | ----------
$N$ | Ambient dimension of the parameter vector $\mathbf{x}\in\mathbb{R}^N$ | s01
$M$ | Number of observations (rows of $\mathbf{A}$ / samples) | s01
$\gamma$ | Aspect ratio $\gamma = N/M$; the proportional-asymptotics regime fixes $\gamma\in(0,\infty)$ | s01
$\mathbf{A}$ | Design / sensing / measurement matrix in $\mathbb{R}^{M\times N}$ | s01
$\lambda$ | Regularisation parameter (ridge / LASSO penalty) | s02
$\hat{\mathbf{x}}_{\text{ridge}}(\lambda)$ | Ridge estimator $(\mathbf{A}^T\mathbf{A}+\lambda\mathbf{I})^{-1}\mathbf{A}^T\mathbf{y}$ | s02
$\hat{\mathbf{x}}_{\text{LASSO}}(\lambda)$ | LASSO estimator $\arg\min\,\tfrac12\|\mathbf{y}-\mathbf{A}\mathbf{x}\|^2+\lambda\|\mathbf{x}\|_1$ | s02
$\hat{\mathbf{x}}_{\text{JS}}$ | James–Stein estimator | s03
$R(\hat\theta,\theta)$ | Frequentist risk $\mathbb{E}_\theta[\|\hat\theta-\theta\|^2]$ | s03
$r_*^{\text{mm}}$ | Minimax risk $\inf_{\hat\theta}\sup_\theta R(\hat\theta,\theta)$ over a parameter class | s04
$s$ | Sparsity level: number of non-zero components of $\mathbf{x}$ | s04
$\pi^*$ | Least-favourable prior attaining the minimax risk | s04
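
The James–Stein row is the only estimator in the table without a formula; the standard form for $\mathbf{y}\sim\mathcal{N}(\mathbf{x},\sigma^2\mathbf{I}_N)$ with $N\ge 3$ is $\hat{\mathbf{x}}_{\text{JS}} = \bigl(1-(N-2)\sigma^2/\|\mathbf{y}\|^2\bigr)\mathbf{y}$. A minimal simulation sketch, showing the dominance over the MLE (the dimension, noise level, and random true vector below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma, trials = 50, 1.0, 20000
x = rng.standard_normal(N)             # arbitrary fixed true parameter

mse_mle = mse_js = 0.0
for _ in range(trials):
    y = x + sigma * rng.standard_normal(N)
    shrink = 1.0 - (N - 2) * sigma**2 / np.sum(y**2)
    x_js = shrink * y                  # James-Stein: shrink y toward the origin
    mse_mle += np.sum((y - x) ** 2)
    mse_js += np.sum((x_js - x) ** 2)

print(f"risk of the MLE (y itself): {mse_mle / trials:.2f}")  # ~ N*sigma^2 = 50
print(f"risk of James-Stein:        {mse_js / trials:.2f}")   # strictly smaller
```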