Chapter Summary

Key Points

1. Proportional asymptotics is the correct regime. When $N$ and $M$ are both large but comparable ($\gamma = N/M = \Theta(1)$), classical consistency results break down. The Marchenko–Pastur law describes the limiting eigenvalue distribution of $\tfrac{1}{M}\mathbf{A}^T\mathbf{A}$ and drives every risk calculation in the chapter (a simulation of the law appears after this list).

2. OLS risk blows up. In the proportional regime, the per-coordinate OLS risk is $\gamma\sigma^2/(1-\gamma)$, which diverges as $\gamma \to 1$. The MLE fails not because it is a bad estimator but because the problem becomes ill-conditioned (see the second sketch below).

3. Ridge has a closed-form optimal regularization. Under a Gaussian prior, the optimal ridge penalty is $\lambda^* = 1/\mathrm{SNR}$, a universally applicable heuristic that coincides with the LMMSE estimator. Ridge risk stays finite even for $\gamma \geq 1$, where OLS is undefined (third sketch below).

4. LASSO promotes sparsity and is convex. The $\ell_1$ penalty produces sparse solutions by virtue of the geometry of its diamond-shaped sub-level sets. ISTA, FISTA, and AMP solve it efficiently (a minimal ISTA loop follows the list).

5. James–Stein dominates the MLE for $N \geq 3$. Shrinkage toward any fixed anchor reduces risk uniformly, a purely frequentist guarantee that requires no prior. The empirical-Bayes interpretation makes the shrinkage rule concrete: learn the prior variance from the data and shrink accordingly (fifth sketch below).

6. Minimax rates characterise sample complexity. For $s$-sparse signals the minimax rate is $\Theta(\sigma^2 s \log(N/s)/M)$, the information-theoretic floor for sparse estimation. The $\log(N/s)$ factor is the price of not knowing the support (a numerical comparison follows the list).

7. All of this is convex. Ridge, LASSO, elastic net, and the Bayes estimators under log-concave priors are convex programmes. The convexity reflex (flag convex problems immediately) applies throughout; the last sketch below writes ridge and LASSO as disciplined convex programmes.
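
The sketches that follow are minimal NumPy illustrations of the points above; dimensions, SNRs, grids, and seeds are illustrative choices, not values from the chapter. First, a check of the Marchenko–Pastur law: the empirical eigenvalue histogram of $\tfrac{1}{M}\mathbf{A}^T\mathbf{A}$ against the MP density for $\gamma = 1/2$.

```python
# Minimal check of the Marchenko-Pastur law for eigenvalues of (1/M) A^T A.
# Dimensions, seed, and the bin grid are illustrative, not chapter values.
import numpy as np

rng = np.random.default_rng(0)
M, N = 2000, 1000                       # gamma = N/M = 0.5
gamma = N / M

A = rng.standard_normal((M, N))
evals = np.linalg.eigvalsh(A.T @ A / M)

lam_minus = (1 - np.sqrt(gamma)) ** 2   # left edge of the MP bulk
lam_plus = (1 + np.sqrt(gamma)) ** 2    # right edge of the MP bulk

def mp_density(x):
    """Marchenko-Pastur density for aspect ratio gamma (unit variance)."""
    inside = (x > lam_minus) & (x < lam_plus)
    out = np.zeros_like(x)
    out[inside] = np.sqrt(
        (lam_plus - x[inside]) * (x[inside] - lam_minus)
    ) / (2 * np.pi * gamma * x[inside])
    return out

# Compare the empirical histogram to the MP density on a few bins.
bins = np.linspace(lam_minus, lam_plus, 11)
hist, _ = np.histogram(evals, bins=bins, density=True)
centers = 0.5 * (bins[:-1] + bins[1:])
for c, h in zip(centers, mp_density(centers)):
    print(f"x={c:5.2f}  MP={h:5.3f}  empirical={hist[np.searchsorted(bins, c) - 1]:5.3f}")
```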
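
Second, a Monte Carlo check of the OLS risk formula $\gamma\sigma^2/(1-\gamma)$ from Key Point 2. The sketch assumes $\mathbf{A}$ has i.i.d. $\mathcal{N}(0, 1/N)$ entries (unit expected row energy), a normalization under which the stated formula holds; if the chapter uses a different scaling, the constant changes accordingly.

```python
# Monte Carlo check of the per-coordinate OLS risk gamma*sigma^2/(1-gamma).
# Assumes A has i.i.d. N(0, 1/N) entries; this normalization is an assumption.
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 1.0
N = 200

for gamma in (0.25, 0.5, 0.8):
    M = int(N / gamma)
    risks = []
    for _ in range(50):
        A = rng.standard_normal((M, N)) / np.sqrt(N)
        x = rng.standard_normal(N)
        y = A @ x + np.sqrt(sigma2) * rng.standard_normal(M)
        x_ols, *_ = np.linalg.lstsq(A, y, rcond=None)
        risks.append(np.sum((x_ols - x) ** 2) / N)
    print(f"gamma={gamma:4.2f}  empirical={np.mean(risks):6.3f}  "
          f"theory={gamma * sigma2 / (1 - gamma):6.3f}")
```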
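
Third, Key Point 3: ridge with $\lambda = \sigma^2/\sigma_x^2 = 1/\mathrm{SNR}$ coincides with the LMMSE estimator for any fixed $\mathbf{A}$, so the empirical risk curve should bottom out near $\lambda^*$. This sketch runs in the $\gamma = 2$ regime, where OLS does not even exist; the grid of $\lambda$ values is an arbitrary choice.

```python
# Ridge risk versus lambda for x ~ N(0, sigma_x^2 I): the minimum should
# sit near lambda* = sigma^2 / sigma_x^2 = 1/SNR. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(2)
M, N = 150, 300                          # gamma = 2: OLS undefined, ridge fine
sigma_x2, sigma2 = 1.0, 0.25             # SNR = 4, so lambda* = 0.25
A = rng.standard_normal((M, N)) / np.sqrt(N)

def ridge_risk(lam, trials=200):
    X = np.sqrt(sigma_x2) * rng.standard_normal((N, trials))
    Y = A @ X + np.sqrt(sigma2) * rng.standard_normal((M, trials))
    X_hat = np.linalg.solve(A.T @ A + lam * np.eye(N), A.T @ Y)
    return np.mean(np.sum((X_hat - X) ** 2, axis=0)) / N

for lam in (0.01, 0.05, 0.25, 1.0, 5.0):
    print(f"lambda={lam:5.2f}  per-coordinate risk={ridge_risk(lam):6.4f}")
```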
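
Fourth, Key Point 4 as code: a bare-bones ISTA loop for the LASSO objective $\tfrac12\lVert\mathbf{y}-\mathbf{A}\mathbf{x}\rVert^2 + \lambda\lVert\mathbf{x}\rVert_1$. The step size $1/L$ uses the gradient's Lipschitz constant $L = \lVert\mathbf{A}\rVert_2^2$; the value of $\lambda$ and the iteration budget here are rough, untuned guesses.

```python
# A minimal ISTA loop for the LASSO (1/2)||y - A x||^2 + lam * ||x||_1.
# Problem sizes, lam, and the iteration count are illustrative assumptions.
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (coordinate-wise soft threshold)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x + A.T @ (y - A @ x) / L, lam / L)
    return x

rng = np.random.default_rng(3)
M, N, s = 100, 400, 10
A = rng.standard_normal((M, N)) / np.sqrt(M)   # roughly unit-norm columns
x_true = np.zeros(N)
x_true[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
y = A @ x_true + 0.05 * rng.standard_normal(M)

x_hat = ista(A, y, lam=0.1)
print("nonzeros:", np.count_nonzero(x_hat),
      " error:", np.linalg.norm(x_hat - x_true))
```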
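
Fifth, Key Point 5: the James–Stein estimator against the MLE for direct observations $\mathbf{y} \sim \mathcal{N}(\mathbf{x}, \sigma^2\mathbf{I})$. The positive-part variant (clipping the shrinkage factor at zero) is used here because it dominates plain James–Stein; everything else follows the classical setup.

```python
# Monte Carlo comparison of (positive-part) James-Stein against the MLE
# for y ~ N(x, sigma^2 I) in R^N. N and the signal scale are illustrative.
import numpy as np

rng = np.random.default_rng(4)
N, sigma2, trials = 10, 1.0, 20000
x = rng.standard_normal(N)               # any fixed mean vector works

mle_risk = js_risk = 0.0
for _ in range(trials):
    y = x + np.sqrt(sigma2) * rng.standard_normal(N)
    shrink = max(0.0, 1.0 - (N - 2) * sigma2 / np.sum(y ** 2))
    mle_risk += np.sum((y - x) ** 2)
    js_risk += np.sum((shrink * y - x) ** 2)

print(f"MLE risk: {mle_risk / trials:.3f} (theory N*sigma^2 = {N * sigma2:.1f})")
print(f"JS  risk: {js_risk / trials:.3f} (uniformly smaller for N >= 3)")
```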
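
Sixth, a numerical look at the $\log(N/s)$ price from Key Point 6. The baseline is the oracle rate $\sigma^2 s/M$ one would obtain with the support known in advance; treating that as the comparison point is an assumption of this sketch, not a statement from the chapter.

```python
# The log(N/s) support-uncertainty price, numerically: minimax rate
# sigma^2 * s * log(N/s) / M versus the (assumed) oracle rate sigma^2 * s / M.
# All parameter values are illustrative.
import numpy as np

sigma2 = 1.0
for N, s, M in [(1024, 10, 128), (4096, 10, 128), (4096, 64, 512)]:
    oracle = sigma2 * s / M
    minimax = sigma2 * s * np.log(N / s) / M
    print(f"N={N:5d} s={s:3d} M={M:4d}  oracle={oracle:.3f}  "
          f"minimax={minimax:.3f}  price log(N/s)={np.log(N / s):.2f}x")
```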
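
Finally, Key Point 7 in practice: ridge and LASSO written as disciplined convex programmes. CVXPY is an assumed dependency used purely for illustration; the chapter's algorithms (ISTA/FISTA/AMP) do not require a general-purpose solver.

```python
# Ridge and LASSO stated as convex programmes in CVXPY, making the
# "convexity reflex" concrete. Sizes and lam are illustrative choices.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(5)
M, N = 50, 100
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ rng.standard_normal(N) + 0.1 * rng.standard_normal(M)

x = cp.Variable(N)
lam = 0.1

ridge = cp.Problem(cp.Minimize(cp.sum_squares(y - A @ x) + lam * cp.sum_squares(x)))
lasso = cp.Problem(cp.Minimize(cp.sum_squares(y - A @ x) + lam * cp.norm1(x)))

for name, prob in (("ridge", ridge), ("lasso", lasso)):
    prob.solve()
    print(name, "is DCP-convex:", prob.is_dcp(), " optimal value:", round(prob.value, 3))
```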

Looking Ahead

Chapter 23 replaces the Gaussian noise assumption with robust and non-parametric alternatives (Huber, RKHS, Gaussian processes, deep learning). Chapter 24 completes the estimation-theoretic picture with the Van Trees and Ziv–Zakai bounds, and the MMSE–mutual-information identity that connects estimation to information theory. The high-dimensional machinery built here underpins every realistic estimation problem in modern wireless systems, from massive-MIMO channel estimation to compressed-sensing radar to grant-free mMTC.