Chapter Summary

Key Points

  1. AMP = ISTA + Onsager correction. The iteration $\hat{\mathbf{x}}^{t+1} = \eta(\mathbf{A}^{H}\mathbf{r}^t + \hat{\mathbf{x}}^t; \theta_t)$, $\mathbf{r}^{t+1} = \mathbf{y} - \mathbf{A}\hat{\mathbf{x}}^{t+1} + \delta^{-1}\langle\eta'\rangle\mathbf{r}^t$ differs from the proximal-gradient update by a single additive term, the Onsager correction, which removes the bias introduced by feedback through $\mathbf{A}^{H}\mathbf{A}$. It is the defining feature of AMP (see the sketch after this list).

  2. Gaussianity of pseudo-data. In the large-$N$ limit with i.i.d. Gaussian $\mathbf{A}$, the denoiser input $\mathbf{A}^{H}\mathbf{r}^t + \hat{\mathbf{x}}^t$ is distributed as the true signal plus independent AWGN of variance $\tau_t^2$. This justifies calling $\eta$ a denoiser and underpins all downstream analysis.

  3. State evolution. AMP's per-iteration MSE is governed by the scalar recursion $\tau_{t+1}^2 = \sigma^2 + \delta^{-1}\,\mathbb{E}\big[(\eta(X + \tau_t Z) - X)^2\big]$. Fixed points of this recursion determine the terminal MSE; their stability determines convergence. This one-dimensional description is the most striking analytical feature of AMP (a Monte Carlo sketch of the recursion follows this list).

  4. Phase transitions and Bayes-optimality. Plotting SE fixed points over the $(\delta,\rho)$ plane reveals the Donoho--Tanner curve separating recovery from failure. With the MMSE denoiser matched to the prior, the AMP fixed point coincides with the replica-symmetric prediction of the Bayes MMSE; that is, AMP is asymptotically Bayes-optimal.

  5. Denoiser design is the central knob. Soft-thresholding yields LASSO (with a calibration formula); posterior means yield Bayes-optimal estimation; learned neural denoisers yield D-AMP and deep unfolding. All fit in the same AMP scaffold provided the denoiser is Lipschitz and its divergence can be computed.

  6. AMP fails for structured matrices. The Onsager coefficient is calibrated to the Marchenko--Pastur spectrum. Sub-sampled DFT, Hadamard, ill-conditioned, or Kronecker-structured matrices break the calibration and cause divergence. Damping stabilises AMP but slows convergence and breaks state evolution; the principled fix is OAMP/VAMP (Chapter 21).

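And a Monte Carlo sketch of the state-evolution recursion in Key Point 3, assuming a Bernoulli-Gaussian prior and the same illustrative threshold rule; the fixed point of the returned trajectory gives the terminal MSE.

```python
import numpy as np

def state_evolution(eta, sigma2, delta, rho=0.1, tau2_init=1.0,
                    n_iter=50, n_mc=200_000, seed=0):
    """Monte Carlo iteration of
        tau_{t+1}^2 = sigma^2 + delta^{-1} E[(eta(X + tau_t Z) - X)^2],
    with X drawn from an assumed Bernoulli-Gaussian prior of sparsity rho
    and Z standard normal. Returns the trajectory of tau_t^2.
    """
    rng = np.random.default_rng(seed)
    tau2 = tau2_init
    traj = [tau2]
    for _ in range(n_iter):
        X = rng.standard_normal(n_mc) * (rng.random(n_mc) < rho)
        Z = rng.standard_normal(n_mc)
        mse = np.mean((eta(X + np.sqrt(tau2) * Z, np.sqrt(tau2)) - X) ** 2)
        tau2 = sigma2 + mse / delta  # the SE recursion of Key Point 3
        traj.append(tau2)
    return traj

# Same illustrative rule theta_t = 1.5 * tau_t as in the AMP sketch above:
eta = lambda v, tau: np.sign(v) * np.maximum(np.abs(v) - 1.5 * tau, 0.0)
traj = state_evolution(eta, sigma2=1e-4, delta=0.5)
print(traj[-1])  # fixed-point tau^2; terminal MSE = delta * (tau^2 - sigma^2)
```
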
Looking Ahead

Chapter 21 generalises the AMP framework to right-rotationally-invariant sensing matrices via OAMP (orthogonal AMP) and VAMP (vector AMP). The key idea is to replace the simple transpose $\mathbf{A}^{H}$ with an LMMSE estimator and to enforce orthogonality between the linear and prior-denoising steps. This restores the Gaussianity-of-pseudo-data property for a much larger class of matrices, including the structured sensing operators typical of communications and imaging, at the cost of an $O(N^3)$ inversion per iteration (amenable to Kronecker factorisation when the structure is known); a minimal sketch of the LMMSE stage appears below. Chapter 21 also introduces GAMP for generalised linear models (non-Gaussian likelihoods, e.g., 1-bit compressed sensing) and LAMP/LISTA for learned message passing via deep unfolding.
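
As a hedged preview, the sketch below implements one common form of that LMMSE stage together with its extrinsic (orthogonalising) update; the symbols r1, gamma1 (pseudo-prior mean and precision) and gamma_w (noise precision) are assumed notation, not necessarily Chapter 21's, and the full OAMP/VAMP loop is deferred to that chapter.

```python
import numpy as np

def lmmse_stage(y, A, r1, gamma1, gamma_w):
    """Sketch of the LMMSE linear stage of OAMP/VAMP (assumed notation:
    pseudo-prior x ~ N(r1, 1/gamma1 I), noise precision gamma_w).

    The Gaussian posterior mean
        x_hat = (gamma_w A^H A + gamma1 I)^{-1} (gamma_w A^H y + gamma1 r1)
    replaces the simple transpose step; the dense solve is the O(N^3)
    cost mentioned above.
    """
    N = A.shape[1]
    H = gamma_w * (A.conj().T @ A) + gamma1 * np.eye(N)
    x_hat = np.linalg.solve(H, gamma_w * (A.conj().T @ y) + gamma1 * r1)
    alpha = (gamma1 / N) * np.trace(np.linalg.inv(H))  # divergence of the stage
    # Orthogonalisation: remove the component along the input so the error of
    # the extrinsic message r2 is decorrelated from the pseudo-prior.
    r2 = (x_hat - alpha * r1) / (1.0 - alpha)
    gamma2 = gamma1 * (1.0 - alpha) / alpha
    return r2, gamma2
```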