Exercises
ex-ch20-01
Easy. Write down the AMP iteration for soft-thresholding with threshold $\theta_t = \lambda\tau_t$ and undersampling ratio $\delta = M/N$. Identify the Onsager term explicitly.
Start from the matched-filter update $\hat{\mathbf{x}}^{t+1} = \eta(\hat{\mathbf{x}}^t + \mathbf{A}^\top\mathbf{r}^t;\,\theta_t)$ and add the corrected residual.
For soft-thresholding, $\eta'(u;\theta) = \mathbb{1}\{|u| > \theta\}$, so $\langle\eta'\rangle$ is the fraction of non-zero components of $\hat{\mathbf{x}}^{t+1}$.
Full iteration
$$\hat{\mathbf{x}}^{t+1} = \eta\big(\hat{\mathbf{x}}^t + \mathbf{A}^\top\mathbf{r}^t;\,\theta_t\big), \qquad \mathbf{r}^{t+1} = \mathbf{y} - \mathbf{A}\hat{\mathbf{x}}^{t+1} + \delta^{-1}\,\frac{\|\hat{\mathbf{x}}^{t+1}\|_0}{N}\,\mathbf{r}^{t},$$ where the term $\delta^{-1}\|\hat{\mathbf{x}}^{t+1}\|_0/N\cdot\mathbf{r}^t$ is the Onsager correction.
ex-ch20-02
Easy. Show that the derivative of the soft-threshold denoiser is the indicator of being above threshold: $\eta'(u;\theta) = \mathbb{1}\{|u| > \theta\}$ almost everywhere. Where is $\eta(\cdot\,;\theta)$ non-differentiable, and what does the SE analysis require about this?
Differentiate each piece of the piecewise-linear definition.
Non-differentiable points occur where $|u| = \theta$ exactly.
Piecewise derivative
$\eta(u;\theta) = u - \theta$ for $u > \theta$ (derivative $1$); $\eta(u;\theta) = 0$ for $|u| \le \theta$ (derivative $0$); $\eta(u;\theta) = u + \theta$ for $u < -\theta$ (derivative $1$). Thus $\eta'(u;\theta) = \mathbb{1}\{|u| > \theta\}$ for $|u| \neq \theta$.
SE requirement
SE requires the denoiser to be Lipschitz, not differentiable everywhere. Soft-thresholding is 1-Lipschitz, so SE applies. The measure-zero set $\{|u| = \theta\}$ does not affect the expectation because $X + \tau Z$ has a continuous density.
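A minimal numerical check of the two facts used above (the 1-Lipschitz property and the almost-everywhere derivative), sketched with illustrative names:

import numpy as np

def soft_threshold(u, theta):
    return np.sign(u) * np.maximum(np.abs(u) - theta, 0.0)

def soft_threshold_deriv(u, theta):
    return (np.abs(u) > theta).astype(float)   # valid almost everywhere

rng = np.random.default_rng(0)
u, v, theta = rng.normal(size=10_000), rng.normal(size=10_000), 0.7
lip = np.abs(soft_threshold(u, theta) - soft_threshold(v, theta)) / np.abs(u - v)
print(lip.max())                                # <= 1 up to rounding: 1-Lipschitz
eps = 1e-6                                      # finite-difference check away from |u| = theta
fd = (soft_threshold(u + eps, theta) - soft_threshold(u - eps, theta)) / (2 * eps)
mask = np.abs(np.abs(u) - theta) > 1e-3
print(np.allclose(fd[mask], soft_threshold_deriv(u, theta)[mask]))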
ex-ch20-03
Easy. For a Bernoulli--Gaussian prior $p(x) = (1-\rho)\,\delta_0(x) + \rho\,\mathcal{N}(x;0,1)$, write the state-evolution recursion for AMP with soft-thresholding.
Start from the general SE recursion $\tau_{t+1}^2 = \sigma^2 + \delta^{-1}\,\mathbb{E}\big[(\eta(X + \tau_t Z;\lambda\tau_t) - X)^2\big]$ and specialise the expectation to the Bernoulli--Gaussian prior.
Split the expectation over $X = 0$ (probability $1-\rho$) and $X \sim \mathcal{N}(0,1)$ (probability $\rho$).
SE recursion
$$\tau_{t+1}^2 = \sigma^2 + \frac{1}{\delta}\Big[(1-\rho)\,\mathbb{E}\big[\eta(\tau_t Z;\lambda\tau_t)^2\big] + \rho\,\mathbb{E}\big[(\eta(X + \tau_t Z;\lambda\tau_t) - X)^2\big]\Big],$$ where in the second expectation $X \sim \mathcal{N}(0,1)$ and $Z \sim \mathcal{N}(0,1)$ are independent.
Closed-form MSE
Each expectation reduces to Gaussian integrals involving the standard normal CDF $\Phi$ and PDF $\phi$. The non-zero term involves the marginal $X + \tau_t Z \sim \mathcal{N}(0, 1+\tau_t^2)$ and the threshold $\theta = \lambda\tau_t$, and can be fully expanded using standard Gaussian identities (see Donoho--Maleki--Montanari 2011).
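A short Monte-Carlo sketch of this recursion; the sample size and the values of $\lambda$, $\rho$, $\delta$, $\sigma^2$ are illustrative choices, not fixed by the exercise:

import numpy as np
rng = np.random.default_rng(1)

def se_step(tau2, lam, rho, delta, sigma2, n=200_000):
    x = rng.normal(size=n) * (rng.uniform(size=n) < rho)   # X ~ (1-rho) delta_0 + rho N(0,1)
    u = x + np.sqrt(tau2) * rng.normal(size=n)             # pseudo-data X + tau Z
    theta = lam * np.sqrt(tau2)                            # threshold lam * tau
    eta = np.sign(u) * np.maximum(np.abs(u) - theta, 0.0)
    return sigma2 + np.mean((eta - x) ** 2) / delta

tau2 = 10.0
for t in range(30):
    tau2 = se_step(tau2, lam=1.5, rho=0.1, delta=0.5, sigma2=1e-4)
print(tau2)                                                # approximate SE fixed point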
ex-ch20-04
Easy. Verify that in the noiseless case $\sigma = 0$, the SE recursion for L1-AMP takes the form $\tau_{t+1}^2 = f(\lambda,\rho,\delta)\,\tau_t^2$ when the threshold is chosen as $\theta_t = \lambda\tau_t$, for some function $f$ independent of $\tau_t$. Under what condition does $\tau_t^2 \to 0$?
Change variables: rescale the pseudo-data by $\tau_t$ so that the threshold $\lambda\tau_t$ becomes $\lambda$.
The scaling $\theta_t = \lambda\tau_t$ is crucial for making the recursion homogeneous.
Scaling
With $\theta_t = \lambda\tau_t$ the MSE of soft-thresholding scales as $\tau_t^2$ because both the input noise and the threshold scale identically. Precisely, $\mathbb{E}\big[(\eta(X + \tau_t Z;\lambda\tau_t) - X)^2\big] = \tau_t^2\,\mathbb{E}\big[(\eta(X/\tau_t + Z;\lambda) - X/\tau_t)^2\big]$.
Convergence condition
The recursion is geometric with ratio $f(\lambda,\rho,\delta)$. It converges to zero iff $f(\lambda,\rho,\delta) < 1$. Optimising over $\lambda$ gives the region below the Donoho--Tanner curve, $\rho < \rho_{\mathrm{DT}}(\delta)$, as the recovery region.
ex-ch20-05
Medium. Derive the MMSE denoiser $\eta_{\mathrm{MMSE}}(u;\tau) = \mathbb{E}[X \mid X + \tau Z = u]$ for a Bernoulli--Gaussian prior observed in Gaussian noise of variance $\tau^2$. Show that $\eta_{\mathrm{MMSE}}$ is smooth (in contrast to the soft-threshold) and compute its derivative.
Apply Bayes' rule to obtain the posterior weight $\pi(u) = \mathbb{P}(X \neq 0 \mid U = u)$.
Conditional on $X \neq 0$, $U = X + \tau Z \sim \mathcal{N}(0, 1 + \tau^2)$.
Posterior weight
$$\pi(u) = \frac{\rho\,\mathcal{N}(u;0,1+\tau^2)}{\rho\,\mathcal{N}(u;0,1+\tau^2) + (1-\rho)\,\mathcal{N}(u;0,\tau^2)}.$$
Posterior mean
$$\eta_{\mathrm{MMSE}}(u;\tau) = \pi(u)\,\frac{u}{1+\tau^2}.$$
Derivative
By the Stein identity, $\eta'_{\mathrm{MMSE}}(u;\tau) = \mathrm{Var}(X \mid U = u)/\tau^2$. The posterior variance is $\mathrm{Var}(X \mid U = u) = \pi(u)\Big(\frac{\tau^2}{1+\tau^2} + \frac{u^2}{(1+\tau^2)^2}\Big) - \eta_{\mathrm{MMSE}}(u;\tau)^2$, which is positive, bounded, and continuous in $u$, giving a smooth, bounded derivative.
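A sketch of this denoiser in NumPy, returning both the posterior mean and the Stein-identity derivative; computing the posterior weight in the log domain is an implementation choice for numerical stability:

import numpy as np

def bg_mmse(u, tau, rho):
    s2 = 1.0 + tau**2                             # Var(U | X != 0)
    log_num = np.log(rho) - 0.5 * u**2 / s2 - 0.5 * np.log(s2)
    log_den = np.log(1.0 - rho) - 0.5 * u**2 / tau**2 - 0.5 * np.log(tau**2)
    pi = 1.0 / (1.0 + np.exp(log_den - log_num))  # posterior weight P(X != 0 | u)
    m, v = u / s2, tau**2 / s2                    # mean / variance of X given u, X != 0
    mean = pi * m                                 # posterior mean = denoiser output
    var = pi * (v + m**2) - mean**2               # posterior variance
    return mean, var / tau**2                     # derivative via the Stein identity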
ex-ch20-06
Medium. Prove the Stein identity: for a smooth bounded $f$ and $U = X + \tau Z$ with $Z \sim \mathcal{N}(0,1)$ independent of $X$, $\mathbb{E}[(U - X)\,f(U)] = \tau^2\,\mathbb{E}[f'(U)]$, and deduce that $\frac{\mathrm{d}}{\mathrm{d}u}\,\mathbb{E}[X \mid U = u] = \tau^{-2}\,\mathrm{Var}(X \mid U = u)$. (This is the identity that connects the divergence of the MMSE denoiser to its posterior variance.)
Use integration by parts, treating as the derivative of a Gaussian density.
For a standard Gaussian $Z$ and smooth bounded $f$, $\mathbb{E}[Z f(Z)] = \mathbb{E}[f'(Z)]$ (Stein's lemma).
Reduce to Stein's lemma
Write $U = X + \tau Z$, so conditional on $X$, $U \sim \mathcal{N}(X, \tau^2)$. Then $\mathbb{E}[(U - X)f(U) \mid X] = \tau^2\,\mathbb{E}[f'(U) \mid X]$ by Stein's lemma applied to the conditional Gaussian.
Unconditional form
Take the expectation over $X$: $\mathbb{E}[(U - X)f(U)] = \tau^2\,\mathbb{E}[f'(U)]$, i.e.\ $\mathbb{E}\big[(U - \mathbb{E}[X \mid U])\,f(U)\big] = \tau^2\,\mathbb{E}[f'(U)]$ by the tower property. Writing both sides as integrals against the marginal density of $U$ and integrating by parts once more, a direct computation shows the cross terms combine into $\mathbb{E}[X^2 \mid U = u] - \mathbb{E}[X \mid U = u]^2$, and rearrangement yields the claimed identity.
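A quick numerical sanity check of the identity, sketched for a simple two-point prior $X \in \{0, 1\}$ with $\mathbb{P}(X = 1) = p$ (all names illustrative); the finite-difference derivative of the posterior mean should match the posterior variance divided by $\tau^2$:

import numpy as np

def posterior_stats(u, tau, p):
    # P(X = 1 | U = u) for U = X + tau*Z and the two-point prior on {0, 1}
    log_r = np.log(p / (1 - p)) + (u**2 - (u - 1.0)**2) / (2 * tau**2)
    pi = 1.0 / (1.0 + np.exp(-log_r))
    return pi, pi * (1 - pi)          # E[X | u] and Var(X | u), since X is 0/1

tau, p = 0.5, 0.3
u = np.linspace(-2, 3, 2001)
mean, var = posterior_stats(u, tau, p)
print(np.max(np.abs(np.gradient(mean, u) - var / tau**2)))   # small (finite-difference error only)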
ex-ch20-07
Medium. Implement AMP with soft-thresholding in 20 lines of NumPy. For $N = 500$, $M = 250$, a sparse Bernoulli--Gaussian signal with $\rho = 0.1$, $A_{ij} \sim \mathcal{N}(0, 1/M)$, and $\mathbf{y} = \mathbf{A}\mathbf{x} + \sigma\mathbf{w}$ where $\sigma = 10^{-2}$, run 20 iterations and report the normalised MSE.
Use $A_{ij} \sim \mathcal{N}(0, 1/M)$ for the matrix entries.
Initialise $\hat{\mathbf{x}}^0 = \mathbf{0}$, $\mathbf{r}^0 = \mathbf{y}$.
Don't forget the Onsager term — use np.mean(np.abs(u) > lam) as the empirical divergence.
Core loop
import numpy as np
rng = np.random.default_rng(0)
N, M, rho, sigma = 500, 250, 0.1, 1e-2
A = rng.normal(0, 1/np.sqrt(M), size=(M, N))
x = rng.normal(size=N) * (rng.uniform(size=N) < rho)
y = A @ x + sigma * rng.normal(size=M)
xh = np.zeros(N); r = y.copy()
for t in range(20):
    tau = np.linalg.norm(r) / np.sqrt(M)      # effective noise level estimate
    lam = 1.5 * tau                           # threshold theta_t = 1.5 * tau_t
    u = A.T @ r + xh                          # pseudo-data
    xh = np.sign(u) * np.maximum(np.abs(u) - lam, 0)
    b = np.mean(np.abs(u) > lam) * (N/M)      # Onsager coefficient (1/delta) * <eta'>
    r = y - A @ xh + b * r                    # residual with Onsager correction
mse = np.linalg.norm(xh - x)**2 / np.linalg.norm(x)**2
Expected output
A typical run converges within 10--15 iterations. If the Onsager term is removed (set $b = 0$ in the loop above), the MSE plateaus at a much larger value, the ISTA residual.
ex-ch20-08
Medium. Explain the calibration equation linking L1-AMP's threshold $\theta = \lambda\tau$ to the effective LASSO regulariser $\lambda_{\mathrm{LASSO}}$. What does it mean for the effective penalty to be smaller than the AMP threshold?
Start from the AMP fixed-point equations and compare to LASSO's KKT conditions.
Note that the Onsager coefficient $\tfrac{1}{\delta}\langle\eta'\rangle = \|\hat{\mathbf{x}}\|_0/M$ is a fraction in $[0,1)$ when $\|\hat{\mathbf{x}}\|_0 < M$.
Fixed-point equation
At an AMP fixed point: $\mathbf{r} = \mathbf{y} - \mathbf{A}\hat{\mathbf{x}} + \tfrac{1}{\delta}\langle\eta'\rangle\,\mathbf{r}$, so $\mathbf{r} = (\mathbf{y} - \mathbf{A}\hat{\mathbf{x}})/\big(1 - \tfrac{1}{\delta}\langle\eta'\rangle\big)$. Denote the scaling factor by $c = 1 - \tfrac{1}{\delta}\langle\eta'\rangle$.
Matching LASSO KKT
LASSO KKT: $\mathbf{A}^\top(\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}) = \lambda_{\mathrm{LASSO}}\,\mathbf{s}$ with $\mathbf{s} \in \partial\|\hat{\mathbf{x}}\|_1$. Substituting AMP's fixed point gives $\mathbf{A}^\top(\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}) = c\,\mathbf{A}^\top\mathbf{r}$, while AMP's denoising step says $\mathbf{A}^\top\mathbf{r} = \theta\,\mathbf{s}$. Comparing: $\lambda_{\mathrm{LASSO}} = \theta\,c = \theta\big(1 - \tfrac{1}{\delta}\langle\eta'\rangle\big)$.
Interpretation
The effective LASSO penalty is a fraction of the AMP threshold because the Onsager term absorbs part of the residual. Practically: if you want LASSO at penalty $\lambda_{\mathrm{LASSO}}$, set AMP's threshold to $\theta = \lambda_{\mathrm{LASSO}}/\big(1 - \tfrac{1}{\delta}\langle\eta'\rangle\big)$, where the denominator is recomputed from the current Onsager coefficient.
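A one-function sketch of this calibration using the empirical Onsager coefficient; the names are illustrative, with `u` standing for the pseudo-data vector at a fixed point of the Exercise 7 loop:

import numpy as np

def effective_lasso_penalty(theta, u, delta):
    # effective LASSO regulariser = AMP threshold shrunk by (1 - Onsager coefficient)
    onsager = np.mean(np.abs(u) > theta) / delta    # (1/delta) * <eta'> for soft-thresholding
    return theta * (1.0 - onsager)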
ex-ch20-09
Medium. Let $\Psi(\tau^2) = \sigma^2 + \tfrac{1}{\delta}\,\mathbb{E}\big[(\eta(X + \tau Z;\lambda\tau) - X)^2\big]$. Show that $\Psi$ is monotone increasing and continuous in $\tau^2$. Conclude that the AMP state-evolution recursion $\tau_{t+1}^2 = \Psi(\tau_t^2)$ is monotone in $t$.
MSE of any fixed denoiser is monotone in the input noise variance.
Monotone bounded sequences converge.
Monotonicity of scalar MSE
For a Lipschitz denoiser, the input distribution becomes more spread out as $\tau$ increases, so the best achievable MSE cannot decrease. A formal proof couples the two observation channels $X + \tau_1 Z$ and $X + \tau_2 Z$ with $\tau_1 < \tau_2$, realising the noisier one as a further degradation of the less noisy one, and uses the data-processing inequality.
Monotone iteration
Since $\Psi$ is monotone non-decreasing, $\tau_{t+1}^2 = \Psi(\tau_t^2)$ is a monotone dynamical system: either $\tau_1^2 \ge \tau_0^2$ (and the sequence is non-decreasing) or $\tau_1^2 \le \tau_0^2$ (and the sequence is non-increasing). In either case, by monotone convergence, $\tau_t^2$ converges to a fixed point of $\Psi$ whenever the sequence is bounded.
ex-ch20-10
Medium. Construct a sensing matrix that makes AMP diverge and provide a numerical demonstration. Use a sub-sampled DFT matrix with $N$ and $M$ as in Exercise 7.
Take $\mathbf{A} = \sqrt{N/M}\,\mathbf{S}\mathbf{F}$, where $\mathbf{F}$ is the DFT matrix and $\mathbf{S}$ is a random row subsampling.
Run AMP for 30 iterations and plot $\|\mathbf{r}^t\|_2$ against $t$.
Setup
F = np.fft.fft(np.eye(N)) / np.sqrt(N)
idx = rng.choice(N, M, replace=False)
A = np.sqrt(N/M) * F[idx] # random rows of DFT
Apply the same AMP loop as in Exercise 7.
Observation
The residual norm typically grows super-linearly after a few iterations and overflows. Damping the updates with a factor less than one stabilises the iteration, but the MSE trajectory no longer matches the SE prediction (a sketch follows).
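A sketch of the demonstration with an optional damping factor `gamma` (an illustrative choice); it reuses `A` from the setup above and `x`, `sigma`, `rng`, `N`, `M` from the Exercise 7 script, and uses a complex-valued soft threshold since the sub-sampled DFT is complex:

y = A @ x + sigma * rng.normal(size=M)              # complex measurements
xh = np.zeros(N, dtype=complex); r = y.copy()
gamma = 1.0                                         # 1.0 = undamped; try 0.5 to stabilise
for t in range(30):
    tau = np.linalg.norm(r) / np.sqrt(M)
    u = A.conj().T @ r + xh
    shrink = np.maximum(np.abs(u) - 1.5 * tau, 0)   # shrink the modulus, keep the phase
    xh_new = shrink * np.exp(1j * np.angle(u))
    b = np.mean(np.abs(u) > 1.5 * tau) * (N / M)    # Onsager coefficient
    r_new = y - A @ xh_new + b * r
    xh = (1 - gamma) * xh + gamma * xh_new
    r = (1 - gamma) * r + gamma * r_new
    print(t, np.linalg.norm(r))                     # watch for blow-up when gamma = 1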
ex-ch20-11
Medium. Compute (numerically) the Donoho--Tanner phase-transition curve by scanning the $(\delta, \rho)$ plane and checking whether the L1-AMP SE recursion with optimal $\lambda$ converges to zero.
For each $(\delta, \rho)$, sweep $\lambda$ and find the one minimising the noiseless SE slope $f(\lambda,\rho,\delta)$ from Exercise 4.
The boundary is where $\min_\lambda f(\lambda,\rho,\delta) = 1$.
SE slope computation
For each $(\delta, \rho, \lambda)$, compute the slope $f(\lambda,\rho,\delta)$ numerically by Monte Carlo or Gaussian quadrature. Minimise over $\lambda$. The curve $\rho_{\mathrm{DT}}(\delta)$ is the largest $\rho$ such that the minimum equals $1$.
Expected curve
The resulting curve is monotone increasing, starting at $\rho = 0$ as $\delta \to 0$ and approaching $\rho = 1$ as $\delta \to 1$, in agreement with the Donoho--Tanner combinatorial-geometry predictions.
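A sketch of the scan using the closed-form noiseless slope in the least-favourable (large non-zero entries) limit; the grids and the use of `scipy.stats.norm` are implementation choices:

import numpy as np
from scipy.stats import norm

def rho_dt(delta, lams=np.linspace(0.01, 5.0, 500), rhos=np.linspace(1e-3, 0.999, 400)):
    # noiseless slope: f(lam, rho, delta) = [(1-rho) E(|Z|-lam)_+^2 + rho (1+lam^2)] / delta
    mse_zero = 2 * ((1 + lams**2) * norm.cdf(-lams) - lams * norm.pdf(lams))
    mse_nonzero = 1 + lams**2
    ok = [rho for rho in rhos
          if np.min((1 - rho) * mse_zero + rho * mse_nonzero) < delta]   # min over lam of slope < 1
    return max(ok) if ok else 0.0

for delta in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(delta, round(rho_dt(delta), 3))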
ex-ch20-12
Hard. Derive the Taylor-expansion cancellation that gives the Onsager term. Concretely, show that without the Onsager term the conditional distribution of the residual $\mathbf{r}^t$ given the history acquires a bias proportional to $\mathbf{r}^{t-1}$, and compute the proportionality constant.
Use the cavity decomposition that isolates a single column $\mathbf{a}_i$ of $\mathbf{A}$.
Expand $\eta$ around the leave-one-out (cavity) value of its argument.
Cavity decomposition
Isolate column $i$: $\mathbf{y} = \mathbf{a}_i x_i + \sum_{j \neq i} \mathbf{a}_j x_j$. The pseudo-data at index $i$ is $u_i = \hat{x}_i^t + \mathbf{a}_i^\top\mathbf{r}^t$. Linearising $\eta$ around the cavity value introduces a self-feedback term that is individually small but $O(1)$ once summed over the $M$ measurements.
Correction factor
Tracking the Taylor expansion to first order, one finds that the bias in the denoiser input due to self-feedback is $\tfrac{1}{\delta}\langle\eta'\rangle$ times the previous residual contribution at each coordinate. Aggregating over coordinates: the residual at step $t$ should have $\tfrac{1}{\delta}\langle\eta'\rangle\,\mathbf{r}^{t-1}$ added to cancel this bias, precisely the Onsager term.
ex-ch20-13
Hard. Show that damped AMP with damping parameter $\theta \in (0,1]$ can be viewed as a single AMP step with an augmented denoiser. Use this to explain qualitatively why damping breaks state evolution (except in the limit $\theta \to 0$).
Define an effective denoiser that depends on the previous estimate.
Recall that SE requires the denoiser to act coordinate-wise on pseudo-data that behaves, at each iteration, like the signal plus i.i.d. Gaussian noise.
Rewriting as augmented denoiser
The damped update $\hat{\mathbf{x}}^{t+1} = (1-\theta)\,\hat{\mathbf{x}}^t + \theta\,\eta(\hat{\mathbf{x}}^t + \mathbf{A}^\top\mathbf{r}^t;\lambda\tau_t)$ is an affine combination of the old iterate and the raw denoiser output. If we include $\hat{\mathbf{x}}^t$ as a side input to the denoiser, the iteration looks like a single AMP step with a history-dependent denoiser $\tilde\eta_t$.
Breakdown of SE
SE demands that the denoiser output at iteration $t$ depend only on the current pseudo-data $\hat{\mathbf{x}}^t + \mathbf{A}^\top\mathbf{r}^t$ (a Gaussian perturbation of the signal conditionally on the history). Damped AMP's effective denoiser explicitly depends on $\hat{\mathbf{x}}^t$ as well, introducing correlation across iterations that changes the covariance structure of the pseudo-data and breaks the scalar recursion.
Limit $\theta\to 0$
In the continuous-time limit (many damped steps per ``true'' step) the iterate traces a gradient flow on the AMP fixed-point equation. SE re-emerges in this limit but with a different time-scaling.
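A minimal sketch of the augmented (history-dependent) denoiser view, with illustrative names; the explicit `x_prev` argument is exactly what the scalar SE argument cannot accommodate:

import numpy as np

def soft_threshold(u, thresh):
    return np.sign(u) * np.maximum(np.abs(u) - thresh, 0.0)

def damped_denoiser(u, x_prev, thresh, theta):
    # damped AMP's effective denoiser: affine combination of the previous
    # iterate and the raw denoiser output (theta = 1 recovers undamped AMP)
    return (1.0 - theta) * x_prev + theta * soft_threshold(u, thresh)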
ex-ch20-14
Hard. For a non-negative sparse signal with prior $p(x) = (1-\rho)\,\delta_0(x) + \rho\,\lambda e^{-\lambda x}\,\mathbb{1}\{x \ge 0\}$, design a denoiser that is Bayes-optimal and verify via simulation that the AMP MSE converges to the state-evolution fixed point predicted by this denoiser.
Use the posterior: an exponential slab times a Gaussian likelihood gives a truncated Gaussian, so the posterior is a point-mass/truncated-Gaussian mixture.
The resulting denoiser is non-negative, monotone, and smooth on $\mathbb{R}$.
Posterior
Compute $p(x \mid u)$ for $u = x + \tau z$: the posterior is a mixture of a point mass at $0$ and a Gaussian truncated to $[0, \infty)$. The posterior mean is available in closed form via the error function.
Implementation
Evaluate the posterior mean in closed form; its derivative is (posterior variance)$/\tau^2$ by the Stein identity. Replace the soft-threshold in the AMP loop with this denoiser (sketched below) and compute the Onsager coefficient as the empirical mean of the derivative.
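A sketch of this denoiser, assuming the spike-plus-exponential prior written above (with rate $\lambda$); posterior weights are computed in the log domain and `scipy.stats.norm` supplies $\phi$ and $\Phi$:

import numpy as np
from scipy.stats import norm

def nonneg_mmse(u, tau, rho, lam):
    m = u - lam * tau**2                                 # location of the truncated Gaussian
    a = m / tau
    h = norm.pdf(a) / np.maximum(norm.cdf(a), 1e-300)    # inverse Mills ratio
    mean_tg = m + tau * h                                # E[X | u, X > 0]
    var_tg = tau**2 * (1.0 - a * h - h**2)               # Var(X | u, X > 0)
    log_ev1 = np.log(lam) + 0.5 * lam**2 * tau**2 - lam * u + norm.logcdf(a)
    log_ev0 = norm.logpdf(u / tau) - np.log(tau)
    pi = 1.0 / (1.0 + np.exp(np.log(1 - rho) + log_ev0 - np.log(rho) - log_ev1))
    mean = pi * mean_tg                                  # posterior mean = denoiser output
    var = pi * (var_tg + mean_tg**2) - mean**2           # posterior variance
    return mean, var / tau**2                            # derivative via the Stein identity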
Empirical check
The MSE trajectory should closely match the scalar state-evolution prediction $\tau_t^2$. This Bayes-optimal MSE is typically 2--3 dB below the L1-AMP MSE.
ex-ch20-15
Hard. Show that AMP with the identity denoiser reduces to pseudo-inverse iteration and is unstable for $\delta < 1$. Compute the state-evolution recursion for this degenerate case and identify when it has a fixed point.
$\eta(u) = u$, so $\eta' \equiv 1$ and the Onsager coefficient is $1/\delta$.
The recursion becomes $\tau_{t+1}^2 = \sigma^2 + \tau_t^2/\delta$ (the identity denoiser has MSE $\tau^2$; the fixed point is finite if $\delta > 1$, else infinite).
SE recursion
With $\eta = \mathrm{id}$: $\mathbb{E}[(\eta(X + \tau Z) - X)^2] = \tau^2$. So $\tau_{t+1}^2 = \sigma^2 + \tau_t^2/\delta$.
Stability
The fixed point is $\tau_\star^2 = \sigma^2\delta/(\delta - 1)$ when $\delta > 1$. For $\delta < 1$ the recursion diverges: the identity denoiser does not exploit any prior information, so AMP reduces to an underdetermined pseudo-inverse iteration which cannot average down the noise.
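A tiny numerical illustration of this recursion and its divergence for $\delta < 1$ (parameter values are illustrative):

sigma2 = 1e-4
for delta in (2.0, 0.5):                  # delta > 1 converges, delta < 1 diverges
    tau2 = 1.0
    for _ in range(50):
        tau2 = sigma2 + tau2 / delta      # identity-denoiser SE recursion
    print(delta, tau2)                    # ~ sigma2*delta/(delta-1) vs. blow-up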
ex-ch20-16
Challenge. Prove that, in the large-$N$ limit, running AMP on $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{w}$ with i.i.d.\ Gaussian entries of $\mathbf{A}$ achieves the replica-symmetric MMSE when the denoiser $\eta$ is matched to the prior. Contrast with LASSO's MSE computed from the same state-evolution machinery.
Use Bayati--Montanari SE convergence + the fact that the MMSE denoiser minimises the scalar MSE at every noise level.
Connection: replica-symmetric MMSE = the MSE at the Bayes-AMP state-evolution fixed point.
Optimality of MMSE at each step
For any $\tau$, $\mathbb{E}[(\eta(X + \tau Z) - X)^2]$ is minimised over all measurable $\eta$ by the posterior mean $\eta_{\mathrm{MMSE}}(u) = \mathbb{E}[X \mid X + \tau Z = u]$.
Fixed-point comparison
Denote by $\tau_{\mathrm{Bayes}}^2$ the fixed point achieved with $\eta_{\mathrm{MMSE}}$ and by $\tau_{\mathrm{L1}}^2$ the fixed point with optimally tuned soft-thresholding. Since at every $\tau$ the MMSE curve lies below the soft-thresholding curve, the fixed points satisfy $\tau_{\mathrm{Bayes}}^2 \le \tau_{\mathrm{L1}}^2$, with strict inequality whenever the prior is non-Gaussian.
Replica-symmetric interpretation
The replica-symmetric cavity calculation yields an equation identical to the Bayes-AMP state-evolution fixed point (Reeves--Pfister 2019). Thus the AMP MSE equals the conjectured (and, in the i.i.d.\ Gaussian case, rigorously proved) information-theoretic MMSE limit.
ex-ch20-17
Challenge. Design a D-AMP variant using a 3-layer neural denoiser trained on Bernoulli--Gaussian signals plus AWGN at various noise levels. Compare its MSE trajectory to L1-AMP and Bayes-AMP. Explain theoretically why learned denoisers can in principle match Bayes-AMP and why, in practice, they often fall short.
Train a small MLP to map (u, tau) to the posterior mean estimate.
Estimate the divergence via Monte Carlo: $\mathrm{div}\,\eta(\mathbf{u}) \approx \mathbf{n}^\top\big(\eta(\mathbf{u} + \epsilon\mathbf{n}) - \eta(\mathbf{u})\big)/\epsilon$ for $\mathbf{n} \sim \mathcal{N}(0, \mathbf{I})$ and small $\epsilon$.
Architecture
MLP with input $(u, \tau)$, two hidden layers of 64 units each, ReLU activations, scalar output. Train with MSE loss on synthetic pairs $(u, x)$ with $u = x + \tau z$.
D-AMP loop
Use the D-AMP pseudo-code: apply the network to the pseudo-data vector, estimate divergence by the Monte-Carlo rule, update the residual with the estimated divergence.
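A sketch of the Monte-Carlo divergence step; `denoiser` stands for the trained network (a hypothetical callable mapping `(u, tau)` to an estimate of the same shape as `u`), and the single-probe estimator and `eps` value are illustrative:

import numpy as np

def mc_divergence(denoiser, u, tau, eps=1e-3, rng=np.random.default_rng(0)):
    # single-probe Monte-Carlo estimate of the average derivative <eta'>
    n = rng.normal(size=u.shape)
    return np.dot(n, denoiser(u + eps * n, tau) - denoiser(u, tau)) / (eps * u.size)

# the residual update in the D-AMP loop would then read (with delta = M/N):
#   r = y - A @ xh + (1/delta) * mc_divergence(denoiser, u, tau) * r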
In-principle optimality
A universal approximator of the posterior mean would make D-AMP equivalent to Bayes-AMP. In practice, (i) the network is trained over a finite set of noise levels, introducing bias for intermediate $\tau$; (ii) the Monte-Carlo divergence estimate has variance; (iii) training imbalance favours large signal components. Each introduces a fixed-point gap.