Exercises

ex-ch19-01

Easy

(Gaussian GAMP reduces to AMP)

(a) Derive the GAMP output function $g_{\text{out}}(y, \hat{p}, \tau_p)$ for the Gaussian likelihood $p(y|z) = \mathcal{N}(y; z, \sigma^2)$. (A code sketch of the resulting channel follows part (c).)

(b) Show that with an i.i.d. Gaussian $\mathbf{A}$ ($M \times N$, entries $\sim \mathcal{N}(0, 1/M)$), the GAMP update simplifies to the AMP iteration from Chapter 17.

(c) Implement both algorithms in Python. Run them on a sparse recovery problem ($N = 500$, $M = 200$, $\rho = 0.1$, SNR = 25 dB). Verify that the iterates match to numerical precision (NMSE difference $< 10^{-10}$).
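For part (a), the posterior of $z$ under the Gaussian pseudo-prior $\mathcal{N}(z; \hat{p}, \tau_p)$ is itself Gaussian, which gives $g_{\text{out}} = (y - \hat{p})/(\sigma^2 + \tau_p)$ in closed form. A minimal sketch, assuming the standard GAMP convention that the output channel also returns $-\partial g_{\text{out}}/\partial \hat{p}$ for the variance update (function names are illustrative, not from a library):

```python
import numpy as np

def gout_gaussian(y, p_hat, tau_p, sigma2):
    """GAMP output channel for p(y|z) = N(y; z, sigma2).

    The posterior of z under the pseudo-prior N(z; p_hat, tau_p) is Gaussian,
    so g_out and its negative derivative have closed forms.
    """
    g = (y - p_hat) / (sigma2 + tau_p)   # g_out
    neg_dg = 1.0 / (sigma2 + tau_p)      # -d g_out / d p_hat
    return g, neg_dg
```

Note that as $\sigma^2 \to 0$ this reduces to $(y - \hat{p})/\tau_p$, the noiseless residual, which is the hinge of the part-(b) reduction to AMP.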

ex-ch19-02

Medium

(EM-GAMP for unknown noise variance)

(a) Generate a sparse recovery problem with unknown noise variance: $N = 500$, $M = 250$, $\rho = 0.08$, $\sigma_x^2 = 1$, $\sigma^2 = 0.01$. Initialize $\hat{\sigma^2}^{(0)} = 1.0$ (a 100Γ— overestimate).

(b) Implement the EM update for $\sigma^2$: $\hat{\sigma^2}^{(k+1)} = \frac{1}{M}\|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2$. (A sketch of the resulting outer loop follows part (e).)

(c) Run EM-GAMP for 50 outer iterations (one GAMP step per EM step). Plot $\hat{\sigma^2}^{(k)}$ and the reconstruction NMSE (dB) vs. $k$.

(d) Repeat with $\hat{\sigma^2}^{(0)} = 10^{-4}$ (a 100Γ— underestimate). Does EM converge from both sides?

(e) Compare the final NMSE of EM-GAMP with (i) oracle GAMP (true $\sigma^2$) and (ii) GAMP with $\hat{\sigma^2} = 0.1$ held fixed (a 10Γ— overestimate).
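A minimal sketch of the part-(b) M-step and the part-(c) alternation, assuming a hypothetical `gamp_step` routine (your GAMP implementation from ex-ch19-01) that performs one inner iteration and returns the current estimate $\hat{\mathbf{x}}$:

```python
import numpy as np

def em_sigma2_update(y, A, x_hat):
    """Simplified M-step of part (b): sigma2_hat = ||y - A x_hat||^2 / M."""
    r = y - A @ x_hat
    return (r @ r) / len(y)

# Outer loop of part (c), one GAMP step per EM step (gamp_step is hypothetical):
# sigma2_hat, state = 1.0, None          # 100x overestimate
# for k in range(50):
#     x_hat, state = gamp_step(y, A, sigma2_hat, state)  # E-step (1 iteration)
#     sigma2_hat = em_sigma2_update(y, A, x_hat)         # M-step
```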

ex-ch19-03

Medium

(GAMP for 1-bit compressed sensing)

(a) Generate a sparse signal ($N = 500$, $\rho = 0.1$) and 1-bit measurements $y_m = \text{sign}(\mathbf{a}_m^T\mathbf{x} + d_m)$, where $d_m \sim \mathcal{N}(0, 0.05)$ is the dither.

(b) Implement the probit output function $g_{\text{out}}(y, \hat{p}, \tau_p) = y\,\phi(u) \big/ \big[(\Phi(yu) + \epsilon)\sqrt{\sigma^2 + \tau_p}\big]$, where $u = \hat{p}/\sqrt{\sigma^2 + \tau_p}$, $\phi$ and $\Phi$ are the standard normal pdf and cdf, and $\epsilon$ is a small stabilizing constant. (A sketch follows part (e).)

(c) Run GAMP with this output function. Compare with (i) treating $y_m$ as a continuous Gaussian measurement (a mismatched likelihood) and (ii) BIHT (Binary Iterative Hard Thresholding).

(d) Sweep $M/N$ from 0.5 to 5.0. At what oversampling ratio does 1-bit GAMP achieve NMSE $< -15$ dB?

(e) How much extra oversampling is needed compared to standard CS (full-precision measurements at the same SNR)?
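A sketch of the part-(b) channel, using `scipy.stats.norm` for $\phi$ and $\Phi$. The negative-derivative line follows from differentiating $g_{\text{out}}$ with respect to $\hat{p}$ (verify it as part of your derivation); names are illustrative:

```python
import numpy as np
from scipy.stats import norm

def gout_probit(y, p_hat, tau_p, sigma2, eps=1e-12):
    """Probit output channel for 1-bit measurements y in {-1, +1}.

    g_out = y * phi(u) / ((Phi(y*u) + eps) * sqrt(sigma2 + tau_p)),
    with u = p_hat / sqrt(sigma2 + tau_p) and sigma2 the dither variance.
    """
    s = np.sqrt(sigma2 + tau_p)
    u = p_hat / s
    g = y * norm.pdf(u) / ((norm.cdf(y * u) + eps) * s)
    neg_dg = g * (g + u / s)   # -d g_out / d p_hat, for the variance update
    return g, neg_dg
```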

ex-ch19-04

Medium

(EM-GAMP with all parameters unknown)

Implement EM-GAMP for simultaneous estimation of $(\sigma^2, \rho, \sigma_x^2)$. (An M-step sketch for $(\rho, \sigma_x^2)$ follows part (d).)

(a) Use $N = 1000$, $M = 500$. True values: $\rho = 0.08$, $\sigma_x^2 = 2.0$, $\sigma^2 = 0.02$. Initial values: $\hat{\rho} = 0.5$, $\hat{\sigma}_x^2 = 0.5$, $\hat{\sigma^2} = 0.5$.

(b) Plot all three parameter estimates vs. EM iteration (30 total), alongside the true values.

(c) Run 20 random initializations. Plot the distribution of final NMSE across runs. Do all initializations converge to the same solution?

(d) Identify the SNR threshold below which EM-GAMP fails to correctly estimate the parameters. Test SNR $\in \{5, 10, 15, 20, 25\}$ dB.
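A sketch of the M-step for $(\rho, \sigma_x^2)$ in the style of EM-BG-GAMP, assuming the E-step exposes per-coefficient posterior inclusion probabilities and the conditional moments of the active component (all names are illustrative; the $\sigma^2$ update is as in ex-ch19-02):

```python
import numpy as np

def em_bg_update(pi, gamma, nu):
    """M-step for the Bernoulli-Gaussian prior parameters.

    pi[i]    : posterior probability that x_i is active (GAMP E-step)
    gamma[i] : posterior mean of x_i given that it is active
    nu[i]    : posterior variance of x_i given that it is active
    """
    rho_hat = np.mean(pi)                                  # sparsity estimate
    sigx2_hat = np.sum(pi * (gamma**2 + nu)) / np.sum(pi)  # active-coef variance
    return rho_hat, sigx2_hat
```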

ex-ch19-05

Hard

(State evolution for GAMP)

(a) Implement the GAMP state evolution (SE) recursion for the Gaussian likelihood with a Bernoulli-Gaussian prior. For i.i.d. Gaussian $\mathbf{A}$:

$$
\tau_p^{t+1} = \frac{\tau_x^t}{\delta}, \qquad \tau_x^{t+1} = \mathbb{E}_{X_0, Z}\!\left[\left(g_{\text{in}}\!\left(X_0 + \sqrt{\tau_r^t}\,Z\right) - X_0\right)^2\right],
$$

where the expectation is over $X_0 \sim p_0$ and $Z \sim \mathcal{N}(0,1)$, and where, for the Gaussian likelihood, the denoiser's effective noise level is $\tau_r^t = \sigma^2 + \tau_p^t$. (A Monte Carlo sketch of the recursion follows part (e).)

(b) Implement GAMP SE for the probit (1-bit) output channel. Predict the MSE vs. $M/N$ for 1-bit CS at $\rho = 0.1$.

(c) Run actual GAMP for $N = 2000$ and compare with the SE predictions. How accurately does SE predict the empirical MSE?

(d) Use GAMP SE to compute the phase transition for 1-bit CS: the minimum $M/N$ for NMSE $< -10$ dB, as a function of the sparsity $\rho \in [0.05, 0.4]$.

(e) Compare the 1-bit phase transition with the standard (full-precision) phase transition. What is the "quantization penalty" in required measurements?
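A Monte Carlo sketch of the part-(a) recursion under the stated assumptions (Gaussian likelihood, so $\tau_r^t = \sigma^2 + \tau_p^t$, with the Bernoulli-Gaussian MMSE denoiser as $g_{\text{in}}$). Gaussian quadrature would be faster, but this version is the easiest to check:

```python
import numpy as np

def gin_bg(r, tau_r, rho, sigx2):
    """MMSE denoiser for the prior rho*N(0, sigx2) + (1 - rho)*delta_0."""
    v1 = sigx2 + tau_r
    num = rho * np.exp(-0.5 * r**2 / v1) / np.sqrt(v1)
    den = num + (1 - rho) * np.exp(-0.5 * r**2 / tau_r) / np.sqrt(tau_r)
    return (num / den) * r * sigx2 / v1   # inclusion prob * conditional mean

def se_gaussian_bg(delta, sigma2, rho, sigx2, T=30, n_mc=200_000, seed=0):
    """GAMP state evolution for the Gaussian channel, Bernoulli-Gaussian prior."""
    rng = np.random.default_rng(seed)
    x0 = (rng.random(n_mc) < rho) * rng.normal(0.0, np.sqrt(sigx2), n_mc)
    z = rng.normal(size=n_mc)
    tau_x = rho * sigx2                    # start from the prior variance
    for _ in range(T):
        tau_p = tau_x / delta
        tau_r = sigma2 + tau_p             # Gaussian-likelihood channel
        x_hat = gin_bg(x0 + np.sqrt(tau_r) * z, tau_r, rho, sigx2)
        tau_x = np.mean((x_hat - x0)**2)   # MSE becomes the new tau_x
    return tau_x

# Example: delta = M/N = 0.4, sigma2 = 1e-3, rho = 0.1.
print(se_gaussian_bg(delta=0.4, sigma2=1e-3, rho=0.1, sigx2=1.0))
```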

ex-ch19-06

Hard

(Multi-layer inference for dictionary learning)

(a) Generate a dictionary $\mathbf{D} \in \mathbb{R}^{50 \times 100}$ (i.i.d. Gaussian, column-normalized) and $L = 200$ sparse coefficient vectors $\mathbf{s}_\ell$ ($\rho = 0.1$, i.i.d. BG). Observations: $\mathbf{Y} = \mathbf{D}\mathbf{S} + \mathbf{W}$ ($\sigma^2 = 0.01$).

(b) Implement a simplified BiG-AMP: alternate between (i) GAMP to estimate $\mathbf{S}$ given $\hat{\mathbf{D}}$ and (ii) a gradient step to update $\hat{\mathbf{D}}$ given $\hat{\mathbf{S}}$. (A skeleton follows part (e).)

(c) Initialize with a random $\hat{\mathbf{D}}$ (column-normalized Gaussian). Run for 100 outer iterations. Plot the dictionary recovery error $\min_{\mathbf{P}} \|\hat{\mathbf{D}} - \mathbf{D}\mathbf{P}\|_F$ vs. iteration.

(d) Vary $L$ from 50 to 500. How many observations are needed for reliable dictionary recovery ($< 5\%$ column error)?

(e) Compare with K-SVD on the same problem.
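A skeleton of the part-(b) alternation, assuming a hypothetical `gamp_solve` sparse-coding routine built from the earlier exercises; the column-matching error below is a greedy proxy for the permutation-minimized distance in part (c):

```python
import numpy as np

def update_dictionary(D_hat, Y, S_hat, lr=1e-2):
    """Gradient step on ||Y - D S||_F^2 w.r.t. D, then renormalize columns."""
    grad = -(Y - D_hat @ S_hat) @ S_hat.T
    D_new = D_hat - lr * grad
    return D_new / np.linalg.norm(D_new, axis=0, keepdims=True)

def dict_error(D_hat, D_true):
    """Greedy, sign/permutation-invariant column mismatch (0 = perfect)."""
    C = np.abs(D_true.T @ D_hat)      # |correlation| between column pairs
    return 1.0 - C.max(axis=1).mean()

# Outer loop of part (b) (gamp_solve is hypothetical):
# for it in range(100):
#     S_hat = gamp_solve(Y, D_hat, rho, sigma2)    # (i) sparse coding
#     D_hat = update_dictionary(D_hat, Y, S_hat)   # (ii) dictionary step
```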

ex-ch19-07

Hard

(EM-GAMP for RF imaging with unknown noise)

Build a complete EM-GAMP pipeline for RF imaging with a physical sensing matrix.

(a) System: $N_t = 4$, $N_r = 8$, $N_f = 32$ frequencies. Sensing matrix $\mathbf{A}$: a frequency-domain MIMO steering matrix (not i.i.d. Gaussian). Scene: an $8 \times 8$ grid, Bernoulli-Gaussian, $\rho = 0.1$.

(b) EM-GAMP: Implement with the Gaussian likelihood, learning $(\sigma^2, \rho, \sigma_x^2)$. Note: since $\mathbf{A}$ is structured (not i.i.d. Gaussian), use VAMP (Chapter 18) as the inner solver rather than AMP.

(c) Comparison: Test (i) EM-GAMP, (ii) oracle GAMP (true parameters), (iii) GAMP with cross-validated parameters, and (iv) FISTA with a cross-validated $\lambda$.

(d) Noise robustness: Sweep SNR $\in \{5, 10, 15, 20, 25, 30\}$ dB. Plot the NMSE and the parameter-estimation error vs. SNR for all methods.

(e) Analysis: At what SNR does EM-GAMP gain more than 3 dB over mismatched GAMP? What is the dominant source of performance degradation at low SNR?

ex-ch19-08

Challenge

(Poisson GAMP for photon-limited sensing)

(a) Model a photon-counting RF/optical sensing system: $y_m \sim \text{Poisson}(\lambda_m)$, where $\lambda_m = \exp(\mathbf{a}_m^T\mathbf{x} + b)$ and $b = 1$ is the background intensity.

(b) Derive the Poisson output function via a Laplace approximation, $\hat{z}_{\text{post}} \approx \hat{p} + \tau_p\left(y - e^{\hat{p} + \tau_p/2}\right)$, and implement $g_{\text{out}}$. (A sketch follows part (e).)

(c) Generate a non-negative sparse signal ($N = 500$, $\rho = 0.15$, entries $\sim \text{Exp}(1)$) and Poisson measurements. Run GAMP with a Bernoulli-Exponential prior (non-negative support).

(d) Compare with (i) standard GAMP treating $y_m$ as Gaussian with variance $\lambda_m$ (known-mean approximation) and (ii) standard GAMP treating $y_m$ as Gaussian with variance $y_m$ (empirical approximation).

(e) Vary the background intensity $b$ from 0 to 3. At what background level does the Gaussian approximation become adequate (NMSE gap $< 1$ dB)?
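Under the part-(b) approximation the output function collapses to a simple form, since $g_{\text{out}} = (\hat{z}_{\text{post}} - \hat{p})/\tau_p = y - e^{\hat{p} + \tau_p/2}$. A minimal sketch (the clipping guard is an implementation detail, not part of the derivation):

```python
import numpy as np

def gout_poisson(y, p_hat, tau_p):
    """Poisson output channel from the Laplace-style approximation of part (b):
    z_post ~ p_hat + tau_p*(y - exp(p_hat + tau_p/2)), hence
    g_out = (z_post - p_hat)/tau_p = y - exp(p_hat + tau_p/2)."""
    lam = np.exp(np.clip(p_hat + 0.5 * tau_p, -30, 30))  # guard against overflow
    g = y - lam
    neg_dg = lam          # -d g_out / d p_hat
    return g, neg_dg
```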

ex-ch19-09

Easy

(Identifying valid GAMP output channels)

For each of the following measurement models, state whether GAMP can be applied directly (i.e., whether $g_{\text{out}}$ can be computed in closed form or efficiently approximated) and write the key formula for $g_{\text{out}}$:

(a) $p(y|z) = \mathcal{N}(y; z, \sigma^2)$ (additive Gaussian).

(b) $p(y|z) = \text{Bernoulli}(\sigma(z))$, where $\sigma(z) = (1+e^{-z})^{-1}$.

(c) $p(y|z) = \text{Poisson}(e^z)$.

(d) $p(y|z) = \delta(y - |z|^2)$ (noiseless power measurement).

(e) $p(y|z)$ is a Gaussian mixture: $\sum_k w_k \mathcal{N}(y; z, \sigma_k^2)$.

ex-ch19-10

Easy

(EM-GAMP M-step derivation)

Starting from the Q-function:

$$
Q(\boldsymbol{\theta}) = -M\log(\pi\sigma^2) - \frac{1}{\sigma^2}\,\mathbb{E}_q\|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 + \sum_i \mathbb{E}_q[\log p_0(x_i; \rho, \sigma_x^2)],
$$

where $q(\mathbf{x}) = \prod_i \mathcal{N}(x_i; \hat{x}_i, \tau_{x,i})$:

(a) Derive $\hat{\sigma^2} = \arg\max_{\sigma^2} Q(\boldsymbol{\theta})$.

(b) Show that $\mathbb{E}_q\|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 = \|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2 + \sum_i \tau_{x,i}\|[\mathbf{A}]_{:,i}\|^2$. (A numerical check is sketched after part (c).)

(c) For a Bernoulli-Gaussian prior, show that the M-step for $\rho$ gives $\hat{\rho} = \frac{1}{N}\sum_i \pi_i$, where $\pi_i$ is the posterior inclusion probability from the GAMP E-step.
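The part-(b) identity is easy to sanity-check numerically before deriving it: draw $x_i \sim \mathcal{N}(\hat{x}_i, \tau_{x,i})$ and compare the Monte Carlo average against the closed form. A minimal sketch with arbitrary test data:

```python
import numpy as np

# Monte Carlo check of the part-(b) identity: for x_i ~ N(x_hat_i, tau_i),
# E||y - A x||^2 = ||y - A x_hat||^2 + sum_i tau_i * ||a_i||^2.
rng = np.random.default_rng(0)
M, N = 20, 30
A = rng.normal(size=(M, N))
y = rng.normal(size=M)
x_hat = rng.normal(size=N)
tau = rng.random(N)

closed = np.sum((y - A @ x_hat)**2) + np.sum(tau * np.sum(A**2, axis=0))

x_samp = x_hat + np.sqrt(tau) * rng.normal(size=(100_000, N))
mc = np.mean(np.sum((y - x_samp @ A.T)**2, axis=1))
print(closed, mc)   # should agree up to Monte Carlo error
```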

ex-ch19-11

Medium

(GAMP damping and convergence)

GAMP's convergence can be improved by damping: replace the update $\hat{s}_m^{t+1} \leftarrow g_{\text{out}}$ with $\hat{s}_m^{t+1} \leftarrow \alpha\,g_{\text{out}} + (1-\alpha)\,\hat{s}_m^t$. (A sketch follows part (d).)

(a) Implement GAMP with a damping parameter $\alpha \in (0, 1]$ for the 1-bit probit output channel.

(b) Run for $N = 300$, $M = 250$, $\rho = 0.1$, 50 iterations. Test $\alpha \in \{0.3, 0.5, 0.7, 1.0\}$ (the last being undamped).

(c) Plot the residual $\|\hat{p}^t - \hat{p}^{t-1}\|_2$ vs. iteration for each $\alpha$. Which value gives the best convergence?

(d) Explain why $\alpha = 1.0$ (undamped) can diverge for the probit model but not for the Gaussian model.
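The damping step itself is one line; a minimal sketch, with the surrounding GAMP quantities (e.g. `gout_probit` from ex-ch19-03) left hypothetical:

```python
def damped_update(s_prev, g_new, alpha):
    """Part (a): convex combination of the new output value and the old iterate."""
    return alpha * g_new + (1 - alpha) * s_prev

# Inside the GAMP loop (names hypothetical):
# g, _ = gout_probit(y, p_hat, tau_p, sigma2)
# s_hat = damped_update(s_hat, g, alpha=0.5)
# residual = np.linalg.norm(p_hat - p_hat_prev)   # curve for part (c)
```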

ex-ch19-12

Medium

(Multi-layer model: dimensionality and reconstruction)

Consider a two-layer generative model $\mathbf{c} = \text{ReLU}(\mathbf{D}\mathbf{z})$, where $\mathbf{D} \in \mathbb{R}^{N \times K}$ is a known dictionary and $\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_K)$.

(a) For $N = 256$, $K = 64$, $M = 50$: can the latent code $\mathbf{z}$ be recovered? State the information-theoretic condition.

(b) Generate $\mathbf{z}$, compute $\mathbf{c}$, and observe $\mathbf{y} = \mathbf{A}\mathbf{c} + \mathbf{w}$ with $M = 50$. Attempt recovery by (i) optimizing $\mathbf{z}$ to minimize $\|\mathbf{y} - \mathbf{A}\,\text{ReLU}(\mathbf{D}\mathbf{z})\|^2$ and (ii) using standard GAMP on the full scene $\mathbf{c}$, which ignores the structure. (A sketch of (i) follows part (d).)

(c) Compare NMSE for both methods. Explain the difference.

(d) What happens when $K > M$? When $K = N$?
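A sketch of approach (b)(i), minimizing over $\mathbf{z}$ with L-BFGS-B and an analytic subgradient through the ReLU. The added $\|\mathbf{z}\|^2$ term is the MAP penalty implied by $\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_K)$; drop it for the plain least-squares objective stated in part (b):

```python
import numpy as np
from scipy.optimize import minimize

def recover_latent(y, A, D, K, seed=0):
    """Fit z by minimizing ||y - A ReLU(D z)||^2 + ||z||^2 (MAP under z ~ N(0, I))."""
    def loss_grad(z):
        u = D @ z
        c = np.maximum(u, 0.0)                 # ReLU(D z)
        r = A @ c - y
        loss = r @ r + z @ z
        grad = 2 * D.T @ ((u > 0) * (A.T @ r)) + 2 * z
        return loss, grad

    z0 = np.random.default_rng(seed).normal(size=K)
    res = minimize(loss_grad, z0, jac=True, method="L-BFGS-B")
    return res.x
```

Because the objective is nonconvex, restarting from several random `z0` values and keeping the best fit is a reasonable precaution.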

ex-ch19-13

Medium

(GAMP vs VAMP for physical sensing matrices)

A practical RF imaging system uses a MIMO sensing matrix $\mathbf{A} = \mathbf{F}_{\text{freq}} \otimes \mathbf{G}_{\text{array}}$ (a Kronecker product, not i.i.d. Gaussian).

(a) Explain why standard GAMP state evolution does not apply to this matrix.

(b) Generate a $128 \times 256$ Kronecker sensing matrix. Run (i) AMP and (ii) VAMP on the same sparse recovery problem ($\rho = 0.1$, SNR = 20 dB).

(c) Plot NMSE vs. iteration for AMP and VAMP. Which converges faster and why?

(d) For this matrix, apply EM-GAMP (using VAMP as the inner solver). Does the EM M-step still apply in the same form?

(e) What is the computational saving of Kronecker VAMP (FFT-based) vs. dense SVD-based VAMP for $N_t = 8$, $N_r = 8$, $N_f = 64$? (A matrix-free Kronecker product-vector sketch follows.)
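The saving in part (e) comes from never forming $\mathbf{A}$ densely. A minimal sketch of the matrix-free identity $(\mathbf{F} \otimes \mathbf{G})\,\mathrm{vec}(\mathbf{X}) = \mathrm{vec}(\mathbf{G}\mathbf{X}\mathbf{F}^T)$, verified against the dense product on a small example:

```python
import numpy as np

def kron_matvec(F, G, x):
    """Apply (F kron G) to x without forming the dense Kronecker product.

    Uses (F kron G) vec(X) = vec(G X F^T), with x = vec(X) in column-major
    (Fortran) ordering.
    """
    p, q = F.shape
    m, n = G.shape
    X = x.reshape(n, q, order="F")
    return (G @ X @ F.T).reshape(-1, order="F")

# Check against the dense product.
rng = np.random.default_rng(0)
F, G = rng.normal(size=(3, 4)), rng.normal(size=(5, 6))
x = rng.normal(size=4 * 6)
assert np.allclose(kron_matvec(F, G, x), np.kron(F, G) @ x)
```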

ex-ch19-14

Hard

(EM for Gaussian mixture prior)

The Gaussian mixture (GM) prior generalizes BG:

$$
p_0(x) = \sum_{\ell=1}^{L} w_\ell\,\mathcal{CN}(x; \mu_\ell, \sigma_\ell^2).
$$

(a) Derive the EM M-step for the mixture weights $\{w_\ell\}$, means $\{\mu_\ell\}$, and variances $\{\sigma_\ell^2\}$. (A sketch follows part (d).)

(b) Implement EM-GM-GAMP with $L = 3$ components for a sparse signal whose non-zero entries are drawn from a Gaussian mixture (two Gaussians at $\pm 2$ plus a small cluster near zero).

(c) Compare with EM-BG-GAMP (Bernoulli-Gaussian, effectively $L = 2$). Which achieves lower NMSE when the prior is correctly specified?

(d) What happens when EM-BG-GAMP is applied to data from a GM prior (prior mismatch)? Quantify the NMSE penalty.
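A sketch of the part-(a) M-step in the style of EM-GM-GAMP, assuming the E-step supplies per-coefficient component responsibilities and conditional posterior moments (all names are illustrative):

```python
import numpy as np

def em_gm_update(beta, m, v):
    """M-step for a Gaussian-mixture prior, given E-step quantities.

    beta[i, l] : responsibility of component l for coefficient i
    m[i, l]    : posterior mean of x_i conditioned on component l
    v[i, l]    : posterior variance of x_i conditioned on component l
    """
    Nl = beta.sum(axis=0)                                    # effective counts
    w = Nl / beta.shape[0]                                   # mixture weights
    mu = (beta * m).sum(axis=0) / Nl                         # component means
    sig2 = (beta * (np.abs(m - mu)**2 + v)).sum(axis=0) / Nl # component variances
    return w, mu, sig2
```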

ex-ch19-15

Challenge

(Complete self-tuning RF imaging system)

Design and implement a complete self-tuning Bayesian RF imaging system combining all three elements from this chapter.

System specification:

  • MIMO radar: $N_t = 4$, $N_r = 8$, $N_f = 32$ frequencies.
  • Scene: a $16 \times 16$ grid ($N = 256$ voxels), BG prior.
  • Physical sensing matrix (Kronecker structure).

(a) Implement EM-VAMP (VAMP as the inner solver, EM as the outer loop) to learn $(\sigma^2, \rho, \sigma_x^2)$ automatically.

(b) Extend to 1-bit receivers: replace the Gaussian likelihood with the probit output channel. How does the EM M-step for $\sigma^2$ change (hint: $\sigma^2$ is now the dither variance)?

(c) Test at SNR $\in \{5, 10, 15, 20\}$ dB with both full-precision and 1-bit receivers. Plot NMSE vs. SNR for EM-VAMP (full), EM-VAMP (1-bit), oracle VAMP (full), and oracle VAMP (1-bit).

(d) For the 1-bit case, does EM converge to the correct $\sigma^2$ (the dither variance)? Or does EM learn a different effective noise level?

(e) Quantify the information loss from 1-bit quantization as a function of SNR: plot the "measurement overhead" (the extra $M/N$ needed to match full-precision NMSE) vs. SNR.