GAMP: Generalized Approximate Message Passing

Beyond Gaussian Measurements

AMP, OAMP, and VAMP all assume Gaussian measurement noise: $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{w}$ with $\mathbf{w} \sim \mathcal{CN}(\mathbf{0},\sigma^2\mathbf{I})$. But many measurements in the real world are manifestly non-Gaussian: Poisson photon counts in astronomy and medical imaging, 1-bit or few-bit quantized samples from low-power ADCs, log-linear counts in spectroscopy, and logistic outcomes in classification.

GAMP (Generalized AMP) extends the AMP framework to generalized linear models (GLMs), where each measurement depends on $z_i = \langle \mathbf{a}_i,\mathbf{x}\rangle$ through an arbitrary per-element likelihood $p(y_i|z_i)$. The algorithm retains AMP's two-scalar state-evolution structure but adds a new component, an output channel denoiser, that handles the non-Gaussian likelihood.

The beauty of GAMP is that it treats the nonlinearity at the measurement end and the prior at the signal end symmetrically: both are handled by local MMSE steps, one in signal space, one in measurement space.

Definition:

Generalized Linear Model (GLM)

A generalized linear model is a Markov chain $\mathbf{x} \to \mathbf{z} \to \mathbf{y}$ where $\mathbf{z} = \mathbf{A}\mathbf{x}$ is the linear image of the signal, and the measurement is produced element-wise through a fixed conditional distribution

$$p(\mathbf{y}|\mathbf{z}) = \prod_{i=1}^{M} p(y_i|z_i).$$

The likelihood family $p(\cdot|\cdot)$ characterizes the GLM:

  • $p(y|z) = \mathcal{N}(y;z,\sigma^2)$ → standard linear regression;
  • $p(y|z) = \mathrm{Poisson}(\lambda = e^z)$ → Poisson regression;
  • $p(y|z) = \mathrm{Bernoulli}(\mathrm{sigmoid}(z))$ → logistic regression;
  • $p(y|z) = \Pr\{y = \mathrm{sign}(z+w)\} = \Phi(yz/\sigma_w)$ with $w \sim \mathcal{N}(0,\sigma_w^2)$ → 1-bit compressed sensing.

The GLM abstraction is a modelling unifier: once GAMP knows how to handle any such per-element likelihood, a vast range of inverse problems reduces to instances of a single algorithm with a user-supplied output channel denoiser.
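
To make the abstraction concrete, here is a minimal Python sketch of these per-element log-likelihoods (our function names, assuming NumPy and SciPy are available; each returns $\log p(y|z)$ element-wise):

```python
import numpy as np
from scipy.stats import norm, poisson

def loglik_gaussian(y, z, sigma2=1.0):
    # Standard linear regression: y = z + N(0, sigma2)
    return norm.logpdf(y, loc=z, scale=np.sqrt(sigma2))

def loglik_poisson(y, z):
    # Poisson regression with log link: y ~ Poisson(exp(z))
    return poisson.logpmf(y, mu=np.exp(z))

def loglik_logistic(y, z):
    # Logistic regression: y in {0, 1}, P(y = 1 | z) = sigmoid(z)
    return y * z - np.log1p(np.exp(z))

def loglik_one_bit(y, z, sigma_w=1.0):
    # 1-bit CS: y in {-1, +1}, y = sign(z + w), w ~ N(0, sigma_w^2)
    return norm.logcdf(y * z / sigma_w)
```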

Definition:

GAMP Iteration

With $\mathbf{A}$ having i.i.d. zero-mean entries with variance $1/M$, GAMP iterates

$$\begin{aligned} \text{output:} \quad & \hat{z}_i^{(t)} = [\mathbf{A}\hat{\mathbf{x}}^{(t)}]_i - \tau_p^{(t)} \hat{s}_i^{(t-1)}, \\ & \hat{s}_i^{(t)} = g_{\text{out}}\big(y_i, \hat{z}_i^{(t)}, \tau_p^{(t)}\big), \qquad \tau_s^{(t)} = -\big\langle \partial_{\hat{z}}\, g_{\text{out}} \big\rangle, \\ \text{input:} \quad & \hat{r}_j^{(t)} = \hat{x}_j^{(t)} + \tau_r^{(t)} [\mathbf{A}^{\mathsf{H}}\hat{\mathbf{s}}^{(t)}]_j, \\ & \hat{x}_j^{(t+1)} = g_{\text{in}}\big(\hat{r}_j^{(t)}, \tau_r^{(t)}\big), \qquad \tau_r^{(t+1)} \text{ computed from } \tau_s^{(t)}. \end{aligned}$$
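
To make the recursion concrete, here is a minimal Python sketch of the loop (our illustrative code, not a reference implementation), assuming real-valued data, i.i.d. entries of $\mathbf{A}$ with variance $1/M$ (so $\sum_j A_{ij}^2 \approx N/M$ and $\sum_i A_{ij}^2 \approx 1$), and scalar variances; `g_out_gaussian` and `g_in_bg` are example denoisers for an AWGN channel and a Bernoulli-Gaussian prior:

```python
import numpy as np
from scipy.special import expit

def gamp(A, y, g_out, g_in, n_iter=20):
    """Minimal GAMP loop with scalar (uniform) variances.

    g_out(y, z_hat, tau_p) -> (s_hat, tau_s): output-channel denoiser
    g_in(r_hat, tau_r)     -> (x_hat, tau_x): input (signal-prior) denoiser
    """
    M, N = A.shape
    x_hat, tau_x = np.zeros(N), 1.0
    s_hat = np.zeros(M)
    for _ in range(n_iter):
        # Output side: Onsager-corrected prediction of z = A x.
        tau_p = (N / M) * tau_x
        z_hat = A @ x_hat - tau_p * s_hat
        s_hat, tau_s = g_out(y, z_hat, tau_p)
        # Input side: pseudo-observation r = x + N(0, tau_r).
        tau_r = 1.0 / tau_s              # since sum_i A_ij^2 ~= 1
        r_hat = x_hat + tau_r * (A.T @ s_hat)
        x_hat, tau_x = g_in(r_hat, tau_r)
    return x_hat

def g_out_gaussian(y, z_hat, tau_p, sigma2=0.01):
    # AWGN channel: the closed-form scalar update discussed below.
    return (y - z_hat) / (sigma2 + tau_p), 1.0 / (sigma2 + tau_p)

def g_in_bg(r_hat, tau_r, rho=0.1, sigma_x2=1.0):
    # MMSE denoiser for x ~ rho * N(0, sigma_x2) + (1 - rho) * delta_0.
    v = sigma_x2 + tau_r
    log_odds = (np.log(rho / (1 - rho)) + 0.5 * np.log(tau_r / v)
                + 0.5 * r_hat**2 * sigma_x2 / (tau_r * v))
    pi = expit(log_odds)                 # posterior support probability
    m, V = r_hat * sigma_x2 / v, tau_r * sigma_x2 / v
    x_hat = pi * m
    tau_x = np.mean(pi * (V + m**2) - x_hat**2)
    return x_hat, tau_x
```

Calling `gamp(A, y, g_out_gaussian, g_in_bg)` reproduces standard AMP behavior; swapping in a different `g_out` retargets the same loop to any GLM.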

The output channel denoiser $g_{\text{out}}(y,\hat{z},\tau_p)$ is the scaled MMSE correction to the prediction $\hat{z}$ (equivalently, the score $\partial_{\hat{z}} \log \int p(y|z)\,\mathcal{N}(z;\hat{z},\tau_p)\,dz$ of the marginal likelihood),

$$g_{\text{out}}(y,\hat{z},\tau_p) = \frac{1}{\tau_p}\left(\mathbb{E}[z|y,\hat{z},\tau_p] - \hat{z}\right),$$

where the expectation is under the distribution $z \sim \mathcal{N}(\hat{z},\tau_p)$ reweighted by the likelihood $p(y|z)$. The input denoiser $g_{\text{in}}$ is the familiar signal-prior MMSE estimator.

Setting $p(y|z) = \mathcal{N}(y;z,\sigma^2)$ recovers standard AMP with $g_{\text{out}}(y,\hat{z},\tau_p) = (y - \hat{z})/(\sigma^2 + \tau_p)$. Every other likelihood simply changes this scalar Bayesian update.

Theorem: GAMP State Evolution

For i.i.d. sub-Gaussian sensing matrices $\mathbf{A}$ and any GLM likelihood with bounded output MMSE, GAMP's iterates admit, in the large-system limit, a scalar state evolution

$$\tau_p^{(t+1)} = \mathcal{F}_{\text{in}}\big(\tau_r^{(t)}; p_X\big), \qquad \tau_r^{(t+1)} = \mathcal{F}_{\text{out}}\big(\tau_p^{(t+1)}; p(y|z), p_Z\big),$$

where $\mathcal{F}_{\text{in}}$ is (up to the sampling ratio $N/M$) the MSE of the input denoiser under effective noise $\tau_r$, and $\mathcal{F}_{\text{out}}$ is the inverse Fisher information of the scalar channel $z \sim \mathcal{N}(\hat{z},\tau_p)$, $y \sim p(y|z)$.

Once again the whole algorithm collapses to two scalar numbers at each iteration: the effective noise variance on the signal side ($\tau_r$) and on the measurement side ($\tau_p$). The presence of two variances (rather than one) reflects that GLM inference is a two-stage problem: decode the linear image from the measurements, then decode the signal from the linear image.
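
To see the two-map recursion in action, here is a hedged numerical sketch for the AWGN channel and Bernoulli-Gaussian prior (our code; $\mathcal{F}_{\text{out}}$ is closed-form for the Gaussian channel, $\mathcal{F}_{\text{in}}$ is estimated by Monte Carlo, and the parameter values are illustrative):

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)
rho, sigma_x2, sigma2, delta = 0.1, 1.0, 0.01, 0.5   # delta = M/N
n_mc = 200_000

def mse_in(tau_r):
    # F_in: Monte Carlo MSE of the Bernoulli-Gaussian MMSE denoiser.
    x = rng.normal(0, np.sqrt(sigma_x2), n_mc) * (rng.random(n_mc) < rho)
    r = x + rng.normal(0, np.sqrt(tau_r), n_mc)
    v = sigma_x2 + tau_r
    log_odds = (np.log(rho / (1 - rho)) + 0.5 * np.log(tau_r / v)
                + 0.5 * r**2 * sigma_x2 / (tau_r * v))
    x_hat = expit(log_odds) * r * sigma_x2 / v
    return np.mean((x_hat - x) ** 2)

tau_p = rho * sigma_x2 / delta       # initial measurement-side variance
for t in range(15):
    tau_r = sigma2 + tau_p           # F_out: inverse Fisher info, AWGN channel
    tau_p = mse_in(tau_r) / delta    # F_in, scaled by N/M = 1/delta
    print(f"t={t:2d}  tau_r={tau_r:.3e}  tau_p={tau_p:.3e}")
```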


Example: GAMP for 1-Bit Compressed Sensing

Consider 1-bit compressed sensing: the true signal $\mathbf{x}$ is sparse, the linear image is $z_i = \langle\mathbf{a}_i,\mathbf{x}\rangle$, and each measurement is observed as $y_i = \mathrm{sign}(z_i + w_i) \in \{-1,+1\}$ with $w_i \sim \mathcal{N}(0,\sigma_w^2)$, so that $p(y_i|z_i) = \Phi(y_i z_i/\sigma_w)$. Derive the output channel denoiser $g_{\text{out}}$.
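
A worked sketch of the derivation (ours; it uses the score identity from the definition of $g_{\text{out}}$ above). With prediction prior $z \sim \mathcal{N}(\hat z, \tau_p)$ and probit likelihood $p(y|z) = \Phi(yz/\sigma_w)$:

$$\begin{aligned} Z(y,\hat z) &= \int \Phi\!\Big(\frac{yz}{\sigma_w}\Big)\,\mathcal{N}(z;\hat z,\tau_p)\,dz = \Phi(\eta), \qquad \eta = \frac{y\hat z}{\sqrt{\sigma_w^2+\tau_p}},\\ \mathbb{E}[z\mid y,\hat z,\tau_p] &= \hat z + \tau_p\,\partial_{\hat z}\log Z = \hat z + \frac{y\,\tau_p}{\sqrt{\sigma_w^2+\tau_p}}\cdot\frac{\phi(\eta)}{\Phi(\eta)},\\ g_{\text{out}}(y,\hat z,\tau_p) &= \frac{\mathbb{E}[z\mid y,\hat z,\tau_p]-\hat z}{\tau_p} = \frac{y}{\sqrt{\sigma_w^2+\tau_p}}\cdot\frac{\phi(\eta)}{\Phi(\eta)}. \end{aligned}$$

Note the ratio $\phi(\eta)/\Phi(\eta)$: it decays to zero when $\hat z$ agrees strongly with the observed sign and grows when it disagrees, the saturating behavior visualized in the denoiser demo below.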

Definition:

EM-GAMP for Automatic Hyperparameter Tuning

EM-GAMP interleaves GAMP iterations with EM updates of unknown hyperparameters $\boldsymbol{\theta} = (\rho, \sigma_x^2, \sigma^2)$:

$$\boldsymbol{\theta}^{(t+1)} = \arg\max_{\boldsymbol{\theta}} \, \mathbb{E}_{p(\mathbf{x}|\mathbf{y};\boldsymbol{\theta}^{(t)})}\big[\log p(\mathbf{x},\mathbf{y};\boldsymbol{\theta})\big],$$

where the expectation over the posterior is approximated by the marginals produced at the end of the last GAMP run. The E-step is free, since GAMP already supplies the marginal moments, and the M-step reduces to closed-form updates for common parametric priors.

In practice one runs $K$ GAMP iterations, updates $\boldsymbol{\theta}$ once, and repeats. Convergence is typically reached in 5-20 outer iterations. EM-GAMP is a workhorse when priors are unknown but parametric.
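
For the Bernoulli-Gaussian prior used throughout, the M-step is indeed closed-form. A hedged Python sketch (the function and its argument convention are ours; it assumes the last GAMP run exposes per-coefficient posterior support probabilities `pi`, conditional means `m` and variances `V` of the nonzero component, and posterior moments `z_hat`, `tau_z` of $\mathbf{z} = \mathbf{A}\mathbf{x}$):

```python
import numpy as np

def em_update_bg(pi, m, V, y, z_hat, tau_z):
    """One EM M-step for a Bernoulli-Gaussian prior and AWGN likelihood."""
    rho = np.mean(pi)                                # expected fraction of nonzeros
    sigma_x2 = np.sum(pi * (m**2 + V)) / np.sum(pi)  # 2nd moment of active coeffs
    sigma2 = np.mean((y - z_hat)**2 + tau_z)         # AWGN noise-variance update
    return rho, sigma_x2, sigma2
```

In the outer loop, the updated values are fed back into the input and output denoisers for the next $K$ GAMP iterations.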

1-Bit Compressed Sensing with GAMP

Recover a sparse signal from 1-bit quantized measurements as a function of the oversampling ratio $M/N$. Compare GAMP against the AMP baseline that treats 1-bit measurements as real-valued (a model mismatch).

[Interactive demo. Parameters: 400, 0.1, 20.]
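
A self-contained sketch of this experiment (our code, not the page's interactive widget; the values mirror the parameters listed above, and the mismatched AMP baseline would amount to swapping `g_out_1bit` for a Gaussian `g_out`):

```python
import numpy as np
from scipy.stats import norm
from scipy.special import expit

rng = np.random.default_rng(1)
N, rho, n_iter = 400, 0.1, 20
sigma_w2, sigma_x2 = 0.01, 1.0

def g_out_1bit(y, z_hat, tau_p):
    # Probit output denoiser from the worked example above.
    c = np.sqrt(sigma_w2 + tau_p)
    eta = y * z_hat / c
    ratio = norm.pdf(eta) / np.maximum(norm.cdf(eta), 1e-12)
    s = y * ratio / c
    tau_s = np.mean(ratio * (eta + ratio)) / c**2   # -<d g_out / d z_hat>
    return s, tau_s

def g_in_bg(r, tau_r):
    # Same Bernoulli-Gaussian MMSE denoiser as in the GAMP sketch above.
    v = sigma_x2 + tau_r
    log_odds = (np.log(rho / (1 - rho)) + 0.5 * np.log(tau_r / v)
                + 0.5 * r**2 * sigma_x2 / (tau_r * v))
    pi = expit(log_odds)
    m, V = r * sigma_x2 / v, tau_r * sigma_x2 / v
    x_hat = pi * m
    tau_x = np.mean(pi * (V + m**2) - x_hat**2)
    return x_hat, tau_x

for M in (200, 400, 800, 1600):                      # sweep oversampling M/N
    A = rng.normal(0, 1 / np.sqrt(M), (M, N))
    x = rng.normal(0, np.sqrt(sigma_x2), N) * (rng.random(N) < rho)
    y = np.sign(A @ x + rng.normal(0, np.sqrt(sigma_w2), M))
    x_hat, tau_x, s_hat = np.zeros(N), rho * sigma_x2, np.zeros(M)
    for _ in range(n_iter):
        tau_p = (N / M) * tau_x
        z_hat = A @ x_hat - tau_p * s_hat
        s_hat, tau_s = g_out_1bit(y, z_hat, tau_p)
        tau_r = 1.0 / tau_s
        x_hat, tau_x = g_in_bg(x_hat + tau_r * (A.T @ s_hat), tau_r)
    # 1-bit measurements lose amplitude; compare up to scale.
    corr = abs(x_hat @ x) / (np.linalg.norm(x_hat) * np.linalg.norm(x) + 1e-12)
    print(f"M/N = {M/N:.1f}: normalized correlation = {corr:.3f}")
```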

Output Denoisers for Common GLMs

Visualize the output channel denoiser $g_{\text{out}}(y,\hat{z},\tau_p)$ as a function of $\hat{z}$ for three common likelihoods: Gaussian, Poisson, and 1-bit. Notice how the 1-bit denoiser saturates while the Gaussian one is linear.

[Interactive demo. Parameter: effective noise variance on $z$, default 1.]
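
A sketch of such a plot (our code; the $y$ values and channel parameters are arbitrary illustrative choices, and the Poisson curve is computed by numerical quadrature since its posterior mean has no closed form):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

tau_p = 1.0                          # effective noise variance on z
z_hat = np.linspace(-4, 4, 400)
zg = np.linspace(-10, 10, 2001)      # integration grid for the Poisson case

def g_out_numeric(loglik, z_hat, tau_p):
    # Generic g_out via quadrature: posterior mean of z under
    # N(z; z_hat, tau_p) reweighted by the likelihood p(y|z).
    out = np.empty_like(z_hat)
    for k, zh in enumerate(z_hat):
        w = np.exp(loglik(zg) - 0.5 * (zg - zh)**2 / tau_p)
        out[k] = ((zg * w).sum() / w.sum() - zh) / tau_p
    return out

# Gaussian channel, y = 0.5, sigma^2 = 0.1: closed form, linear in z_hat.
g_gauss = (0.5 - z_hat) / (0.1 + tau_p)
# Poisson channel, y = 3, lambda = exp(z): numerical.
g_poiss = g_out_numeric(lambda z: 3 * z - np.exp(z), z_hat, tau_p)
# 1-bit channel, y = +1, sigma_w = 0.1: closed form from the worked example.
c = np.sqrt(0.1**2 + tau_p)
eta = z_hat / c
g_1bit = norm.pdf(eta) / norm.cdf(eta) / c

for g, lbl in [(g_gauss, "Gaussian"), (g_poiss, "Poisson"), (g_1bit, "1-bit")]:
    plt.plot(z_hat, g, label=lbl)
plt.xlabel(r"$\hat z$")
plt.ylabel(r"$g_{\rm out}$")
plt.legend()
plt.show()
```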
🚨 Critical Engineering Note

GAMP Is More Fragile Than OAMP

Even on i.i.d. Gaussian sensing matrices, GAMP frequently diverges without damping. The two coupled transfer functions $(\mathcal{F}_{\text{in}}, \mathcal{F}_{\text{out}})$ can create slowly converging or oscillating loops, particularly for heavy-tailed priors and strongly nonlinear output channels (1-bit, Poisson with low counts).

Standard remedies:

  • Damping on both scalar states with factor $\beta \approx 0.5$–$0.8$ (see the sketch after this list).
  • Variance flooring: clamp $\tau_p, \tau_r$ away from zero.
  • "Mean-only" damping: damp the vector iterates $\hat{\mathbf{x}}, \hat{\mathbf{s}}$ but not the precisions.
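
A minimal sketch of the first two remedies, with illustrative damping factor and floor value (names match the GAMP loop sketched earlier):

```python
def damp(new, old, beta=0.7):
    # Convex-combination damping; beta = 1 recovers the undamped update.
    return beta * new + (1 - beta) * old

# Inside the GAMP loop (names as in the sketch above):
#   s_hat = damp(s_new, s_hat)                  # damp vector iterates
#   x_hat = damp(x_new, x_hat)
#   tau_p = max(damp(tau_p_new, tau_p), 1e-10)  # damp and floor the variance
```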

For structured sensing matrices, the right-rotationally-invariant (RRI) counterparts of GAMP ("GVAMP") are more robust and should be preferred in production.

Common Mistake: Output Channel Sign Convention

Mistake:

Confusing the sign of $g_{\text{out}}$ with the sign of the residual $y-\hat{z}$. For Gaussian channels they agree; for Poisson or 1-bit channels the scaling is different and the sign can differ relative to the assumed operating point.

Correction:

Derive $g_{\text{out}}$ directly from the definition $g_{\text{out}}(y,\hat{z},\tau_p) = (\mathbb{E}[z|y,\hat{z},\tau_p] - \hat{z})/\tau_p$ and verify on a simple test case (scalar Gaussian likelihood with known SNR) that the output matches the closed-form solution. Then trust the derivation in the generalized case.
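
Such a check takes a few lines. A sketch (ours): estimate $\mathbb{E}[z|y,\hat z,\tau_p]$ by weighted Monte Carlo for a Gaussian likelihood and compare with the closed form $(y-\hat z)/(\sigma^2+\tau_p)$ from earlier:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, tau_p, y, z_hat = 0.5, 2.0, 1.3, -0.4

# Sample z from the prediction prior N(z_hat, tau_p), weight by the
# likelihood p(y|z), and form the posterior mean of z.
z = rng.normal(z_hat, np.sqrt(tau_p), 1_000_000)
w = np.exp(-0.5 * (y - z)**2 / sigma2)
g_mc = ((z * w).sum() / w.sum() - z_hat) / tau_p

g_closed = (y - z_hat) / (sigma2 + tau_p)   # AWGN closed form
print(g_mc, g_closed)                        # should agree to ~3 decimals
```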

Historical Note: Rangan and the Birth of GAMP

2011-2013

Sundeep Rangan's 2011 ISIT paper "Generalized Approximate Message Passing for Estimation with Random Linear Mixing" marked the transition of AMP from a specialized sparse-recovery algorithm to a general-purpose Bayesian inference engine. The key insight, that the Onsager-corrected residual on the measurement side could be reused for any generalized linear model, opened the door to a flood of applications in quantized sensing, phase retrieval, and machine learning.

The subsequent EM-GAMP variant (Vila & Schniter, 2013) made GAMP parameter-free, eliminating the need to specify the prior or noise level in advance. For a decade GAMP was the default engine for Bayesian compressed sensing in applications ranging from hyperspectral unmixing to quantitative MRI.

GAMP (Generalized AMP)

An extension of AMP to generalized linear models with per-element likelihoods. Adds an output-channel denoiser to handle non-Gaussian measurements (1-bit, Poisson, etc.) while preserving the scalar state-evolution structure of AMP.

Related: Generalized linear model (GLM), EM-GAMP

Generalized linear model (GLM)

A statistical model where the measurement $y_i$ depends on a linear image $z_i = \langle\mathbf{a}_i,\mathbf{x}\rangle$ only through a fixed per-element likelihood $p(y_i|z_i)$. Generalizes linear regression to Poisson, logistic, quantized, and many other observation models.

Related: GAMP (Generalized AMP)

EM-GAMP

A variant of GAMP that alternates message-passing iterations with EM updates of the prior and noise hyperparameters. Enables automatic tuning without prior specification of $\rho$, $\sigma_x^2$, or $\sigma^2$.

Related: GAMP (Generalized AMP)

Quick Check

In GAMP, which of the following determines the output denoiser $g_{\text{out}}$?

  • The signal prior $p_X(\mathbf{x})$ only.
  • The per-element measurement likelihood $p(y|z)$ and the current state $\tau_p$.
  • The structure of the sensing matrix $\mathbf{A}$.
  • The signal sparsity $\rho$.

Why This Matters: 1-Bit GAMP in Massive MIMO

Massive MIMO uplinks with 1-bit ADCs are a rare wireless setting where a non-Gaussian likelihood dominates the receiver design. Each antenna's ADC outputs one bit per real dimension per sampling instant: the signs of the in-phase and quadrature components of the received signal plus noise. Channel estimation in this regime is exactly the 1-bit compressed-sensing problem, and GAMP with the 1-bit output denoiser is competitive with much more expensive algorithms.

The CommIT group's activity-detection pipeline (Ch 14) leverages this observation: the two-step GAMP structure (output denoiser for the quantized measurement, input denoiser for the sparse user activity) gives a principled low-complexity receiver that the group has extended to unsourced random access with large numbers of potential users.

See the full treatment in Chapter 14, Section sec-activity-detection.