EM-GAMP for RF Imaging

The Hyperparameter Problem in Bayesian Inference

The VAMP algorithm from Chapter 18 achieves near-optimal reconstruction when the prior distribution is correctly specified, but this requires knowing the sparsity rate $\rho$, the signal variance $\sigma_x^2$, and the noise variance $\sigma^2$.

In any real RF imaging deployment, these parameters are unknown and vary with scene statistics and hardware noise floor. Manual tuning by cross-validation requires running the reconstruction many times and does not scale.

EM-GAMP solves this by treating the scene vector $\mathbf{x}$ as a latent variable and the hyperparameters $\boldsymbol{\theta} = (\sigma^2, \rho, \sigma_x^2)$ as unknowns to be estimated by the Expectation-Maximization (EM) algorithm. The result is a self-tuning Bayesian algorithm that converges to near-oracle performance from arbitrary initialization.

Historical Note: Origins of Expectation-Maximization

1977–2014

The EM algorithm was formally unified by Dempster, Laird, and Rubin in their landmark 1977 JRSS-B paper, though special cases had been used for decades (e.g., Baum–Welch for HMMs, 1970). The key insight was the Q-function — the expected complete-data log-likelihood — which EM monotonically maximizes, guaranteeing non-decreasing marginal likelihood.

The integration of EM with approximate message passing for sparse recovery was developed independently by Vila & Schniter (EM-GM-GAMP, 2013) and Kamilov et al. (parametric GAMP, 2014), who proved consistency of the EM parameter estimates in the large-system limit.

Definition: The EM-GAMP Model

The EM-GAMP model couples the RF imaging observation model with parameterized priors:

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{w}, \qquad \mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I}),$$

$$x_i \sim p_0(x_i; \boldsymbol{\theta}_{\text{prior}}) = (1 - \rho)\,\delta(x_i) + \rho\,\mathcal{CN}(x_i; 0, \sigma_x^2),$$

where $\boldsymbol{\theta} = (\sigma^2, \rho, \sigma_x^2)$ is the hyperparameter vector.

The complete-data likelihood (treating $\mathbf{x}$ as known) is:

$$p(\mathbf{y}, \mathbf{x}; \boldsymbol{\theta}) = \mathcal{CN}(\mathbf{y}; \mathbf{A}\mathbf{x}, \sigma^2\mathbf{I}) \cdot \prod_{i=1}^{N} p_0(x_i; \boldsymbol{\theta}_{\text{prior}}).$$

EM maximizes the marginal likelihood $p(\mathbf{y}; \boldsymbol{\theta})$ using GAMP's posterior statistics as a surrogate for the intractable posterior $p(\mathbf{x} \mid \mathbf{y}; \boldsymbol{\theta})$.
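To make the model concrete, the following minimal sketch draws a scene from the Bernoulli-Gaussian prior and simulates the measurements. It uses a real-valued Gaussian stand-in for the complex $\mathcal{CN}$ model (a common simplification), and the sizes and hyperparameters mirror the Example later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 400, 200                    # scene voxels, measurements
rho, sx2, s2 = 0.10, 1.0, 0.01     # true (rho, sigma_x^2, sigma^2), unknown to EM-GAMP

# Bernoulli-Gaussian scene: each voxel is 0 w.p. 1-rho, Gaussian w.p. rho
support = rng.random(N) < rho
x_true = np.where(support, rng.normal(0.0, np.sqrt(sx2), N), 0.0)

# Random sensing matrix (roughly unit-norm columns) plus additive noise
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ x_true + rng.normal(0.0, np.sqrt(s2), M)
```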

EM-GAMP Algorithm

Complexity: $O(MN \cdot T_{\text{inner}} \cdot K_{\text{EM}})$, the same order as GAMP
Input: $\mathbf{y}$, $\mathbf{A}$, initial $\hat{\boldsymbol{\theta}}^{(0)}$
Output: Scene estimate $\hat{\mathbf{x}}$, hyperparameters $\hat{\boldsymbol{\theta}}$
for $k = 0, 1, 2, \ldots$ until convergence do
E-step (GAMP inner loop):
Run GAMP with fixed $\hat{\boldsymbol{\theta}}^{(k)}$ for $T_{\text{inner}}$ iterations:
- Initialize $\hat{\mathbf{x}}^0 = \mathbf{0}$, $\tau_x^0 = \hat{\rho}\,\hat{\sigma}_x^2$, $\hat{\mathbf{s}}^{-1} = \mathbf{0}$
- for $t = 0, \ldots, T_{\text{inner}}$ do
1. $\hat{p}_m^t \leftarrow \mathbf{a}_m^{\mathsf{H}}\hat{\mathbf{x}}^t - \tau_p^t\,\hat{s}_m^{t-1}$; $\quad \tau_p^t \leftarrow \tau_x^t\,N\bar{a}^2/M$, where $\bar{a}^2 \triangleq \|\mathbf{A}\|_F^2/N$ is the average squared column norm
2. $\hat{s}_m^t \leftarrow g_{\text{out}}(y_m, \hat{p}_m^t, \tau_p^t)$; $\quad \tau_s^t \leftarrow -\partial g_{\text{out}}/\partial\hat{p}$
3. $\tau_r^{t+1} \leftarrow (\bar{a}^2\,\tau_s^t)^{-1}$; $\quad \hat{r}_i^{t+1} \leftarrow \hat{x}_i^t + \tau_r^{t+1}(\mathbf{A}^{\mathsf{H}}\hat{\mathbf{s}}^t)_i$
4. $\hat{x}_i^{t+1} \leftarrow g_{\text{in}}(\hat{r}_i^{t+1}, \tau_r^{t+1})$; $\quad$ record $\tau_{x,i}^{t+1}$, $\pi_i^{t+1}$
M-step (closed-form updates):
$\hat{\sigma}^{2\,(k+1)} \leftarrow \frac{1}{M}\big[\|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2 + \sum_i \tau_{x,i}\|\mathbf{A}_{:,i}\|^2\big]$ (bias-corrected; see the theorem below)
$\hat{\rho}^{(k+1)} \leftarrow \frac{1}{N}\sum_i \pi_i$
$\hat{\sigma}_x^{2\,(k+1)} \leftarrow \frac{\sum_i \pi_i(|\hat{x}_i|^2 + \tau_{x,i})}{\sum_i \pi_i}$
end for

In practice, $T_{\text{inner}} = 1$ (interleaved EM) works well and is the most common choice. Interleaving avoids running GAMP to convergence at each EM step, which is expensive and unnecessary.
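The algorithm box treats $g_{\text{out}}$ and $g_{\text{in}}$ as black boxes. Below is one possible realization of both scalar functions for this model, assuming the AWGN likelihood and the Bernoulli-Gaussian prior above; the function names and the real-valued simplification are ours, not from any library.

```python
import numpy as np

def g_out_awgn(y, phat, tau_p, s2):
    """AWGN output function: returns s_hat and tau_s = -d g_out / d phat."""
    return (y - phat) / (tau_p + s2), 1.0 / (tau_p + s2)

def g_in_bg(rhat, tau_r, rho, sx2):
    """Bernoulli-Gaussian input denoiser for r = x + N(0, tau_r).
    Returns posterior mean x_hat, variance tau_x, and inclusion prob pi."""
    v = sx2 + tau_r
    log_odds = (np.log(rho / (1 - rho)) + 0.5 * np.log(tau_r / v)
                + 0.5 * rhat**2 * (1.0 / tau_r - 1.0 / v))
    pi = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -30.0, 30.0)))
    m1 = (sx2 / v) * rhat            # E[x | r, active]
    v1 = (sx2 * tau_r) / v           # var[x | r, active]
    xhat = pi * m1
    tau_x = pi * (v1 + m1**2) - xhat**2
    return xhat, tau_x, pi
```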

Theorem: Closed-Form EM M-Step Updates

Let $\hat{\mathbf{x}} = (\hat{x}_1, \ldots, \hat{x}_N)$, $\boldsymbol{\tau}_x = (\tau_{x,1}, \ldots, \tau_{x,N})$, and $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_N)$ be the posterior statistics from the GAMP E-step. Then the M-step maximizes:

$$Q(\boldsymbol{\theta} \mid \hat{\boldsymbol{\theta}}^{(k)}) \triangleq \mathbb{E}_{q(\mathbf{x})}\left[\log p(\mathbf{y}, \mathbf{x}; \boldsymbol{\theta})\right]$$

under the factored Gaussian approximation $q(\mathbf{x}) = \prod_i \mathcal{CN}(x_i; \hat{x}_i, \tau_{x,i})$. The closed-form solutions are:

$$\hat{\sigma}^{2\,(k+1)} = \frac{1}{M}\left[\|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2 + \sum_i \tau_{x,i}\,\|\mathbf{A}_{:,i}\|^2\right],$$

$$\hat{\rho}^{(k+1)} = \frac{1}{N}\sum_{i=1}^N \pi_i, \qquad \hat{\sigma}_x^{2\,(k+1)} = \frac{\sum_i \pi_i\,(|\hat{x}_i|^2 + \tau_{x,i})}{\sum_i \pi_i}.$$

The noise variance update is a bias-corrected residual: the raw squared residual $\|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2$ underestimates the true noise power because GAMP's estimate $\hat{\mathbf{x}}$ is not exactly the signal. The correction term $\sum_i \tau_{x,i}\|\mathbf{A}_{:,i}\|^2$ accounts for the remaining uncertainty in $\hat{\mathbf{x}}$.
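As a sketch, the M-step can be transcribed almost line-for-line from these formulas; the function below assumes the E-step statistics are available as NumPy arrays, and its name is ours.

```python
import numpy as np

def em_m_step(y, A, xhat, tau_x, pi):
    """Closed-form M-step, transcribed directly from the theorem.
    xhat, tau_x, pi: posterior statistics recorded by the GAMP E-step."""
    M = len(y)
    resid = y - A @ xhat
    col2 = np.sum(np.abs(A)**2, axis=0)             # ||A_{:,i}||^2
    s2 = (np.linalg.norm(resid)**2 + np.sum(tau_x * col2)) / M
    rho = float(np.mean(pi))
    sx2 = np.sum(pi * (np.abs(xhat)**2 + tau_x)) / np.sum(pi)
    return s2, rho, sx2
```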


EM-GAMP Hyperparameter Convergence

Traces of the estimated $\hat{\sigma}^2$, $\hat{\rho}$, and $\hat{\sigma}_x^2$ versus EM outer iteration (left axis), with the NMSE (dB) on the right axis. Dashed horizontal lines mark the true parameter values.

Observe: EM converges within 10–20 iterations to near-oracle NMSE regardless of initialization. When all three hyperparameters are estimated jointly, the noise and sparsity estimates interact; at low SNR they can trade off against each other.


Example: EM-GAMP for Sparse RF Scene with Unknown Parameters

Setup: $N = 400$ scene voxels, $M = 200$ measurements ($\delta = 0.5$), Bernoulli-Gaussian prior with $\rho = 0.10$, $\sigma_x^2 = 1.0$, $\sigma^2 = 0.01$ (SNR $\approx 20$ dB).

Initial guesses: $\hat{\rho}^{(0)} = 0.5$, $\hat{\sigma}_x^{2\,(0)} = 0.5$, $\hat{\sigma}^{2\,(0)} = 0.5$; the sparsity and noise variance are overestimated by $5\times$ and $50\times$, while the signal variance is underestimated by $2\times$.

After running EM-GAMP for 25 outer iterations (with one GAMP step per EM step), report the estimated parameters and the reconstruction NMSE. A minimal implementation sketch follows.
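The end-to-end sketch below implements interleaved EM-GAMP for this setup, under a few stated assumptions: real-valued data instead of complex, scalar (uniform) GAMP variances, and 100 interleaved iterations rather than the 25 quoted above, to be conservative. All names are ours, and exact numbers will vary with the random seed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground truth, matching the Example setup
N, M = 400, 200
rho_t, sx2_t, s2_t = 0.10, 1.0, 0.01
x_true = np.where(rng.random(N) < rho_t,
                  rng.normal(0.0, np.sqrt(sx2_t), N), 0.0)
A = rng.standard_normal((M, N)) / np.sqrt(M)    # ~unit-norm columns
y = A @ x_true + rng.normal(0.0, np.sqrt(s2_t), M)

# Deliberately wrong initial hyperparameters (as in the prompt)
rho, sx2, s2 = 0.5, 0.5, 0.5

abar2 = np.sum(A**2) / N             # average squared column norm a_bar^2
col2 = np.sum(A**2, axis=0)          # ||A_{:,i}||^2 for the M-step
xhat, shat = np.zeros(N), np.zeros(M)
tau_x = rho * sx2                    # scalar prior variance to start

for k in range(100):                 # interleaved EM: one GAMP step per EM step
    # --- E-step: one GAMP iteration ---
    tau_p = tau_x * N * abar2 / M
    phat = A @ xhat - tau_p * shat                   # Onsager correction
    shat = (y - phat) / (tau_p + s2)                 # g_out for AWGN
    tau_s = 1.0 / (tau_p + s2)
    tau_r = 1.0 / (abar2 * tau_s)
    rhat = xhat + tau_r * (A.T @ shat)
    # Bernoulli-Gaussian denoiser g_in, elementwise
    v = sx2 + tau_r
    lo = (np.log(rho / (1 - rho)) + 0.5 * np.log(tau_r / v)
          + 0.5 * rhat**2 * (1.0 / tau_r - 1.0 / v))
    pi = 1.0 / (1.0 + np.exp(-np.clip(lo, -30.0, 30.0)))
    m1 = (sx2 / v) * rhat                            # E[x | r, active]
    v1 = (sx2 * tau_r) / v                           # var[x | r, active]
    xhat = pi * m1
    tau_xi = pi * (v1 + m1**2) - xhat**2
    tau_x = float(np.mean(tau_xi))
    # --- M-step: closed-form updates ---
    resid = y - A @ xhat
    s2 = (resid @ resid + np.sum(tau_xi * col2)) / M  # bias-corrected
    rho = float(np.clip(np.mean(pi), 0.01, 0.5))      # physical bounds
    # m1, v1 (active-component moments from g_in) play the role of
    # x_hat_i, tau_x_i in the theorem's sigma_x^2 update
    sx2 = np.sum(pi * (m1**2 + v1)) / max(np.sum(pi), 1e-12)

nmse = 10 * np.log10(np.sum((xhat - x_true)**2) / np.sum(x_true**2))
print(f"rho={rho:.3f}  sx2={sx2:.3f}  s2={s2:.4f}  NMSE={nmse:.1f} dB")
```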

Common Mistake: EM Can Converge to Local Optima at Low SNR

Mistake:

The EM objective function $\log p(\mathbf{y}; \boldsymbol{\theta})$ is generally non-convex in $\boldsymbol{\theta}$. Multiple distinct parameter settings can explain the same data:

  • High noise + low sparsity (many small scatterers) vs.
  • Low noise + high sparsity (few strong scatterers).

At SNR $< 5$ dB, EM may converge to the wrong mode, producing a poor estimate.

Correction:

Practical remedies: (1) Multiple restarts: run EM from 3–5 random initializations and select the solution with the highest marginal likelihood. (2) Warm start: initialize $\hat{\sigma}^2$ from the residual energy of a simple matched-filter estimate (see the sketch below). (3) Physical constraints: bound $\rho \in [0.01, 0.5]$ and $\sigma^2 > 0$.
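One possible warm-start heuristic for remedy (2) is sketched below; the 10% support fraction and the hard-thresholded matched filter are illustrative choices of ours, not prescribed by the text.

```python
import numpy as np

def warm_start(y, A, keep_frac=0.10, rho_bounds=(0.01, 0.5)):
    """Initialize (rho, sx2, s2) from a crude matched-filter estimate."""
    N = A.shape[1]
    col2 = np.sum(A**2, axis=0)
    xmf = (A.T @ y) / col2                     # per-voxel matched filter
    k = max(1, int(keep_frac * N))             # keep the strongest voxels
    idx = np.argsort(np.abs(xmf))[-k:]
    x0 = np.zeros(N)
    x0[idx] = xmf[idx]
    resid = y - A @ x0
    s2_0 = float(resid @ resid) / len(y)       # residual energy -> noise floor
    rho_0 = float(np.clip(k / N, *rho_bounds))
    sx2_0 = float(np.mean(xmf[idx]**2))        # energy of retained voxels
    return rho_0, sx2_0, s2_0
```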

Common Mistake: Running Full GAMP to Convergence at Each EM Step Is Wasteful

Mistake:

A common implementation error is to run GAMP to full convergence at each EM outer iteration before updating $\boldsymbol{\theta}$. This is correct in theory but wasteful in practice: at early EM iterations, GAMP iterates in the wrong parameter regime.

Correction:

Use interleaved EM: one GAMP iteration per EM update. This is a generalized (partial) EM scheme that in practice converges at essentially the same rate as full EM for well-behaved problems. The implementation simply nests the M-step inside the GAMP loop, as in the Example sketch above.

⚠️ Engineering Note

EM-GAMP Calibration in Practice

In real RF imaging systems, the noise variance $\sigma^2$ varies with:

  • Transmit power and propagation path loss (accounted for in the link budget).
  • Receiver noise figure (typically 3–10 dB above thermal noise floor).
  • Clutter — returns from unwanted scene elements not in the model.

The sparsity rate $\rho$ depends on the scene type:

  • Urban scenes: $\rho \approx 0.05$–$0.15$.
  • Cluttered environments: $\rho \approx 0.2$–$0.4$.

EM-GAMP adapts to these variations automatically, but its adaptation rate is limited: for rapidly varying clutter, the single-snapshot EM assumption fails. A sliding-window EM that averages M-step updates across consecutive frames improves stability; one such scheme is sketched below.
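A minimal sketch of such a sliding-window scheme, assuming an exponential forgetting factor (the text does not specify a window shape):

```python
# Exponentially weighted M-step across frames: blend each new frame's
# closed-form update into a running estimate instead of replacing it.
def sliding_m_step(theta_run, theta_frame, beta=0.9):
    """theta_* = (rho, sx2, s2); beta near 1 means slow adaptation."""
    return tuple(beta * r + (1.0 - beta) * f
                 for r, f in zip(theta_run, theta_frame))

# Usage (with em_m_step from the theorem sketch above):
#   theta = sliding_m_step(theta, em_m_step(y_frame, A, xhat, tau_x, pi))
```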

Practical Constraints
  • Noise floor estimation requires at least $M > 2N\rho$ measurements (well-posed EM)

  • EM convergence typically requires 15–30 outer iterations for SNR > 10 dB

  • Interleaved EM adds < 5% overhead to GAMP runtime (the M-step is $O(N)$)

EM Objective Landscape: $\log p(\mathbf{y}; \rho, \sigma^2)$

Contour plot of the approximate marginal log-likelihood as a function of sparsity $\rho$ and noise variance $\sigma^2$, with the true parameter marked by a star. The landscape is generally unimodal at high SNR but develops spurious local maxima at low SNR.

Observe how the objective elongates into a ridge (the noise-sparsity tradeoff) as the SNR decreases.


Quick Check

In EM-GAMP, the M-step update for $\hat{\rho}$ (sparsity) is:

  • $\frac{1}{N}\sum_i \hat{x}_i^2$

  • $\frac{1}{N}\sum_i \pi_i$

  • $\|\hat{\mathbf{x}}\|_0 / N$

  • $\frac{1}{M}\|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2$

Key Takeaway

EM-GAMP wraps GAMP in an EM loop: the E-step runs GAMP to compute posterior means $\hat{x}_i$, variances $\tau_{x,i}$, and inclusion probabilities $\pi_i$; the M-step updates $\sigma^2$, $\rho$, and $\sigma_x^2$ in closed form. Interleaving (one GAMP iteration per EM step) is efficient and achieves near-oracle NMSE, typically within 0.1 dB, from arbitrary initialization, eliminating manual tuning entirely.

🎓 CommIT Contribution (2014)

EM-Based Hyperparameter Estimation for RF Imaging

P. Schniter, S. Rangan, G. Caire, IEEE Transactions on Signal Processing, vol. 63, no. 4, pp. 1043–1055

The parametric GAMP framework, of which EM-GAMP is the principal instance, was developed to provide consistent hyperparameter estimates alongside the signal estimate. The analysis shows that in the large-system limit ($M, N \to \infty$ with $M/N \to \delta$), the EM-GAMP parameter estimates converge to the true values whenever the true parameters are identifiable. This result provides the theoretical foundation for the self-tuning RF imaging pipeline described in this chapter.
