Chapter 3 Summary: Bayesian Inverse Problems

Key Points

  1. Bayes' theorem combines the likelihood $p(\mathbf{y} \mid \boldsymbol{\gamma})$ and the prior $\pi(\boldsymbol{\gamma})$ to yield the full posterior $p(\boldsymbol{\gamma} \mid \mathbf{y})$, the complete solution to the Bayesian inverse problem. The MAP estimate coincides with variational regularization, and the MMSE estimate is the posterior mean (Sketch 1 below).

  2. For a Gaussian prior $\boldsymbol{\gamma} \sim \mathcal{N}(\mathbf{0}, \mathbf{\Gamma})$ with Gaussian noise, the posterior is Gaussian with mean $(\sigma^{-2}\mathbf{A}^H\mathbf{A} + \mathbf{\Gamma}^{-1})^{-1}\sigma^{-2}\mathbf{A}^H\mathbf{y}$. For $\mathbf{\Gamma} = \gamma^2\mathbf{I}$ this is exactly the Tikhonov solution with $\lambda = \sigma^2/\gamma^2$, so the regularization parameter acquires a clear probabilistic interpretation as the noise-to-signal variance ratio (Sketch 2 below).

  3. Sparsity-promoting priors assign high prior probability to sparse scenes and produce sparser reconstructions than Gaussian priors (Sketch 3 below): the Laplace prior (whose MAP estimate is the LASSO), the Bernoulli-Gaussian prior (the standard RF scene model), the spike-and-slab prior (ideal but intractable), and the horseshoe prior (adaptive, near-minimax).

  4. Sparse Bayesian Learning (SBL) uses automatic relevance determination (ARD): per-component precisions $\{\alpha_i\}$ updated via EM. The algorithm automatically drives irrelevant $\alpha_i \to \infty$ (pruning those components) without pre-specifying the sparsity level $k$, and it provides posterior uncertainty on the active support that the LASSO cannot (Sketch 4 below).

  5. Gaussian measures on Hilbert spaces provide discretization-invariant priors via trace-class covariance operators (Sketch 5 below). The Cameron-Martin theorem characterizes which MAP estimates live in function space, and Stuart's theorem guarantees existence, uniqueness, and stability of the posterior as a measure, the Bayesian analogue of Tikhonov well-posedness.

  6. Posterior credible intervals and variance maps quantify reconstruction confidence (Sketch 6 below). The posterior contracts at the minimax-optimal rate $O(\sigma^{2\beta/(2\beta+1)})$ when the prior regularity matches that of the truth.

  7. The pCN sampler achieves dimension-independent MCMC acceptance rates by proposing in the Cameron-Martin space, while HMC scales as $O(n^{1/4})$ by exploiting gradients (Sketch 7 below). Calibration checks (empirical coverage versus nominal level) are mandatory before deploying UQ in safety-critical applications.
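
Sketch 1: a minimal grid-based illustration of Key Point 1. The scalar forward model $y = a\gamma + \text{noise}$, the standard Gaussian prior, and all numerical values are illustrative choices, not taken from the chapter; because the example is Gaussian-Gaussian, the MAP and MMSE estimates coincide.

```python
import numpy as np

# Scalar forward model y = a*gamma + noise; all values illustrative.
a, sigma = 2.0, 0.5
rng = np.random.default_rng(0)
y = a * 0.8 + sigma * rng.standard_normal()

grid = np.linspace(-3.0, 3.0, 2001)
dx = grid[1] - grid[0]
likelihood = np.exp(-0.5 * ((y - a * grid) / sigma) ** 2)  # p(y | gamma)
prior = np.exp(-0.5 * grid ** 2)                           # pi(gamma) = N(0, 1)
posterior = likelihood * prior                             # Bayes' theorem (unnormalized)
posterior /= posterior.sum() * dx                          # normalize on the grid

gamma_map = grid[np.argmax(posterior)]        # MAP: the posterior mode
gamma_mmse = (grid * posterior).sum() * dx    # MMSE: the posterior mean
print(gamma_map, gamma_mmse)  # nearly equal: a Gaussian posterior is symmetric
```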
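
Sketch 2: a NumPy check of Key Point 2, under the assumption $\mathbf{\Gamma} = \gamma^2\mathbf{I}$ and with a small dense complex $\mathbf{A}$ of arbitrary size, that the Gaussian posterior mean equals the Tikhonov solution with $\lambda = \sigma^2/\gamma^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 30, 20
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
sigma2, gamma2 = 0.01, 1.0                        # noise and prior variances
x = rng.standard_normal(n)
noise = (rng.standard_normal(m) + 1j * rng.standard_normal(m)) * np.sqrt(sigma2 / 2)
y = A @ x + noise

AH = A.conj().T                                   # A^H
# Posterior mean: (sigma^-2 A^H A + Gamma^-1)^-1 sigma^-2 A^H y, with Gamma = gamma2*I
post_mean = np.linalg.solve(AH @ A / sigma2 + np.eye(n) / gamma2, AH @ y / sigma2)
# Tikhonov solution with lambda = sigma^2 / gamma^2
lam = sigma2 / gamma2
tikhonov = np.linalg.solve(AH @ A + lam * np.eye(n), AH @ y)
print(np.allclose(post_mean, tikhonov))           # True: the two coincide exactly
```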
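
Sketch 3: the Laplace-prior half of Key Point 3. The MAP estimate under a Laplace prior is the LASSO, so a minimal ISTA loop (gradient step plus soft-thresholding, the proximal map of the $\ell_1$ penalty) recovers a sparse scene. Problem sizes and the value of the weight `lam` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k = 40, 100, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.standard_normal(k) + np.sign(rng.standard_normal(k))
y = A @ x_true + 0.01 * rng.standard_normal(m)

lam = 0.02                                 # Laplace-prior MAP <=> LASSO with this weight
L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the data-fit gradient
x = np.zeros(n)
for _ in range(1000):                      # ISTA iterations
    z = x - A.T @ (A @ x - y) / L          # gradient step on 0.5*||Ax - y||^2
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
print(sorted(np.flatnonzero(np.abs(x) > 1e-3)), sorted(support))  # supports should match
```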
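
Sketch 4: a bare-bones SBL/ARD loop for Key Point 4, with EM updates for the per-component precisions $\{\alpha_i\}$ and the noise variance assumed known. The pruning threshold and problem sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 40, 80
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[5, 17, 42]] = [1.0, -2.0, 1.5]
sigma2 = 1e-3                                    # noise variance, assumed known here
y = A @ x_true + np.sqrt(sigma2) * rng.standard_normal(m)

alpha = np.ones(n)                               # ARD precisions, one per component
for _ in range(200):
    # E-step: Gaussian posterior over x given the current precisions
    Sigma = np.linalg.inv(A.T @ A / sigma2 + np.diag(alpha))
    mu = Sigma @ A.T @ y / sigma2
    # M-step (EM update): alpha_i = 1 / E[x_i^2]; irrelevant alpha_i grow without bound
    alpha = 1.0 / (mu ** 2 + np.diag(Sigma))
active = np.flatnonzero(alpha < 1e4)             # prune components with huge precision
print(active, np.round(mu[active], 2))           # expect support {5, 17, 42}
```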
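
Sketch 5: discretization invariance (Key Point 5) made concrete with Karhunen-Loeve draws from the Gaussian measure $\mathcal{N}(0, (-\Delta)^{-s})$ on $(0,1)$ with Dirichlet boundary conditions, whose covariance is trace class whenever $2s > 1$. The sine eigenbasis is standard; the exponent, mode count, and grids are illustrative.

```python
import numpy as np

def sample_prior(n_grid, s=2.0, n_modes=200, seed=0):
    """KL draw from N(0, (-Laplacian)^{-s}) on (0,1), Dirichlet BCs.

    Eigenpairs: phi_k(t) = sqrt(2)*sin(k*pi*t), lambda_k = (k*pi)^(-2s);
    the covariance is trace class (a valid Gaussian measure) iff 2s > 1.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_grid)
    k = np.arange(1, n_modes + 1)
    xi = rng.standard_normal(n_modes)              # iid N(0,1) KL coefficients
    u = np.sqrt(2) * np.sin(np.outer(t, k * np.pi)) @ (xi * (k * np.pi) ** (-s))
    return t, u

# Same seed at two resolutions: the coarse draw is a subsample of the fine one,
# because the prior is a measure on function space, not on any fixed grid.
_, u_coarse = sample_prior(65)
_, u_fine = sample_prior(1025)
print(np.allclose(u_coarse, u_fine[::16]))         # True
```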
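
Sketch 6: pointwise 95% credible intervals and a per-component posterior variance map for Key Point 6, reusing the Gaussian-posterior formulas of Sketch 2 (real-valued here for simplicity). With a correctly specified prior, the empirical coverage of the truth should sit near the nominal 95%.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 30, 50                                  # underdetermined: uncertainty varies
A = rng.standard_normal((m, n)) / np.sqrt(m)
sigma2, gamma2 = 0.0025, 1.0
x_true = rng.standard_normal(n)                # drawn from the prior: well specified
y = A @ x_true + np.sqrt(sigma2) * rng.standard_normal(m)

Sigma = np.linalg.inv(A.T @ A / sigma2 + np.eye(n) / gamma2)   # posterior covariance
mu = Sigma @ A.T @ y / sigma2                                  # posterior mean
std = np.sqrt(np.diag(Sigma))                  # "variance map": per-component std

lo, hi = mu - 1.96 * std, mu + 1.96 * std      # pointwise 95% credible intervals
coverage = np.mean((x_true >= lo) & (x_true <= hi))
print(f"empirical coverage: {coverage:.2f}")   # should land near 0.95
```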
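
Sketch 7: a minimal pCN chain for Key Point 7, written in whitened coordinates so the prior is $\mathcal{N}(\mathbf{0}, \mathbf{I})$. The proposal $v = \sqrt{1-\beta^2}\,u + \beta\,\xi$ preserves the prior, so the accept ratio involves only the data misfit $\Phi$. Step size, dimensions, and iteration counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 20, 100
A = rng.standard_normal((m, n)) / np.sqrt(m)
sigma = 0.1
y = A @ rng.standard_normal(n) + sigma * rng.standard_normal(m)

def misfit(u):                                   # Phi(u): negative log-likelihood
    r = A @ u - y
    return 0.5 * r @ r / sigma**2

beta = 0.2                                       # pCN step size
u, phi_u, accepted = np.zeros(n), misfit(np.zeros(n)), 0
for _ in range(5000):
    v = np.sqrt(1 - beta**2) * u + beta * rng.standard_normal(n)  # prior-preserving
    phi_v = misfit(v)
    if np.log(rng.uniform()) < phi_u - phi_v:    # accept prob. min(1, exp(Phi(u)-Phi(v)))
        u, phi_u, accepted = v, phi_v, accepted + 1
print("acceptance rate:", accepted / 5000)
```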

Looking Ahead

Chapter 4 turns from probabilistic formulations to efficient algorithms: given the Bayesian and variational problems formulated in Chapters 2-3, how do we compute MAP estimates efficiently for $n \sim 10^5$ unknowns?

  • Fast algorithms for structured operators: the NUFFT, Kronecker products, and FFT-based convolutions that make imaging-scale optimization tractable (a small FFT-convolution sketch follows this list).
  • GPU acceleration: CuPy and PyTorch forward/adjoint implementations enabling ISTA, FISTA, and ADMM at scale.
  • Automatic differentiation: computing Jacobians through iterative solvers, enabling gradient-based hyperparameter optimization and the deep-unfolding algorithms of the section "Computed Tomography: The Canonical Inverse Problem".
  • Convergence diagnostics: primal residuals, dual residuals, the discrepancy principle, and warm-starting strategies for practical imaging solvers.
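
As a small preview of the structured-operator theme, here is a sketch (assuming periodic boundary conditions and a hypothetical box PSF) of an FFT-based convolution forward/adjoint pair, together with the adjoint test $\langle \mathbf{A}\mathbf{x}, \mathbf{r}\rangle = \langle \mathbf{x}, \mathbf{A}^H\mathbf{r}\rangle$ that any fast operator implementation should pass.

```python
import numpy as np

def conv_forward(x, psf_fft):
    """Circular (periodic) blur via FFT: O(n log n) versus O(n^2) for a dense matrix."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * psf_fft))

def conv_adjoint(r, psf_fft):
    """Adjoint of the blur: multiply by the conjugate transfer function."""
    return np.real(np.fft.ifft2(np.fft.fft2(r) * np.conj(psf_fft)))

rng = np.random.default_rng(6)
x = rng.standard_normal((64, 64))
r = rng.standard_normal((64, 64))
psf = np.zeros((64, 64))
psf[:3, :3] = 1.0 / 9.0                      # small box blur (illustrative PSF)
psf_fft = np.fft.fft2(psf)

# Adjoint test: <A x, r> == <x, A^H r> up to floating-point roundoff
print(np.allclose(np.vdot(conv_forward(x, psf_fft), r),
                  np.vdot(x, conv_adjoint(r, psf_fft))))   # True
```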