Exercises

ex22-01

Easy

Let $p(\mathbf{x}) = \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. Compute the score function $\nabla_\mathbf{x}\log p(\mathbf{x})$.
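A derived score can be sanity-checked numerically. The sketch below is one possible check, not a reference solution: it differentiates the Gaussian log-density (up to its additive constant) by central finite differences. The helper names `log_density_unnormalized` and `numerical_score`, the dimension, and the random test point are illustrative assumptions.

```python
import numpy as np

def log_density_unnormalized(x, mu, Sigma):
    """Gaussian log-density up to an additive constant (the constant has zero gradient)."""
    d = x - mu
    return -0.5 * d @ np.linalg.solve(Sigma, d)

def numerical_score(x, mu, Sigma, eps=1e-5):
    """Central finite-difference approximation of the gradient of log p at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (log_density_unnormalized(x + e, mu, Sigma)
                   - log_density_unnormalized(x - e, mu, Sigma)) / (2 * eps)
    return grad

# Hypothetical test case: random symmetric positive-definite covariance in 3 dimensions
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
Sigma = B @ B.T + np.eye(3)
mu = rng.standard_normal(3)
x = rng.standard_normal(3)

score_fd = numerical_score(x, mu, Sigma)
print(score_fd)
```

Comparing `score_fd` against a closed-form expression evaluated at the same `x` should give agreement to several digits.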

ex22-02

Easy

In the DDPM forward process, compute the signal-to-noise ratio $\text{SNR}(t) = \bar{\alpha}_t / (1 - \bar{\alpha}_t)$ for the linear schedule $\beta_t = \beta_{\min} + (\beta_{\max}-\beta_{\min})\frac{t-1}{T-1}$ with $\beta_{\min} = 10^{-4}$, $\beta_{\max} = 0.02$, $T = 1000$. At which step $t$ does $\text{SNR}(t) = 1$ (0 dB)?
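The crossing step can also be located numerically. The following is a minimal sketch, assuming the usual DDPM convention $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s \le t}\alpha_s$ (neither is restated in this exercise); the function name `linear_schedule_snr` is illustrative.

```python
import numpy as np

def linear_schedule_snr(T=1000, beta_min=1e-4, beta_max=0.02):
    """Return SNR(t) for t = 1..T under the linear beta schedule."""
    t = np.arange(1, T + 1)
    betas = beta_min + (beta_max - beta_min) * (t - 1) / (T - 1)
    # Assumption: alpha_t = 1 - beta_t, alpha_bar_t = cumulative product of alpha_t
    alpha_bar = np.cumprod(1.0 - betas)
    return alpha_bar / (1.0 - alpha_bar)

snr = linear_schedule_snr()
# First step (1-indexed) at which the SNR has dropped to 1 (0 dB) or below
t_cross = int(np.argmax(snr <= 1.0)) + 1
print(t_cross, 10 * np.log10(snr[t_cross - 1]))
```

Since $t$ is discrete, the SNR will generally not equal 1 exactly at any step; the sketch reports the first step at or below 0 dB.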

ex22-03

Easy

Verify Tweedie's formula for the case $\mathbf{x}_0 \sim \mathcal{N}(\boldsymbol{\mu}, \sigma_0^2\mathbf{I})$ (non-standard Gaussian). Show that $\hat{\mathbf{x}}_0 = \mathbb{E}[\mathbf{x}_0 \mid \mathbf{x}_t]$ matches the Tweedie prediction.

ex22-04

Easy

For DPS with guidance scale $\zeta$ and measurement noise variance $\sigma^2_{n}$, what is the effective regularisation parameter in terms of the measurement residual? Compare with the proximal operator in PnP-ADMM (Chapter 21).

ex22-05

Easy

Name three advantages and three disadvantages of diffusion-based reconstruction compared to PnP methods for RF imaging.

ex22-06

Medium

Derive the DPS guidance gradient for the nonlinear forward model $\mathbf{y} = f(\mathbf{x}_0) + \mathbf{n}$ where $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma^2_{n}\mathbf{I})$ and $f$ is a differentiable function.

ex22-07

Medium

Consider a 1D compressed sensing problem with $\mathbf{A} \in \mathbb{R}^{m \times n}$ where $m = n/4$ (75% undersampling). The SVD gives $r = m$ nonzero singular values. What fraction of the reconstruction is determined by the measurements, and what fraction must be filled by the diffusion prior?

ex22-08

Medium

Derive the DDNM correction formula:

$$\hat{\mathbf{x}}_0^{\text{DDNM}} = \hat{\mathbf{x}}_0 - \mathbf{A}^\dagger(\mathbf{A}\hat{\mathbf{x}}_0 - \mathbf{y})$$

and show that it satisfies $\mathbf{A}\hat{\mathbf{x}}_0^{\text{DDNM}} = \mathbf{y}$ when $\mathbf{A}$ has full row rank.
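A quick numerical check of the consistency property can accompany the proof. This is a sketch under stated assumptions: a random real Gaussian $\mathbf{A}$ (full row rank with probability 1), noiseless measurements, and a random vector standing in for the denoiser output $\hat{\mathbf{x}}_0$. All sizes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 32
A = rng.standard_normal((m, n))      # wide matrix, full row rank with probability 1
x_true = rng.standard_normal(n)
y = A @ x_true                       # noiseless measurements
x0_hat = rng.standard_normal(n)      # stand-in for the denoiser output

A_pinv = np.linalg.pinv(A)
x0_ddnm = x0_hat - A_pinv @ (A @ x0_hat - y)

# Measurement consistency: A x0_ddnm should reproduce y
residual = np.linalg.norm(A @ x0_ddnm - y)

# The correction only changes the range-space part; the null-space part of x0_hat is kept
P_null = np.eye(n) - A_pinv @ A
null_diff = np.linalg.norm(P_null @ (x0_ddnm - x0_hat))
print(residual, null_diff)
```

Both printed norms should sit at floating-point round-off level, which reflects the identity $\mathbf{A}\mathbf{A}^\dagger = \mathbf{I}$ for full row rank.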

ex22-09

Medium

Compare the per-step computational cost of DPS (with backpropagation) and DDRM (with SVD-based projection) for a forward model $\mathbf{A} \in \mathbb{R}^{m \times n}$. Under what conditions is DDRM more efficient per step?

ex22-10

Medium

In DiffPIR, the proximal data step has the closed-form solution (for $\mathbf{A}^{H}\mathbf{A} + \rho\mathbf{I}$ invertible):

$$\mathbf{z}_k = (\mathbf{A}^{H}\mathbf{A} + \rho\mathbf{I})^{-1}(\mathbf{A}^{H}\mathbf{y} + \rho\hat{\mathbf{x}}_k).$$

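Show that as $\rho \to 0$, this reduces to the pseudoinverse solution $\mathbf{A}^\dagger\mathbf{y}$ when $\mathbf{A}$ has full column rank (if $\mathbf{A}$ has a nontrivial null space, the null-space component of $\hat{\mathbf{x}}_k$ survives and the limit is $\mathbf{A}^\dagger\mathbf{y} + (\mathbf{I} - \mathbf{A}^\dagger\mathbf{A})\hat{\mathbf{x}}_k$), and as $\rho \to \infty$, it reduces to $\hat{\mathbf{x}}_k$ (pure prior).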
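The two limits can be probed numerically before attempting the proof. The sketch below assumes a tall complex Gaussian $\mathbf{A}$ (full column rank with probability 1) and uses very small and very large values of $\rho$ as proxies for the limits. The function name `diffpir_data_step` and all sizes are illustrative.

```python
import numpy as np

def diffpir_data_step(A, y, x_hat, rho):
    """Closed-form proximal data step: (A^H A + rho I)^{-1} (A^H y + rho x_hat)."""
    n = A.shape[1]
    AH = A.conj().T
    return np.linalg.solve(AH @ A + rho * np.eye(n), AH @ y + rho * x_hat)

rng = np.random.default_rng(0)
m, n = 40, 10                        # tall A: full column rank with probability 1
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)
x_hat = rng.standard_normal(n) + 1j * rng.standard_normal(n)

z_small = diffpir_data_step(A, y, x_hat, rho=1e-10)   # proxy for rho -> 0
z_large = diffpir_data_step(A, y, x_hat, rho=1e10)    # proxy for rho -> infinity

gap_pinv = np.linalg.norm(z_small - np.linalg.pinv(A) @ y)
gap_prior = np.linalg.norm(z_large - x_hat)
print(gap_pinv, gap_prior)
```

Both gaps should be small compared with the norms of the vectors involved. Swapping in a wide $\mathbf{A}$ is a useful follow-up experiment for seeing how the small-$\rho$ behaviour changes when a null space is present.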

ex22-11

Medium

A DDIM sampler with $S$ uniformly spaced steps uses the time subsequence $\tau_i = T \cdot i/S$ for $i = 0, 1, \ldots, S$. For the linear schedule $\beta_t = \beta_{\min} + (\beta_{\max}-\beta_{\min})\frac{t-1}{T-1}$, show that each DDIM step covers approximately the same change in log-SNR, $\Delta\log\text{SNR} \approx -\log(\text{SNR}(0)/\text{SNR}(T))/S$.

ex22-12

Medium

Show that in the noiseless limit ($\sigma^2_{n} \to 0$) the DPS guidance gradient acts as a hard projection, in the sense that

$$\lim_{\sigma^2_{n} \to 0} \frac{\zeta}{\sigma^2_{n}}\nabla_{\mathbf{x}_t}\frac{1}{2}\|\mathbf{A}\hat{\mathbf{x}}_0 - \mathbf{y}\|^2$$

diverges unless $\mathbf{A}\hat{\mathbf{x}}_0 = \mathbf{y}$ exactly. What does this imply for the choice of $\zeta$ at different noise levels?

ex22-13

Hard

Derive the $\Pi$GDM likelihood approximation. Starting from the Gaussian approximation $p(\mathbf{x}_0 \mid \mathbf{x}_t) \approx \mathcal{N}(\hat{\mathbf{x}}_0, r_t^2\mathbf{I})$, show that the marginal likelihood is:

$$p(\mathbf{y} \mid \mathbf{x}_t) \approx \mathcal{N}(\mathbf{A}\hat{\mathbf{x}}_0,\; \sigma^2_{n}\mathbf{I} + r_t^2\mathbf{A}\mathbf{A}^{H})$$

and derive the corresponding guidance gradient.
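The stated mean and covariance can be checked by Monte Carlo before deriving them. This sketch assumes a real-valued $\mathbf{A}$ (so $\mathbf{A}^{H} = \mathbf{A}^{T}$), the linear Gaussian measurement model $\mathbf{y} = \mathbf{A}\mathbf{x}_0 + \mathbf{n}$, and arbitrary illustrative values for $r_t$, $\sigma_n$, and the problem sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, N = 3, 6, 200_000              # N Monte Carlo samples
A = rng.standard_normal((m, n))      # real-valued stand-in for the forward operator
x0_hat = rng.standard_normal(n)      # stand-in for the denoiser output
r_t, sigma_n = 0.7, 0.3              # illustrative values

# Sample x0 ~ N(x0_hat, r_t^2 I), then y = A x0 + n with n ~ N(0, sigma_n^2 I)
x0 = x0_hat + r_t * rng.standard_normal((N, n))
y = x0 @ A.T + sigma_n * rng.standard_normal((N, m))

emp_mean = y.mean(axis=0)
emp_cov = np.cov(y, rowvar=False)

pred_mean = A @ x0_hat
pred_cov = sigma_n**2 * np.eye(m) + r_t**2 * A @ A.T

print(np.linalg.norm(emp_mean - pred_mean))
print(np.linalg.norm(emp_cov - pred_cov) / np.linalg.norm(pred_cov))
```

The empirical moments should approach the predicted ones as `N` grows, at the usual $1/\sqrt{N}$ Monte Carlo rate.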

ex22-14

Hard

Prove that the DDRM reconstruction is measurement-consistent: $\mathbf{A}\hat{\mathbf{x}} = \mathbf{y}$ for the noiseless case. Use the SVD $\mathbf{A} = \mathbf{U}_r\boldsymbol{\Sigma}_r\mathbf{V}_r^H$.

ex22-15

Hard

A radar system produces measurements $\mathbf{y} = \mathbf{A}\mathbf{c} + \mathbf{w}$ where $\mathbf{A} \in \mathbb{C}^{m \times n}$ with $m = 1000$, $n = 4000$, and $\text{rank}(\mathbf{A}) = 1000$. A DPS reconstruction with $S = 200$ DDIM steps and a U-Net with 100M parameters takes 60 seconds. Design an acceleration strategy to bring this to under 5 seconds while maintaining measurement consistency.

ex22-16

Hard

Derive the relationship between the DPS guidance gradient and the MAP estimator. Show that in the limit of infinitely many diffusion steps (continuous time), DPS with deterministic (DDIM) sampling converges to gradient descent on the MAP objective $-\log p(\mathbf{x}) + \frac{1}{2\sigma^2_{n}}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2$.

ex22-17

Hard

For a complex-valued SAR scene $\mathbf{c} \in \mathbb{C}^n$, a diffusion model is trained on the 2-channel representation $[\text{Re}(\mathbf{c}), \text{Im}(\mathbf{c})]$. Show that applying DPS with the linear forward model $\mathbf{y} = \mathbf{A}\mathbf{c} + \mathbf{w}$ (complex-valued) is equivalent to DPS on a real-valued system of twice the dimension.
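The core of the equivalence is that the stacked real operator reproduces both the complex forward model and the gradient of the data-fidelity term. The sketch below checks only this linear-algebra part, with the gradient taken with respect to the scene estimate rather than through the denoiser network. The block layout of `A_real` and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 7
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# Real-valued lifting to dimension 2m x 2n, acting on [Re(c), Im(c)]
A_real = np.block([[A.real, -A.imag],
                   [A.imag,  A.real]])
c_real = np.concatenate([c.real, c.imag])
y_real = np.concatenate([y.real, y.imag])

# Forward model: the lifted product equals [Re(Ac), Im(Ac)]
Ac = A @ c
forward_gap = np.linalg.norm(A_real @ c_real - np.concatenate([Ac.real, Ac.imag]))

# Gradient of 0.5 * ||A c - y||^2 with respect to [Re(c), Im(c)]
grad_complex = A.conj().T @ (A @ c - y)                 # A^H r in complex form
grad_real = A_real.T @ (A_real @ c_real - y_real)       # lifted real gradient
grad_gap = np.linalg.norm(
    grad_real - np.concatenate([grad_complex.real, grad_complex.imag]))
print(forward_gap, grad_gap)
```

Both gaps should be at round-off level, showing that $\mathbf{A}^{H}$ in the complex system plays the role of the transpose in the lifted real system.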

ex22-18

Challenge

Design a physics-constrained diffusion training scheme for the RF imaging forward model $\mathbf{y} = \mathbf{A}\mathbf{c} + \mathbf{w}$. The training objective should combine DSM loss with a measurement consistency loss. Derive the gradient of the combined objective with respect to the network parameters $\theta$.

ex22-19

Challenge

Prove that the DDIM sampler is a first-order exponential integrator for the probability flow ODE. Start from the ODE:

$$\frac{d\mathbf{x}}{dt} = f(t)\mathbf{x} + \frac{g(t)^2}{2\sigma_t}\boldsymbol{\epsilon}_\theta(\mathbf{x}, t)$$

and show that the DDIM update is the exact solution of this ODE with a piecewise-constant approximation of $\boldsymbol{\epsilon}_\theta$.

ex22-20

Challenge

Consider using DPS for a non-Gaussian measurement model: $y_i \sim \text{Poisson}(\mathbf{A}_{i}^{T}\mathbf{x}_0)$ (photon-counting model, relevant for low-dose imaging). Derive the DPS guidance gradient and discuss the challenges compared to the Gaussian case.