Diffusion Posterior Sampling (DPS)
From Priors to Posteriors: Guiding Diffusion with Measurements
Diffusion models learn the prior $p(x)$. For inverse problems we need the posterior $p(x \mid y)$. By Bayes' rule:

$$p(x \mid y) \propto p(y \mid x)\, p(x)$$

Diffusion Posterior Sampling (DPS) modifies the reverse diffusion process to incorporate the likelihood $p(y \mid x)$, steering the generative trajectory toward images that explain the measurements. The result is (approximate) posterior sampling: each run of DPS with a different noise seed produces a different plausible reconstruction, enabling uncertainty quantification.
Definition: Posterior Score Decomposition
At diffusion time $t$, the posterior score decomposes as:

$$\nabla_{x_t} \log p_t(x_t \mid y) = \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(y \mid x_t)$$

The prior score $\nabla_{x_t} \log p_t(x_t)$ is provided by the pretrained score network $s_\theta(x_t, t)$.
The likelihood score $\nabla_{x_t} \log p_t(y \mid x_t)$ is intractable because $p_t(y \mid x_t) = \int p(y \mid x_0)\, p(x_0 \mid x_t)\, \mathrm{d}x_0$ involves marginalising over the unknown clean image $x_0$. Different methods (DPS, DDRM, MCG) differ in how they approximate this intractable term.
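The decomposition can be verified in closed form for a fully Gaussian toy model: a scalar signal with unit-variance prior and Gaussian measurement noise (the specific numbers below are arbitrary choices for the check):

```python
import numpy as np

sigma_y = 0.5
y = 1.2
xs = np.linspace(-2, 2, 9)

prior_score = -xs                          # d/dx log N(x; 0, 1)
lik_score = (y - xs) / sigma_y**2          # d/dx log N(y; x, sigma_y^2)

# Closed-form Gaussian posterior: precision 1 + 1/sigma_y^2
post_prec = 1.0 + 1.0 / sigma_y**2
post_mean = (y / sigma_y**2) / post_prec
post_score = -post_prec * (xs - post_mean)

# Posterior score = prior score + likelihood score, pointwise
assert np.allclose(post_score, prior_score + lik_score)
```

In the diffusion setting the same identity holds at every noise level $t$, with $p_t$ in place of these Gaussians.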
Definition: Diffusion Posterior Sampling (DPS)
DPS approximates the likelihood score by plugging the Tweedie estimate $\hat{x}_0(x_t)$ into the likelihood in place of $x_0$, i.e. $p_t(y \mid x_t) \approx p(y \mid \hat{x}_0(x_t))$. The modified reverse step is:

$$x_{t-1} = x'_{t-1} + \zeta\, \nabla_{x_t} \log p\big(y \mid \hat{x}_0(x_t)\big) = x'_{t-1} - \frac{\zeta}{2\sigma_y^2}\, \nabla_{x_t} \big\| y - A\hat{x}_0(x_t) \big\|_2^2$$

where:
- $x'_{t-1}$ is the unconditional reverse-diffusion step
- $\hat{x}_0(x_t) = \frac{1}{\sqrt{\bar\alpha_t}}\big(x_t + (1 - \bar\alpha_t)\, s_\theta(x_t, t)\big)$ is the Tweedie estimate
- $A$ is the forward model (sensing matrix)
- $\sigma_y^2$ is the measurement noise variance
- $\zeta$ is the guidance scale
- the gradient is computed via automatic differentiation through the Tweedie estimate
The DPS approximation replaces the intractable marginal likelihood $p_t(y \mid x_t)$ with the point-estimate likelihood $p(y \mid \hat{x}_0(x_t))$. This is exact only when the posterior $p(x_0 \mid x_t)$ is concentrated (low noise), and becomes increasingly approximate at high noise levels.
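The plug-in update can be sketched in a few lines. Everything here is a hypothetical stand-in (the `tweedie` and `reverse_step` callables are assumptions, not a real library API); a practical implementation would backpropagate through the score network rather than form the Jacobian explicitly.

```python
import numpy as np

def dps_step(x_t, tweedie, reverse_step, A, y, sigma_y, zeta):
    """One DPS update: unconditional reverse step plus likelihood guidance.

    tweedie(x_t)      -> (x0_hat, J): Tweedie estimate and its Jacobian
    reverse_step(x_t) -> the unguided reverse-diffusion sample x'_{t-1}
    Forming J explicitly is only feasible for toy problems; real code
    uses a vector-Jacobian product via automatic differentiation.
    """
    x0_hat, J = tweedie(x_t)
    residual = y - A @ x0_hat
    # Gradient of log N(y; A x0_hat, sigma_y^2 I) w.r.t. x_t:
    #   (1/sigma_y^2) J^T A^T (y - A x0_hat)
    guidance = (J.T @ (A.T @ residual)) / sigma_y**2
    return reverse_step(x_t) + zeta * guidance
```

With identity stand-ins for the Tweedie map and the reverse step, a single call simply moves the iterate a fraction $\zeta$ of the way toward measurement consistency.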
Theorem: DPS Likelihood Guidance Gradient
For the linear Gaussian model $y = A x_0 + n$ with $n \sim \mathcal{N}(0, \sigma_y^2 I)$, the DPS guidance gradient is:

$$\nabla_{x_t} \log p\big(y \mid \hat{x}_0(x_t)\big) = \frac{1}{\sigma_y^2} \left(\frac{\partial \hat{x}_0}{\partial x_t}\right)^{\!\top} A^\top \big(y - A\hat{x}_0(x_t)\big)$$

The Jacobian $\partial \hat{x}_0 / \partial x_t$ involves the score network's Jacobian and is computed via backpropagation.
The gradient pushes $x_t$ in the direction that reduces the measurement residual $\| y - A\hat{x}_0(x_t) \|$. The chain rule through the Tweedie estimate ensures the correction is applied at the appropriate noise level.
Gaussian log-likelihood:

$$\log p(y \mid \hat{x}_0) = -\frac{1}{2\sigma_y^2} \big\| y - A\hat{x}_0 \big\|_2^2 + \text{const}$$

Chain rule through Tweedie:

$$\nabla_{x_t} \log p\big(y \mid \hat{x}_0(x_t)\big) = \left(\frac{\partial \hat{x}_0}{\partial x_t}\right)^{\!\top} \nabla_{\hat{x}_0} \log p(y \mid \hat{x}_0) = \frac{1}{\sigma_y^2} \left(\frac{\partial \hat{x}_0}{\partial x_t}\right)^{\!\top} A^\top \big(y - A\hat{x}_0(x_t)\big)$$
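The chain-rule identity can be sanity-checked numerically against finite differences on a toy linear "Tweedie" map (the matrices `A` and `W` and the dimensions below are arbitrary assumptions for the check):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 4, 3
A = rng.normal(size=(m, d))          # forward model
y = rng.normal(size=m)               # measurements
sigma_y = 0.5
W = rng.normal(size=(d, d)) * 0.1    # toy linear "Tweedie" map: x0_hat = W x_t

def x0_hat(x_t):
    return W @ x_t

def log_lik(x_t):
    r = y - A @ x0_hat(x_t)
    return -0.5 * (r @ r) / sigma_y**2

x_t = rng.normal(size=d)

# Analytic gradient from the theorem: (1/sigma_y^2) J^T A^T (y - A x0_hat),
# where the Jacobian J of this linear Tweedie map is just W
analytic = (W.T @ A.T @ (y - A @ x0_hat(x_t))) / sigma_y**2

# Central finite differences of the log-likelihood
eps = 1e-6
numeric = np.array([
    (log_lik(x_t + eps * e) - log_lik(x_t - eps * e)) / (2 * eps)
    for e in np.eye(d)
])

assert np.allclose(analytic, numeric, atol=1e-5)
```

For a real score network the Jacobian is not linear, and the same gradient is obtained by backpropagating the residual term through the Tweedie estimate.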
DPS Algorithm for Linear Inverse Problems
Complexity: $O(T \cdot C_{\text{net}})$, where $C_{\text{net}}$ includes the backpropagation cost. Each iteration requires one score network evaluation plus one backpropagation through the network for the guidance gradient, giving $2T$ total NFEs. For typical values of $T$, this is the dominant computational cost.
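The full loop can be sketched end-to-end for the one case where no network is needed: a standard-normal prior, whose score and Tweedie estimate are available in closed form. The noise schedule and the residual-normalised guidance step below are illustrative assumptions, not canonical settings.

```python
import numpy as np

def dps_reconstruct(y, A, T=200, zeta=0.1, seed=0):
    """DPS sketch with a standard-normal prior, for which the score and the
    Tweedie estimate are known in closed form (no network needed):
        score(x_t) = -x_t,   x0_hat = sqrt(abar_t) * x_t.
    The guidance step is normalised by the residual norm, a common practical
    trick, so zeta acts as a unitless strength. A real implementation swaps
    in a trained score network and obtains the Jacobian via autodiff.
    """
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    betas = np.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)

    x = rng.normal(size=d)                        # x_T ~ N(0, I)
    for t in range(T - 1, -1, -1):
        score = -x                                # exact score for N(0, I) prior
        x0_hat = np.sqrt(abar[t]) * x             # Tweedie estimate (closed form)
        # Unconditional DDPM reverse step
        mean = (x + betas[t] * score) / np.sqrt(alphas[t])
        noise = rng.normal(size=d) if t > 0 else np.zeros(d)
        x_prev = mean + np.sqrt(betas[t]) * noise
        # Guidance: gradient of the Gaussian log-likelihood w.r.t. x_t;
        # the Jacobian of x0_hat here is just sqrt(abar_t) * I
        residual = y - A @ x0_hat
        grad = np.sqrt(abar[t]) * (A.T @ residual)
        x = x_prev + zeta * grad / (np.linalg.norm(residual) + 1e-8)
    return x
```

With $A = I$ this toy loop converges to the neighbourhood of the measurements while still injecting and removing diffusion noise along the way.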
DPS Reconstruction Trajectory
Visualise the DPS reconstruction as a function of the diffusion step. The plot shows the evolving Tweedie estimate $\hat{x}_0(x_t)$ at several intermediate times, from pure noise ($t = T$) to the final reconstruction ($t = 0$). Adjust the guidance scale $\zeta$: too small yields measurement-inconsistent samples; too large introduces artefacts from over-fitting to the measurements.
Example: Effect of the Guidance Scale
Consider a 1D deblurring problem with a Gaussian blur kernel a few pixels wide and Gaussian measurement noise of standard deviation $\sigma_y$. Describe the effect of the guidance scale $\zeta$ on the DPS reconstruction.
$\zeta = 0$ (no guidance)
Without guidance, the reverse diffusion generates a sample from the prior $p(x)$ — a plausible image that bears no relationship to the measurements. The measurement residual $\| y - A\hat{x}_0 \|$ is large.
$\zeta = 1$ (moderate guidance)
The guidance term steers the diffusion toward measurement-consistent images. The reconstruction balances prior fidelity (natural-looking) with data fidelity (explaining the measurements). This is the typical operating point.
$\zeta \gg 1$ (strong guidance)
Excessive guidance forces $A\hat{x}_0(x_t) \approx y$ at every step, disrupting the diffusion process. The reconstruction over-fits to the noisy measurements, producing artefacts and losing the regularising effect of the prior. In the limit $\zeta \to \infty$, DPS approaches the pseudoinverse solution $A^{+} y$.
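The two extreme regimes can be made concrete with a linear-algebra caricature: replace the diffusion prior with a plain $\mathcal{N}(0, I)$ prior, so moderate guidance behaves like a prior-regularised least-squares solve and $\zeta \to \infty$ becomes the pseudoinverse. The moving-average blur operator and noise level below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 30
x_true = np.ones(d)                      # smooth ground-truth signal

# Toy 1D blur: 5-tap moving average (an illustrative forward operator)
A = np.zeros((d, d))
for i in range(d):
    for j in range(max(0, i - 2), min(d, i + 3)):
        A[i, j] = 0.2

sigma_y = 0.3
y = A @ x_true + sigma_y * rng.normal(size=d)

# zeta -> infinity limit: pure data fit, i.e. the pseudoinverse solution,
# which amplifies noise through the blur's small singular values
x_pinv = np.linalg.pinv(A) @ y

# Moderate guidance ~ prior-regularised solve (Gaussian N(0, I) prior stand-in)
x_reg = np.linalg.solve(A.T @ A / sigma_y**2 + np.eye(d),
                        A.T @ y / sigma_y**2)

err_pinv = np.linalg.norm(x_pinv - x_true)
err_reg = np.linalg.norm(x_reg - x_true)
```

The regularised solve trades a small bias toward the prior for a large reduction in noise amplification, mirroring the prior-fidelity/data-fidelity tradeoff that $\zeta$ controls.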
Common Mistake: DPS Is Approximate Posterior Sampling
Mistake:
Claiming that DPS produces exact samples from the posterior $p(x \mid y)$.
Correction:
DPS makes two approximations:
- The score network $s_\theta$ is only approximate (finite training).
- The likelihood gradient uses the point estimate $\hat{x}_0(x_t)$ rather than marginalising over $p(x_0 \mid x_t)$.
These approximations mean DPS samples are from an approximate posterior. The guidance scale $\zeta$ compensates: larger $\zeta$ enforces stronger measurement consistency at the expense of prior fidelity. No theoretical guarantee exists for the quality of this approximation.
Quick Check
For DPS with $T = 1000$ diffusion steps, approximately how many score network evaluations (NFEs) are required per reconstruction?
500
1000
2000
100
Each step requires one forward pass (score evaluation) and one backward pass (backpropagation for the guidance gradient), giving $2T = 2000$ NFEs. This is the dominant computational cost.
Diffusion Posterior Sampling (DPS)
A method for solving inverse problems with pretrained diffusion models by adding a likelihood guidance gradient to the reverse diffusion process. The guidance gradient is computed via the Tweedie estimate and backpropagation through the score network.
Related: Tweedie Formula, Posterior Sampling
Guidance Scale
A hyperparameter $\zeta$ that controls the strength of the measurement consistency term in DPS. Larger $\zeta$ produces more measurement-consistent but potentially less natural reconstructions.
Related: Diffusion Posterior Sampling (DPS), Measurement Consistency
Key Takeaway
DPS modifies the reverse diffusion process with a likelihood guidance term computed via the Tweedie estimate. The guidance scale $\zeta$ controls the tradeoff between prior fidelity and measurement consistency. The main limitation is computational cost: $2T$ NFEs per sample, with $T$ typically in the hundreds. The main advantage is that multiple runs produce diverse posterior samples, enabling uncertainty quantification — a capability unavailable in deterministic methods.