RED: Regularization by Denoising
RED: Closing the Gap Between PnP and Variational Methods
PnP algorithms are powerful, but in general they lack an explicit objective function: it is unclear what they are minimising. This makes convergence analysis difficult and prevents the use of standard optimisation tools.
Regularization by Denoising (RED) (Romano, Elad, Milanfar, 2017) bridges this gap by constructing an explicit regulariser directly from the denoiser. RED converts the denoiser into a concrete penalty term, enabling standard gradient-descent convergence theory to apply.
Definition: Regularization by Denoising (RED)
Regularization by Denoising (RED)
The RED regulariser derived from a denoiser $D_\sigma$ is:

$$\rho_{\mathrm{RED}}(x) = \frac{1}{2}\, x^\top \big(x - D_\sigma(x)\big)$$

Under the Jacobian symmetry assumption ($\nabla D_\sigma(x) = \nabla D_\sigma(x)^\top$), together with local homogeneity, its gradient simplifies to:

$$\nabla \rho_{\mathrm{RED}}(x) = x - D_\sigma(x)$$

RED solves the variational problem:

$$\hat{x} = \arg\min_x \; \frac{1}{2}\,\|y - Ax\|_2^2 + \lambda\, \rho_{\mathrm{RED}}(x)$$
Unlike PnP, RED defines an explicit objective. This makes it amenable to standard optimisation theory: fixed points are stationary points of a well-defined objective, and convergence can be analysed with standard tools.
Theorem: RED Gradient Under Jacobian Symmetry
If $D_\sigma$ has a symmetric Jacobian ($\nabla D_\sigma(x) = \nabla D_\sigma(x)^\top$) and satisfies local homogeneity ($D_\sigma((1+\epsilon)x) = (1+\epsilon)\,D_\sigma(x)$ for small $\epsilon$, equivalently $\nabla D_\sigma(x)\,x = D_\sigma(x)$), then:

$$\nabla \rho_{\mathrm{RED}}(x) = x - D_\sigma(x)$$
The RED gradient descent update is:

$$x_{k+1} = x_k - \eta\,\Big[A^\top(Ax_k - y) + \lambda\big(x_k - D_\sigma(x_k)\big)\Big]$$
The RED update combines two gradient corrections:
- $A^\top(Ax_k - y)$: move toward data consistency
- $\lambda\,\big(x_k - D_\sigma(x_k)\big)$: move toward the denoiser output
The denoiser residual $x_k - D_\sigma(x_k)$ points from the current estimate away from the clean image manifold, so subtracting it pushes the iterate back toward the manifold.
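The update above can be sketched on a toy 1-D deblurring problem. A Gaussian-smoothing denoiser is used because it is linear with a symmetric Jacobian, so the simplified RED gradient $x - D_\sigma(x)$ is exact; all operators and parameters here are illustrative choices, not prescribed by the RED paper.

```python
import numpy as np

# Toy 1-D deblurring with RED gradient descent. The box-blur A and the
# Gaussian-smoothing denoiser D are both symmetric linear operators, so the
# simplified RED gradient x - D(x) is exact here. Kernel widths, lam, and
# eta are illustrative.
rng = np.random.default_rng(0)
n = 64
x_true = np.zeros(n)
x_true[20:40] = 1.0                                 # piecewise-constant signal

def blur(v, k=5):                                   # forward operator A (box blur)
    return np.convolve(v, np.ones(k) / k, mode="same")

def denoise(v, s=2.0):                              # Gaussian-smoothing denoiser D
    t = np.arange(-10, 11)
    g = np.exp(-t**2 / (2 * s**2))
    return np.convolve(v, g / g.sum(), mode="same")

y = blur(x_true) + 0.01 * rng.standard_normal(n)    # blurred, noisy measurement
lam, eta = 0.2, 0.5

def objective(v):                                   # 1/2 ||Av - y||^2 + lam * rho(v)
    return 0.5 * np.sum((blur(v) - y) ** 2) + 0.5 * lam * v @ (v - denoise(v))

# x_{k+1} = x_k - eta * [A^T(Ax_k - y) + lam * (x_k - D(x_k))]
x = y.copy()
for _ in range(200):
    data_grad = blur(blur(x) - y)                   # A^T = A for a symmetric kernel
    x = x - eta * (data_grad + lam * (x - denoise(x)))

print(objective(y), objective(x))                   # objective decreases from the start
```

Because the objective is quadratic here, gradient descent with a sufficiently small step size decreases it monotonically, which is exactly the monitoring RED makes possible.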
Compute the full gradient:

$$\nabla \rho_{\mathrm{RED}}(x) = x - \frac{1}{2}\,D_\sigma(x) - \frac{1}{2}\,\nabla D_\sigma(x)^\top x$$

Apply Jacobian symmetry and local homogeneity. If $\nabla D_\sigma(x)^\top = \nabla D_\sigma(x)$ and $\nabla D_\sigma(x)\,x = D_\sigma(x)$:

$$\nabla \rho_{\mathrm{RED}}(x) = x - \frac{1}{2}\,D_\sigma(x) - \frac{1}{2}\,D_\sigma(x) = x - D_\sigma(x)$$
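The identity can be checked numerically for a linear denoiser $D(x) = Wx$ with symmetric $W$ (linearity gives local homogeneity for free), comparing a finite-difference gradient of $\rho$ against the closed form; the random symmetric $W$ is an illustrative stand-in:

```python
import numpy as np

# Finite-difference check of grad rho(x) = x - D(x) for a linear denoiser
# D(x) = W x with symmetric W (illustrative random stand-in).
rng = np.random.default_rng(1)
n = 8
M = rng.standard_normal((n, n))
W = 0.5 * (M + M.T)                                 # symmetric Jacobian

rho = lambda v: 0.5 * v @ (v - W @ v)               # RED regulariser

x = rng.standard_normal(n)
eps = 1e-6
fd = np.array([(rho(x + eps * e) - rho(x - eps * e)) / (2 * eps)
               for e in np.eye(n)])                 # finite-difference gradient
closed = x - W @ x                                  # claimed closed form x - D(x)
print(np.max(np.abs(fd - closed)))                  # agreement up to floating-point error
```

If $W$ were not symmetrised, the finite-difference gradient would instead match $x - \frac{1}{2}Wx - \frac{1}{2}W^\top x$, the full gradient from the first step.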
Definition: Score-Based Interpretation of RED
Score-Based Interpretation of RED
The RED gradient has a natural score-function interpretation via Tweedie's formula.
For a denoising model $z = x + \sigma\varepsilon$, $\varepsilon \sim \mathcal{N}(0, I)$, the MMSE denoiser satisfies Tweedie's formula:

$$D_\sigma(z) = z + \sigma^2\, \nabla_z \log p_\sigma(z)$$

Evaluated at a candidate clean image $x$:

$$x - D_\sigma(x) = -\sigma^2\, \nabla \log p_\sigma(x)$$
The RED gradient is thus proportional to the negative score function of the (slightly blurred) image distribution, pointing away from high-probability regions of the prior. RED gradient descent pushes iterates toward higher probability under the implicit prior.
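A minimal sanity check of this relation, assuming a scalar Gaussian prior $x \sim \mathcal{N}(0, \tau^2)$, for which both the MMSE denoiser (Wiener shrinkage) and the smoothed score are available in closed form:

```python
import numpy as np

# Assumed toy model: scalar x ~ N(0, tau^2), observation z = x + sigma * eps.
# Then p_sigma = N(0, tau^2 + sigma^2), the MMSE denoiser is Wiener shrinkage
# D(z) = z * tau^2 / (tau^2 + sigma^2), and the smoothed score is
# grad log p_sigma(z) = -z / (tau^2 + sigma^2).
tau, sigma = 1.5, 0.7
z = np.linspace(-3, 3, 7)

D = z * tau**2 / (tau**2 + sigma**2)               # MMSE denoiser
score = -z / (tau**2 + sigma**2)                   # grad log p_sigma(z)

# Tweedie: D(z) = z + sigma^2 * score  =>  z - D(z) = -sigma^2 * score
residual = (z - D) + sigma**2 * score
print(np.max(np.abs(residual)))                    # identically zero for this model
```

The RED gradient $z - D(z)$ here shrinks large $|z|$ most strongly, i.e. it pushes iterates toward the high-probability region around zero, as the score interpretation predicts.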
Historical Note: Impact and Limitations of RED
2017–present: Romano, Elad, and Milanfar introduced RED in the SIAM Journal on Imaging Sciences (2017), framing it as a principled framework for converting any denoiser into a regulariser. The paper generated significant excitement for providing an explicit objective, something PnP lacked.
However, Reehorst and Schniter (2019) showed that the Jacobian symmetry assumption is rarely satisfied for deep denoisers, meaning the simplified RED gradient formula is only approximate. The exact gradient involves the Jacobian-vector product $\nabla D_\sigma(x)^\top x$, which is expensive to compute. Despite this, RED continues to be used as an effective algorithm (with monitored convergence) even when the theoretical conditions are not met.
Example: RED-GD vs. PnP-PGD
Compare the per-iteration computational cost, theoretical properties, and step-size conditions of RED gradient descent and PnP-PGD.
Per-iteration cost
PnP-PGD: one gradient step + one denoiser call per iteration.
RED-GD: one gradient step + one denoiser call per iteration.
The per-iteration computational cost is exactly the same: both algorithms are dominated by one application each of $A$, $A^\top$, and $D_\sigma$.
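The structural difference is only where the denoiser appears, as this side-by-side sketch of one iteration of each algorithm shows (the operators and parameters are assumed given; the toy values below are illustrative):

```python
import numpy as np

# One iteration of each algorithm; both call the denoiser exactly once.
def pnp_pgd_step(x, A, y, denoise, eta):
    # gradient step on the data term, then denoise (prox replaced by D)
    return denoise(x - eta * A.T @ (A @ x - y))

def red_gd_step(x, A, y, denoise, eta, lam):
    # single gradient step on the full RED objective
    return x - eta * (A.T @ (A @ x - y) + lam * (x - denoise(x)))

# Tiny usage example with an illustrative linear denoiser
rng = np.random.default_rng(2)
A = np.eye(4)
y = rng.standard_normal(4)
den = lambda v: 0.9 * v                            # toy linear denoiser
x0 = np.zeros(4)
print(pnp_pgd_step(x0, A, y, den, 0.5))
print(red_gd_step(x0, A, y, den, 0.5, 0.1))
```

In PnP-PGD the denoiser replaces a proximal operator; in RED-GD it enters additively through the gradient $x - D_\sigma(x)$, which is what makes the objective explicit.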
Theoretical properties
| Property | PnP-PGD | RED-GD |
|---|---|---|
| Explicit objective | Only if $D_\sigma$ is a proximal operator | Yes (if Jacobian symmetric) |
| Convergence guarantee | Requires non-expansive $D_\sigma$ | Requires convex, $L$-smooth objective |
| Fixed point type | Operator fixed point | Stationary point of objective |
| Step-size condition | $\eta \le 1/\|A^\top A\|$ | $\eta \le 1/L$ with $L = \|A^\top A\| + \lambda L_\rho$ |
Practical recommendation
Use PnP-ADMM when the data-fidelity subproblem has an efficient closed-form solver (e.g., forward operators diagonalised by the Fourier transform, as in deblurring). Use RED-GD when you want an explicit objective for monitoring and when the Jacobian symmetry assumption is approximately satisfied (e.g., linear denoisers, NLM).
Common Mistake: Deep Denoisers Rarely Have Symmetric Jacobians
Mistake:
Assuming that a DnCNN or DRUNet denoiser satisfies the Jacobian symmetry condition required for the RED gradient to be exact.
Correction:
Most deep denoisers (DnCNN, DRUNet, SwinIR) do not have symmetric Jacobians. This means:
- The RED "gradient" may not be the gradient of any scalar function.
- The algorithm may not be minimising the RED objective.
- Convergence is not guaranteed by standard gradient descent theory.
Mitigations:
- Use denoisers with architecturally-enforced symmetric Jacobians (e.g., denoisers parameterised as the gradient of a scalar potential)
- Accept the approximation and monitor convergence empirically
- Use gradient-step denoisers (Section 21.3) for a principled alternative
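The asymmetry is easy to measure directly: estimate the Jacobian by finite differences and compare it with its transpose. Here a random two-layer network stands in for a deep denoiser (an illustrative assumption), contrasted with a symmetric linear denoiser:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6

def jacobian(f, x, eps=1e-6):
    # central finite-difference Jacobian of f at x (columns = directions)
    return np.column_stack([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                            for e in np.eye(n)])

def asymmetry(J):
    # relative departure from symmetry: 0 iff J = J^T
    return np.linalg.norm(J - J.T) / np.linalg.norm(J)

# Symmetric linear "denoiser": Jacobian is exactly symmetric
M = rng.standard_normal((n, n))
W = 0.5 * (M + M.T)
lin = lambda v: W @ v

# Random two-layer network standing in for a deep denoiser (illustrative)
W1, W2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
net = lambda v: W2 @ np.tanh(W1 @ v)

x = rng.standard_normal(n)
print(asymmetry(jacobian(lin, x)))                 # ~ 0 (symmetric by construction)
print(asymmetry(jacobian(net, x)))                 # far from symmetric
```

The network's Jacobian $W_2\,\mathrm{diag}(\tanh')\,W_1$ is generically asymmetric, so the vector field $x - D(x)$ it induces is not the gradient of any scalar function, which is exactly the Reehorst–Schniter objection.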
RED Gradient Visualisation
Visualise the RED gradient for a 1-D signal with varying noise level $\sigma$. The gradient field shows which direction the RED update pushes the signal at each point.
Observe that the gradient is small where the signal is smooth (the denoiser makes little change) and large where the signal has noise-like fluctuations (the denoiser makes large corrections).
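The computation behind the visualisation can be sketched as follows, for a signal that is smooth on its first half and noisy on its second half (the Gaussian-smoothing denoiser and all widths are illustrative choices):

```python
import numpy as np

# RED gradient x - D(x) for a half-smooth, half-noisy 1-D signal.
rng = np.random.default_rng(4)
n = 200
t = np.linspace(0, 1, n)
x = np.sin(2 * np.pi * t)
x[n // 2:] += 0.3 * rng.standard_normal(n // 2)    # noise on the second half only

def denoise(v, sigma=3.0):                          # Gaussian-smoothing denoiser
    k = np.arange(-15, 16)
    g = np.exp(-k**2 / (2 * sigma**2))
    return np.convolve(v, g / g.sum(), mode="same")

red_grad = x - denoise(x)                           # the RED gradient field

# Mean gradient magnitude on each half (interior slices avoid edge effects)
smooth_mag = np.abs(red_grad[20:80]).mean()
noisy_mag = np.abs(red_grad[120:180]).mean()
print(smooth_mag, noisy_mag)
```

The magnitudes confirm the observation above: the gradient is near zero on the smooth half, where the denoiser barely changes the signal, and large on the noisy half, where the denoiser makes substantial corrections.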
Parameters
Quick Check
Under what conditions does the RED gradient simplify to $\nabla\rho_{\mathrm{RED}}(x) = x - D_\sigma(x)$?
When the denoiser is a CNN
When the denoiser's Jacobian is symmetric and the denoiser is locally homogeneous
When the denoiser is non-expansive
Always, regardless of denoiser type
The full gradient is $\nabla\rho_{\mathrm{RED}}(x) = x - \frac{1}{2}D_\sigma(x) - \frac{1}{2}\nabla D_\sigma(x)^\top x$. Jacobian symmetry plus local homogeneity ($\nabla D_\sigma(x)\,x = D_\sigma(x)$) make the two denoiser terms combine into $D_\sigma(x)$, giving $x - D_\sigma(x)$.
Key Takeaway
RED defines the explicit regulariser $\rho_{\mathrm{RED}}(x) = \frac{1}{2}x^\top(x - D_\sigma(x))$ with gradient $x - D_\sigma(x)$ (under Jacobian symmetry and local homogeneity). RED has the same per-iteration cost as PnP-PGD but provides an explicit objective. In practice, the Jacobian symmetry assumption holds only approximately, and for deep denoisers often not at all, yet RED remains an effective algorithm, with convergence monitored empirically, even when the exact conditions fail.