Deep Denoisers as Implicit Priors

Deep Denoisers Transform PnP Performance

The advent of deep-learning denoisers (DnCNN, DRUNet, SwinIR) has dramatically improved PnP reconstruction quality. These networks capture complex image priors — textures, edges, semantic structure — that handcrafted priors (TV, wavelets, BM3D) cannot represent.

The central question is: how do we harness this expressive power while preserving the modularity and (ideally) the convergence guarantees of the PnP framework? This section reviews the three dominant architectures and the noise-scheduling strategy that bridges them to PnP.

Definition:

DnCNN — Residual Convolutional Denoiser

DnCNN (Zhang et al., 2017) trains a deep CNN $f_\theta$ to predict the noise residual rather than the clean image:

$$\mathcal{D}_\sigma(\mathbf{v}) = \mathbf{v} - f_\theta(\mathbf{v}).$$

Architecture: 17 convolutional layers with batch normalisation and ReLU. No pooling or skip connections; fixed receptive field.

Training: Minimise $\sum_i \|f_\theta(\mathbf{v}_i) - \mathbf{n}_i\|^2$ over noisy/clean image pairs at a fixed noise level $\sigma$.

Limitation for PnP: One model per noise level. In PnP iterations where $\sigma$ varies, a separate model is needed per level or a noise-blind approximation is required.
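As a toy illustration of the residual formulation, the sketch below uses a hypothetical linear noise predictor in place of the trained CNN; for a zero-mean Gaussian prior with variance $\tau^2$, the shrinkage $f(\mathbf{v}) = \tfrac{\sigma^2}{\tau^2+\sigma^2}\mathbf{v}$ happens to be the MMSE-optimal residual predictor, used here purely for illustration.

```python
import numpy as np

def residual_denoise(v, noise_predictor):
    """DnCNN-style denoising: subtract the predicted noise from the input."""
    return v - noise_predictor(v)

# Hypothetical stand-in for the trained CNN f_theta: for x ~ N(0, tau^2),
# the MMSE-optimal noise predictor is a linear shrinkage of the input.
tau, sigma = 1.0, 0.25
f_theta = lambda v: (sigma**2 / (tau**2 + sigma**2)) * v

rng = np.random.default_rng(0)
x = rng.normal(0.0, tau, size=10_000)            # clean samples
v = x + rng.normal(0.0, sigma, size=x.shape)     # noisy observations

x_hat = residual_denoise(v, f_theta)
# Denoising reduces mean squared error relative to the noisy input
print(np.mean((v - x) ** 2), np.mean((x_hat - x) ** 2))
```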

Definition:

DRUNet — Noise-Level-Aware U-Net Denoiser

DRUNet (Zhang et al., 2021) conditions on the noise level $\sigma$ by concatenating a constant channel $\sigma \cdot \mathbf{1}$ to the input:

$$\hat{\mathbf{x}} = \mathcal{D}_\theta(\mathbf{v},\, \sigma).$$
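The conditioning mechanism itself is just channel concatenation; a minimal sketch (the array shapes and the noise level $25/255$ below are illustrative):

```python
import numpy as np

def condition_on_sigma(v, sigma):
    """Append a constant noise-level map to a (C, H, W) image, mimicking
    DRUNet's sigma-channel input."""
    sigma_map = np.full((1,) + v.shape[1:], sigma, dtype=v.dtype)
    return np.concatenate([v, sigma_map], axis=0)

v = np.zeros((3, 64, 64), dtype=np.float32)   # e.g. an RGB image
inp = condition_on_sigma(v, 25.0 / 255.0)
print(inp.shape)  # (4, 64, 64): three image channels plus one noise-level channel
```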

Architecture: Encoder–decoder U-Net with residual blocks at four scales ($[64, 128, 256, 512]$ channels).

Training: Diverse natural images with noise levels $\sigma \in [0, 50]/255$ drawn uniformly.

PnP advantage: The noise-level input allows the denoiser strength to be adjusted at each iteration without retraining. In PnP-ADMM, $\sigma$ decreases across iterations as the estimate converges, implementing a coarse-to-fine strategy.

DRUNet achieves state-of-the-art denoising PSNR across all noise levels with a single model. Its noise-level conditioning makes it the default choice for PnP in practice.
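A minimal PnP-ADMM skeleton showing where the $\sigma$-conditioned denoiser and the decreasing schedule enter; the soft-threshold denoiser, problem sizes, and parameter values below are illustrative stand-ins, not DRUNet itself.

```python
import numpy as np

def pnp_admm(y, A, denoiser, sigmas, rho=1.0):
    """PnP-ADMM skeleton: the z-update calls a noise-level-conditioned
    denoiser with a decreasing sigma schedule (coarse-to-fine)."""
    n = A.shape[1]
    z, u = np.zeros(n), np.zeros(n)
    AtA, Aty = A.T @ A, A.T @ y
    for sigma in sigmas:
        # Data-consistency step: regularised least-squares solve
        x = np.linalg.solve(AtA + rho * np.eye(n), Aty + rho * (z - u))
        z = denoiser(x + u, sigma)   # denoiser acts as the implicit prior
        u = u + x - z                # dual update
    return z

# Illustrative stand-in denoiser: soft thresholding at level sigma
soft = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - s, 0.0)

rng = np.random.default_rng(3)
A = rng.normal(size=(30, 60)) / np.sqrt(30)   # toy sensing matrix
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [1.0, -2.0, 1.5]        # sparse ground truth
y = A @ x_true + 0.01 * rng.normal(size=30)
sigmas = np.geomspace(0.5, 0.01, 25)          # decreasing noise levels
x_hat = pnp_admm(y, A, soft, sigmas)
print(np.linalg.norm(x_hat - x_true))
```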

Definition:

SwinIR — Transformer-Based Denoiser

SwinIR (Liang et al., 2021) uses Swin Transformer blocks with shifted-window self-attention instead of convolutions. The key architectural difference from DRUNet:

  • Non-local receptive field: attention over large windows captures long-range dependencies across the image
  • Content-adaptive filtering: unlike a CNN's fixed convolution kernels, the attention weights adapt to the input content
  • PnP application: SwinIR achieves a $\sim 0.2$–$0.3$ dB improvement over DRUNet at the cost of higher computational complexity per inference

For RF imaging applications, SwinIR's long-range attention is particularly beneficial for capturing the structured sparsity of RF scenes.

Example: Noise-Level Schedule for PnP-ADMM with DRUNet

Design a noise-level schedule $\{\sigma_k\}_{k=1}^K$ for PnP-ADMM with DRUNet, where the denoiser strength decreases across iterations. Compare geometric and cosine schedules.
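A minimal sketch of the two candidate schedules (the endpoint values $\sigma_{\max} = 50/255$, $\sigma_{\min} = 1/255$ and $K = 30$ are illustrative choices, not prescriptions):

```python
import numpy as np

def geometric_schedule(sigma_max, sigma_min, K):
    """sigma_k = sigma_max * (sigma_min / sigma_max)^(k / (K-1)), k = 0..K-1."""
    return sigma_max * (sigma_min / sigma_max) ** (np.arange(K) / (K - 1))

def cosine_schedule(sigma_max, sigma_min, K):
    """Half-cosine ramp from sigma_max down to sigma_min."""
    t = np.arange(K) / (K - 1)
    return sigma_min + 0.5 * (sigma_max - sigma_min) * (1 + np.cos(np.pi * t))

geo = geometric_schedule(50 / 255, 1 / 255, K=30)
cosine = cosine_schedule(50 / 255, 1 / 255, K=30)
# Both run from sigma_max down to sigma_min; the geometric schedule spends
# more iterations at small sigma (fine-scale refinement), while the cosine
# schedule stays near sigma_max longer before decaying.
```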

Theorem: The Implicit Regulariser of a Deep Denoiser

If a denoiser $\mathcal{D}_\sigma$ is the gradient of a scalar potential, i.e., $\mathcal{D}_\sigma(\mathbf{v}) = \nabla \Phi_\sigma(\mathbf{v})$ for some $\Phi_\sigma \colon \mathbb{R}^N \to \mathbb{R}$, then PnP-PGD minimises the explicit objective:

$$\min_\mathbf{x} \; \frac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 + R_\sigma(\mathbf{x}),$$

where $R_\sigma(\mathbf{x}) = \frac{1}{2}\|\mathbf{x}\|^2 - \Phi_\sigma(\mathbf{x})$ is the implicit regulariser induced by $\mathcal{D}_\sigma$.

The denoiser defines a "score function" $\nabla\Phi_\sigma$ that points toward regions of high image probability. The regulariser $R_\sigma$ penalises images far from the denoiser's implicit manifold of natural images. Without the gradient-potential property, PnP may not be minimising any explicit objective; it is a fixed-point iteration whose limit depends on initialisation and the denoiser's geometry.
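The identity behind the theorem, $\nabla R_\sigma(\mathbf{v}) = \mathbf{v} - \mathcal{D}_\sigma(\mathbf{v})$ (the denoising residual is the gradient of the implicit regulariser), can be checked numerically with a toy quadratic potential; the matrix $W$ below is an arbitrary symmetric PSD stand-in, not a real denoiser.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(5, 5))
W = 0.5 * (B + B.T)
W = W @ W.T / np.trace(W @ W.T)     # symmetric PSD, mildly scaled

Phi = lambda v: 0.5 * v @ W @ v     # scalar potential
D = lambda v: W @ v                 # denoiser = grad Phi (a gradient field)
R = lambda v: 0.5 * v @ v - Phi(v)  # implicit regulariser from the theorem

# Check grad R(v) = v - D(v) by central finite differences
v = rng.normal(size=5)
eps = 1e-6
grad_R = np.array([(R(v + eps * e) - R(v - eps * e)) / (2 * eps)
                   for e in np.eye(5)])
print(np.allclose(grad_R, v - D(v), atol=1e-5))   # True
```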


Historical Note: Tweedie's Formula and the Score Function Connection

1950s–present

Tweedie's formula, derived by Maurice Tweedie in the 1950s and popularised by Robbins (1956) in empirical Bayes, states that the MMSE estimate of $\mathbf{x}$ from $\mathbf{v} = \mathbf{x} + \sigma\mathbf{n}$ is:

$$\mathbb{E}[\mathbf{x} \mid \mathbf{v}] = \mathbf{v} + \sigma^2 \nabla_\mathbf{v} \log p_\sigma(\mathbf{v}).$$

This connects the MMSE denoiser to the score function $\nabla\log p_\sigma$. In diffusion models (Chapter 22), this identity is central to the reverse process. In PnP, it implies that a good denoiser approximates the score function of the image distribution, giving PnP a principled Bayesian interpretation even without an explicit prior.
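For a scalar Gaussian prior the formula can be verified numerically; the prior and noise parameters below are arbitrary, and the posterior mean is estimated by importance-weighting prior samples:

```python
import numpy as np

# x ~ N(0, tau^2), v = x + sigma*n  =>  p_sigma = N(0, tau^2 + sigma^2),
# so grad log p_sigma(v) = -v / (tau^2 + sigma^2), and Tweedie's formula
# gives E[x | v] = v + sigma^2 * grad = tau^2 / (tau^2 + sigma^2) * v.
tau, sigma, v = 1.0, 0.5, 0.8
tweedie = v + sigma**2 * (-v / (tau**2 + sigma**2))

# Monte Carlo posterior mean via likelihood weights on prior samples
rng = np.random.default_rng(2)
x = rng.normal(0.0, tau, size=2_000_000)
w = np.exp(-0.5 * ((v - x) / sigma) ** 2)
posterior_mean = np.sum(w * x) / np.sum(w)

print(tweedie, posterior_mean)   # both close to 0.64
```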

Why This Matters: PnP for RF Imaging

PnP is particularly attractive for RF imaging because:

  1. Modular forward models: The sensing matrix $\mathbf{A}$ changes with array configuration, frequency, and scene geometry. PnP allows the same denoiser to be reused across all configurations.

  2. Complex-valued signals: RF images are complex (magnitude + phase). DRUNet can handle complex inputs by treating real and imaginary parts as separate channels.

  3. Limited training data: PnP denoisers are trained on natural images, which are abundant. This sidesteps the data-scarcity problem in RF imaging without requiring domain-specific training.

  4. Interpretability: Each PnP iteration has a clear physical meaning (enforce measurement consistency, then denoise), which aids debugging and validation.
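The real/imaginary channel split mentioned in item 2 is a lossless re-packing; a minimal sketch:

```python
import numpy as np

def complex_to_channels(x):
    """Stack real and imaginary parts of a complex image as two real
    channels, so a real-valued denoiser can process them jointly."""
    return np.stack([x.real, x.imag], axis=0)

def channels_to_complex(c):
    """Inverse re-packing: recombine the two channels into a complex image."""
    return c[0] + 1j * c[1]

x = np.ones((8, 8)) + 2j * np.ones((8, 8))   # toy complex RF image
c = complex_to_channels(x)                   # shape (2, 8, 8)
assert np.allclose(channels_to_complex(c), x)  # lossless round-trip
```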

See full treatment in PnP and RED for RF Imaging

DnCNN vs DRUNet vs BM3D Denoising Quality

Compare the denoising quality of BM3D, DnCNN, and DRUNet at various noise levels on a test image. The plot shows PSNR (dB) vs noise level $\sigma$ for each denoiser. Observe that DRUNet consistently outperforms the others, especially at low noise levels where fine-detail preservation is critical.


Quick Check

For a PnP algorithm to be provably minimising an explicit objective function, what property must the denoiser satisfy?

  • The denoiser must be a convolutional neural network.
  • The denoiser must be the gradient of a scalar potential function.
  • The denoiser must be non-expansive.
  • The denoiser must be trained end-to-end with the forward model.

Key Takeaway

DRUNet with noise-level conditioning is the recommended denoiser for PnP, enabling adaptive denoising strength via a decreasing noise schedule. When the denoiser is the gradient of a scalar potential, PnP minimises an explicit objective built from the induced implicit regulariser; otherwise, it remains a powerful empirical algorithm whose fixed points lack a clean variational interpretation.