DnCNN and DRUNet: Denoising Networks

Definition:

DnCNN: Denoising CNN

DnCNN learns the residual noise $\hat{n} = \text{DnCNN}(y)$ rather than the clean image directly:

$$\hat{x} = y - \hat{n}, \qquad L = \|\hat{n} - (y - x)\|^2$$

Architecture: Conv-ReLU + $(D-2)$ Conv-BN-ReLU blocks + Conv, all $3 \times 3$ kernels with 64 channels. Typically $D = 17$ layers.

```python
import torch.nn as nn

class DnCNN(nn.Module):
    def __init__(self, channels=1, num_layers=17, features=64):
        super().__init__()
        # First layer: Conv-ReLU (no BatchNorm)
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(True)]
        # Middle (num_layers - 2) layers: Conv-BN-ReLU
        for _ in range(num_layers - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1, bias=False),
                       nn.BatchNorm2d(features), nn.ReLU(True)]
        # Last layer: Conv, predicting the noise residual
        layers.append(nn.Conv2d(features, channels, 3, padding=1))
        self.net = nn.Sequential(*layers)

    def forward(self, y):
        return y - self.net(y)  # residual learning: x_hat = y - n_hat
```
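The residual loss above can be sketched as a single training step. For brevity, a tiny stand-in network replaces the full 17-layer DnCNN, and the synthetic patches and noise level are assumptions for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in residual predictor (a sketch, not the full DnCNN)
net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 1, 3, padding=1))  # outputs n_hat
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.rand(4, 1, 40, 40)           # clean patches (assumed data)
y = x + 0.1 * torch.randn_like(x)      # noisy input, sigma = 0.1

n_hat = net(y)                         # predict the noise, not the image
loss = F.mse_loss(n_hat, y - x)        # L = ||n_hat - (y - x)||^2
opt.zero_grad()
loss.backward()
opt.step()

x_hat = y - n_hat.detach()             # denoised estimate
```

Note that training on the residual target `y - x` is equivalent to training `x_hat` against `x`; the residual view simply makes explicit what the network layers must represent.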

Definition:

DRUNet: U-Net with Noise Level Map

DRUNet extends U-Net for blind denoising by concatenating a noise level map $\sigma \cdot \mathbf{1}$ to the input:

$$\hat{x} = \text{DRUNet}([\mathbf{y},\; \sigma \cdot \mathbf{1}])$$

This allows a single model to handle any noise level. Architecture: 4-level U-Net with residual blocks at each scale.
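The residual blocks at each scale can be sketched as follows. This is a simplified bias-free Conv-ReLU-Conv block; the class name and exact details are our own, not the reference implementation:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Bias-free Conv-ReLU-Conv block with an identity skip connection,
    in the style of the blocks used at each DRUNet scale (a sketch)."""
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
        )

    def forward(self, x):
        return x + self.body(x)  # skip connection preserves the input

block = ResBlock(64)
feat = torch.randn(1, 64, 32, 32)
out = block(feat)  # spatial size and channel count are preserved
```

Stacking such blocks at each of the four scales, with strided convolutions for downsampling and transposed convolutions for upsampling, gives the overall U-Net shape.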

```python
import torch

# Input: (B, C+1, H, W) -- the extra channel is a constant sigma map
B, C, H, W = 1, 3, 128, 128
sigma = 25.0 / 255.0                         # noise level for images in [0, 1]
noisy = torch.rand(B, C, H, W)               # placeholder noisy image
noise_map = torch.full((B, 1, H, W), sigma)  # broadcast sigma over the image
x_input = torch.cat([noisy, noise_map], dim=1)
denoised = drunet(x_input)                   # drunet: a trained DRUNet model
```

DRUNet is the backbone of many plug-and-play (PnP) algorithms for inverse problems (see Chapter 33).

Example: Why Residual Learning Works for Denoising

Explain why learning the noise $\hat{n}$ is easier than learning the clean image $\hat{x}$ directly.
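A quick numeric intuition (a sketch with synthetic data and an assumed $\sigma = 0.05$): at low noise levels, the residual $n = y - x$ has far less energy than the image, so the residual mapping is close to the zero map, whereas mapping $y \mapsto x$ is close to the identity, which a deep stack of nonlinear layers approximates less readily.

```python
import torch

x = torch.rand(1, 1, 64, 64)        # clean image with values in [0, 1]
n = 0.05 * torch.randn_like(x)      # Gaussian noise, sigma = 0.05 (assumption)
y = x + n                           # noisy observation

# The residual target n has a much smaller norm than the image target x,
# so the network only needs to model a small, zero-mean perturbation.
ratio = (n.norm() / x.norm()).item()
print(f"||n|| / ||x|| = {ratio:.3f}")
```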

CNN Denoiser Comparison

| Model | Architecture | Noise Handling | Parameters | Use Case |
|---|---|---|---|---|
| DnCNN | 17-layer plain CNN | Fixed $\sigma$ | ~556K | Simple known-noise denoising |
| DRUNet | U-Net + residual blocks | Any $\sigma$ (noise map input) | ~32M | PnP algorithms, blind denoising |
| FFDNet | Downsampled input + CNN | $\sigma$ map input | ~485K | Fast flexible denoising |

Historical Note: DnCNN: CNN Meets Image Denoising

2017

Zhang et al. (2017) showed that a simple CNN with residual learning and batch normalization could match or exceed BM3D, the hand-crafted method that had been state-of-the-art for a decade. The paper demonstrated that deep learning could compete with hand-crafted algorithms in low-level vision.