Chapter Summary
Key Points
1. Conv-BN-ReLU is the building block. Use 3x3 kernels with padding=1 to preserve spatial dimensions, bias=False before BatchNorm, and Kaiming initialization. This block is the atom from which all CNN architectures are built (see the first sketch after this list).
2. Residual connections enable depth. Skip connections provide a gradient highway that prevents vanishing gradients, enabling training of networks with 100+ layers. The network learns the residual rather than the full mapping (sketched in code below).
3. U-Net combines multi-scale features. The encoder-decoder with skip connections preserves both high-level semantics (from deep layers) and fine spatial detail (from shallow layers), which makes it essential for pixel-level tasks like denoising and segmentation (see the toy U-Net below).
4. DnCNN/DRUNet demonstrate residual learning for denoising. Learning the noise rather than the clean image is easier and more efficient. DRUNet's noise level map enables a single model to handle all noise levels (see the residual denoiser sketch below).
5. CNNs exploit translation equivariance and locality. Weight sharing across spatial positions reduces parameters by orders of magnitude compared to fully connected networks, while the inductive bias matches the structure of image and grid data (the parameter count below makes this concrete).
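The following sketches illustrate the key points above; class names, channel counts, and depths are illustrative choices, not the chapter's exact architectures.

A minimal PyTorch sketch of the Conv-BN-ReLU block from point 1: a 3x3 convolution with padding=1 (so height and width are preserved), bias=False because BatchNorm supplies the shift, and Kaiming initialization for the weights. The name ConvBNReLU is an assumption for illustration.

```python
import torch
import torch.nn as nn

class ConvBNReLU(nn.Module):
    """3x3 conv (padding=1 keeps H x W), no bias before BatchNorm, Kaiming init."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        nn.init.kaiming_normal_(self.conv.weight, nonlinearity="relu")

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

x = torch.randn(1, 3, 32, 32)
print(ConvBNReLU(3, 16)(x).shape)  # torch.Size([1, 16, 32, 32]): spatial size preserved
```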
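A minimal residual block sketch for point 2, assuming equal input and output channels: the identity skip adds the input back to the branch output, so the convolutional branch only has to learn the residual and gradients flow directly through the addition.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = ReLU(x + F(x)); the branch F learns the residual, the skip carries gradients."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(ch),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))  # identity skip connection
```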
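A toy one-level U-Net sketch for point 3. Deep, downsampled features carry semantics; the concatenated skip connection reinjects full-resolution features so fine spatial detail reaches the decoder. A single encoder/decoder level and the channel counts here are simplifications.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # 3x3 conv + BN + ReLU, as in the Conv-BN-ReLU sketch above
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.enc = conv_block(1, ch)            # shallow features, full resolution
        self.down = nn.MaxPool2d(2)
        self.mid = conv_block(ch, 2 * ch)       # deeper, coarser semantics
        self.up = nn.ConvTranspose2d(2 * ch, ch, kernel_size=2, stride=2)
        self.dec = conv_block(2 * ch, ch)       # sees upsampled + skipped features
        self.head = nn.Conv2d(ch, 1, kernel_size=1)

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))  # concat skip connection
        return self.head(d)
```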
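A sketch of residual learning for denoising, per point 4: the network predicts the noise and the clean estimate is the input minus that prediction, and a noise-level map is stacked onto the input so one model can be conditioned on sigma, in the spirit of DRUNet. The plain convolutional backbone and the name ResidualDenoiser are placeholders, not the actual DnCNN/DRUNet architecture.

```python
import torch
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    """Predicts the noise residual; a noise-level map channel conditions the model."""
    def __init__(self, ch=32, depth=4):
        super().__init__()
        layers = [nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(inplace=True)]  # image + noise map
        for _ in range(depth):
            layers += [nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(ch, 1, 3, padding=1)]
        self.net = nn.Sequential(*layers)

    def forward(self, noisy, sigma):
        noise_map = torch.full_like(noisy, float(sigma))   # constant sigma channel
        predicted_noise = self.net(torch.cat([noisy, noise_map], dim=1))
        return noisy - predicted_noise                     # clean estimate = input - residual
```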
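A quick parameter count illustrating point 5, under assumed sizes: mapping a 64x64 single-channel image to 64 feature maps with a 3x3 convolution versus a fully connected layer producing the same number of outputs.

```python
# Weight counts only (biases omitted); sizes are illustrative.
conv_params = 3 * 3 * 1 * 64                       # 576 shared weights
fc_params = (64 * 64 * 1) * (64 * 64 * 64)         # 4096 inputs x 262144 outputs ~ 1.07e9
print(conv_params, fc_params, fc_params // conv_params)  # ratio ~ 1.9 million x
```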
Looking Ahead
Chapter 28 extends CNNs to complex-valued data for wireless applications. Chapter 30 introduces attention mechanisms that complement CNNs by capturing long-range dependencies.