The U-Net Architecture: From Scratch

Definition:

U-Net Architecture

U-Net is an encoder-decoder with skip connections at each resolution:

  • Encoder: repeated Conv-BN-ReLU + MaxPool (downsample)
  • Bottleneck: processing at lowest resolution
  • Decoder: ConvTranspose2d (upsample) + concatenation with encoder features
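The Conv-BN-ReLU unit in the list above can be sketched as a small PyTorch module. The name `DoubleConv` matches the constructor used in this section's `UNet` class, but its body is not given here, so everything below is an assumption: two 3x3 convolutions with `padding=1` (so height and width are preserved), each followed by batch normalisation and ReLU.

```python
import torch
import torch.nn as nn


class DoubleConv(nn.Module):
    """Assumed definition: two Conv-BN-ReLU stages that keep spatial size."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            # bias is omitted because BatchNorm supplies its own shift term
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


x = torch.randn(2, 1, 32, 32)        # (batch, channels, height, width)
features = DoubleConv(1, 16)(x)      # channels change, resolution does not
pooled = nn.MaxPool2d(2)(features)   # the encoder's downsampling step
print(features.shape, pooled.shape)
```

With this choice of padding, only the pooling step changes the resolution, which keeps encoder and decoder feature maps the same size at each level as long as the input height and width are divisible by the total downsampling factor.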

$$\mathbf{d}_l = \text{ConvBlock}\bigl([\text{Up}(\mathbf{d}_{l+1}),\; \mathbf{e}_l]\bigr)$$

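where $\mathbf{e}_l$ and $\mathbf{d}_l$ are the encoder and decoder feature maps at level $l$, $\text{Up}$ is the upsampling step, and $[\cdot, \cdot]$ denotes channel-wise concatenation.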
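A single decoder step of this form can be traced with tensor shapes. This is a minimal sketch: the width `base_ch = 16`, the 64x64 and 32x32 feature-map sizes, and the single-convolution `conv_block` are illustrative assumptions, not values from the text.

```python
import torch
import torch.nn as nn

base_ch = 16  # illustrative channel width (assumption)

# e_l: encoder features at level l; d_next: decoder features one level deeper
e_l = torch.randn(1, base_ch, 64, 64)
d_next = torch.randn(1, base_ch * 2, 32, 32)

# Up(.): transposed convolution that doubles height and width
up = nn.ConvTranspose2d(base_ch * 2, base_ch, kernel_size=2, stride=2)

# ConvBlock(.): a single Conv-BN-ReLU stage here, to keep the sketch short
conv_block = nn.Sequential(
    nn.Conv2d(base_ch * 2, base_ch, kernel_size=3, padding=1),
    nn.BatchNorm2d(base_ch),
    nn.ReLU(inplace=True),
)

upsampled = up(d_next)                       # shape (1, 16, 64, 64)
merged = torch.cat([upsampled, e_l], dim=1)  # shape (1, 32, 64, 64)
d_l = conv_block(merged)                     # shape (1, 16, 64, 64)
print(upsampled.shape, merged.shape, d_l.shape)
```

Concatenation along `dim=1` doubles the channel count, which is why the block that follows it takes twice as many input channels as it produces.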

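import torch.nn as nn

# DoubleConv is not defined in this section; it is assumed to be a module
# that applies two Conv-BN-ReLU stages and preserves height and width.
class UNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=1, base_ch=64):
        super().__init__()
        # Encoder: channel width doubles at each level
        self.enc1 = DoubleConv(in_ch, base_ch)
        self.enc2 = DoubleConv(base_ch, base_ch*2)
        self.enc3 = DoubleConv(base_ch*2, base_ch*4)
        self.pool = nn.MaxPool2d(2)  # halves height and width
        # Bottleneck: processing at the lowest resolution
        self.bottleneck = DoubleConv(base_ch*4, base_ch*8)
        # Decoder: each up* halves the channels and doubles the resolution;
        # each dec* takes twice the channels because of the skip concatenation
        self.up3 = nn.ConvTranspose2d(base_ch*8, base_ch*4, 2, stride=2)
        self.dec3 = DoubleConv(base_ch*8, base_ch*4)
        self.up2 = nn.ConvTranspose2d(base_ch*4, base_ch*2, 2, stride=2)
        self.dec2 = DoubleConv(base_ch*4, base_ch*2)
        self.up1 = nn.ConvTranspose2d(base_ch*2, base_ch, 2, stride=2)
        self.dec1 = DoubleConv(base_ch*2, base_ch)
        # 1x1 convolution maps features to the requested output channels
        self.out_conv = nn.Conv2d(base_ch, out_ch, 1)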

The skip connections carry high-resolution spatial detail from the encoder directly to the decoder; without them, that detail would be lost during downsampling.

Example: U-Net Forward Pass

Implement the forward method showing the encoder-decoder data flow.
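One possible solution is sketched below. The constructor repeats the one from this section so the block runs on its own; the `DoubleConv` body is an assumption (two 3x3 Conv-BN-ReLU stages with `padding=1`), and the test size `base_ch=8` with a 64x64 input is chosen only to keep the run fast.

```python
import torch
import torch.nn as nn


class DoubleConv(nn.Module):
    """Assumed definition: two Conv-BN-ReLU stages that keep spatial size."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class UNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=1, base_ch=64):
        super().__init__()
        self.enc1 = DoubleConv(in_ch, base_ch)
        self.enc2 = DoubleConv(base_ch, base_ch * 2)
        self.enc3 = DoubleConv(base_ch * 2, base_ch * 4)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = DoubleConv(base_ch * 4, base_ch * 8)
        self.up3 = nn.ConvTranspose2d(base_ch * 8, base_ch * 4, 2, stride=2)
        self.dec3 = DoubleConv(base_ch * 8, base_ch * 4)
        self.up2 = nn.ConvTranspose2d(base_ch * 4, base_ch * 2, 2, stride=2)
        self.dec2 = DoubleConv(base_ch * 4, base_ch * 2)
        self.up1 = nn.ConvTranspose2d(base_ch * 2, base_ch, 2, stride=2)
        self.dec1 = DoubleConv(base_ch * 2, base_ch)
        self.out_conv = nn.Conv2d(base_ch, out_ch, 1)

    def forward(self, x):
        # Encoder: keep each level's output for the skip connections
        e1 = self.enc1(x)                 # full resolution
        e2 = self.enc2(self.pool(e1))     # 1/2 resolution
        e3 = self.enc3(self.pool(e2))     # 1/4 resolution
        b = self.bottleneck(self.pool(e3))  # 1/8 resolution

        # Decoder: upsample, concatenate the matching encoder output, convolve
        d3 = self.dec3(torch.cat([self.up3(b), e3], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.out_conv(d1)


model = UNet(in_ch=1, out_ch=1, base_ch=8)
x = torch.randn(2, 1, 64, 64)
y = model(x)
print(y.shape)
```

Because the network pools three times, this sketch assumes the input height and width are divisible by 8; other sizes would need padding or cropping before the concatenations.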

Example: U-Net for Image Denoising

Train a U-Net to denoise images where the input is $y = x + n$ and the target is the clean image $x$.
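A minimal training sketch for this setup follows. Several choices are assumptions made for illustration only: the one-level `TinyUNet` (a stand-in for the full model so the example runs quickly on CPU), uniform random tensors as "clean" images, additive Gaussian noise with standard deviation `sigma = 0.1`, mean-squared-error loss, and the Adam optimiser with 20 steps.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def conv_block(in_ch, out_ch):
    """One Conv-BN-ReLU stage that preserves height and width."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Hypothetical one-level U-Net, kept small for a quick demonstration."""

    def __init__(self, ch=8):
        super().__init__()
        self.enc = conv_block(1, ch)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(ch, ch * 2)
        self.up = nn.ConvTranspose2d(ch * 2, ch, 2, stride=2)
        self.dec = conv_block(ch * 2, ch)
        self.out_conv = nn.Conv2d(ch, 1, 1)

    def forward(self, y):
        e = self.enc(y)
        b = self.bottleneck(self.pool(e))
        d = self.dec(torch.cat([self.up(b), e], dim=1))
        return self.out_conv(d)


model = TinyUNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
sigma = 0.1  # assumed noise standard deviation

losses = []
for step in range(20):
    x = torch.rand(8, 1, 32, 32)          # stand-in clean images
    y = x + sigma * torch.randn_like(x)   # noisy input y = x + n
    x_hat = model(y)                      # network's estimate of x
    loss = loss_fn(x_hat, x)              # compare against the clean target
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

In practice the random tensors would be replaced by a real image dataset, and fresh noise would typically be drawn for every batch, as it is here, so the network does not memorise one particular noise realisation.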

U-Net Architecture Visualiser

Explore U-Net with different depths and base channel widths.
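The same exploration can be done in code by counting parameters for several depths and base widths. `FlexibleUNet`, `double_conv`, and `count_params` are hypothetical names introduced for this sketch; the layer layout follows the three-level model in this section, generalised to an arbitrary `depth`.

```python
import torch
import torch.nn as nn


def double_conv(in_ch, out_ch):
    """Assumed block: two Conv-BN-ReLU stages that keep spatial size."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class FlexibleUNet(nn.Module):
    """Hypothetical U-Net with a configurable number of encoder levels."""

    def __init__(self, in_ch=1, out_ch=1, base_ch=64, depth=3):
        super().__init__()
        widths = [base_ch * 2**i for i in range(depth)]  # e.g. 64, 128, 256
        self.encoders = nn.ModuleList()
        prev = in_ch
        for w in widths:
            self.encoders.append(double_conv(prev, w))
            prev = w
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = double_conv(prev, prev * 2)
        self.ups = nn.ModuleList()
        self.decoders = nn.ModuleList()
        for w in reversed(widths):
            self.ups.append(nn.ConvTranspose2d(w * 2, w, 2, stride=2))
            self.decoders.append(double_conv(w * 2, w))
        self.out_conv = nn.Conv2d(base_ch, out_ch, 1)

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)      # saved for the matching decoder level
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.ups, self.decoders, reversed(skips)):
            x = dec(torch.cat([up(x), skip], dim=1))
        return self.out_conv(x)


def count_params(model):
    return sum(p.numel() for p in model.parameters())


for depth in (2, 3, 4):
    for base_ch in (16, 32):
        model = FlexibleUNet(base_ch=base_ch, depth=depth)
        print(f"depth={depth} base_ch={base_ch} params={count_params(model):,}")
```

Convolution weights scale with the product of input and output channels, so doubling `base_ch` roughly quadruples the parameter count, and each extra level adds a block at twice the previous width. The input height and width must be divisible by `2**depth` for the skip concatenations to line up.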

Why This Matters: U-Net for Range-Doppler Processing

The 2D range-Doppler map from radar processing is structurally similar to an image. U-Net architectures have been applied to denoise and enhance range-Doppler maps, leveraging multi-scale features to separate targets from clutter.

Skip Connection

A direct path that bypasses one or more layers, either by addition (ResNet) or concatenation (U-Net).

Encoder-Decoder

Architecture that compresses input to a low-dimensional representation (encoder) then expands it back (decoder).