Chapter Summary

Key Points

  1. Deep Image Prior (DIP) uses the CNN architecture itself as an implicit prior, reconstructing from a single measurement without any training data. Spectral bias causes the low-frequency signal to be learned before the high-frequency noise, so early stopping acts as regularisation; a minimal training loop is sketched after this list. The Deep Decoder eliminates the need for early stopping via under-parameterisation.

  2. Noise2Noise trains denoisers from pairs of noisy observations of the same scene, with no clean data, and converges to the MMSE estimator because the noise cross-term in the expected loss vanishes by independence (a one-line derivation follows this list). Noise2Void extends this to a single noisy image by exploiting pixel-independent noise, but fails for correlated noise, which is common after matched filtering in RF imaging.

  3. SURE provides an unbiased estimate of the MSE without clean targets, using the denoiser's divergence as a correction term computed via a single Monte Carlo probe vector (see the sketch after this list). SURE-trained denoisers match supervised quality for Gaussian noise. GSURE extends SURE to general inverse problems but remains blind to the null space of the forward operator.

  4. Equivariant imaging (EI) uses known signal symmetries (rotations, shifts, flips) as self-supervision, constraining the null space of the forward operator without ground truth. The equivariance loss creates "virtual measurements" that provide indirect observations of otherwise unmeasured subspace components; the loss is sketched after this list.

  5. Foundation models provide general-purpose priors that can be adapted to RF imaging via LoRA fine-tuning (a minimal module appears after this list) or operator conditioning (RAM). The domain gap between natural images and RF scenes requires careful adaptation; Caire's vision of a simulation-pretrained RF foundation model connects physics-based modelling with data-driven reconstruction.

  6. Self-supervised methods form a spectrum of data requirements: DIP (one measurement), Noise2Void (one noisy image), SURE (noisy images with known noise statistics), Noise2Noise (noisy pairs), EI (unpaired measurements plus known symmetries), and foundation models (large-scale pretraining plus small-scale adaptation). The choice among them depends on available data, noise statistics, and computational budget.
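
To make key point 1 concrete, here is a minimal DIP loop. This is a sketch only, assuming PyTorch; net stands for any hypothetical encoder-decoder CNN whose output matches the image shape, and the iteration count plays the role of early stopping.

    # Minimal Deep Image Prior sketch (assumption: PyTorch; net is a
    # hypothetical encoder-decoder CNN whose architecture is the prior).
    import torch

    def dip_denoise(noisy, net, n_iters=1800, lr=0.01):
        """Fit net to a single noisy image from a fixed random input.

        Early stopping (n_iters) is the regulariser: spectral bias makes
        the network fit low-frequency signal before high-frequency noise.
        """
        z = torch.randn_like(noisy)     # fixed random input, never updated
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        for _ in range(n_iters):
            opt.zero_grad()
            loss = ((net(z) - noisy) ** 2).mean()
            loss.backward()
            opt.step()
        return net(z).detach()          # stop before the noise is fitted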
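
The cross-term argument behind key point 2 can be written out in one line. In notation assumed here (not necessarily the chapter's), let z be the noisy input, x the unknown clean signal, and y = x + n the noisy target, with n zero-mean and independent of z:

    \mathbb{E}\,\|f_\theta(z) - y\|^2
        = \mathbb{E}\,\|f_\theta(z) - x\|^2
          - 2\,\mathbb{E}\,\langle f_\theta(z) - x,\; n \rangle
          + \mathbb{E}\,\|n\|^2 .

Independence and \mathbb{E}[n] = 0 make the middle term vanish, and \mathbb{E}\,\|n\|^2 does not depend on \theta, so training against noisy targets has the same minimiser as training against clean targets: the MMSE estimator \mathbb{E}[x \mid z].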
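
Key point 3's divergence correction fits in a few lines. A sketch assuming PyTorch, i.i.d. Gaussian noise of known variance sigma2, and a denoiser f; the probe b and step eps implement the single-probe Monte Carlo divergence estimate.

    # Monte Carlo SURE loss sketch (assumption: PyTorch, i.i.d. Gaussian
    # noise with known variance sigma2; f is the denoiser network).
    import torch

    def sure_loss(f, y, sigma2, eps=1e-3):
        """Unbiased MSE estimate against the unseen clean image.

        The divergence of f at y is approximated with one random probe b:
            div f(y) ~ b^T (f(y + eps*b) - f(y)) / eps.
        """
        n = y.numel()
        fy = f(y)
        b = torch.randn_like(y)                  # Monte Carlo probe vector
        div = (b * (f(y + eps * b) - fy)).sum() / eps
        return ((fy - y) ** 2).sum() / n - sigma2 + (2 * sigma2 / n) * div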
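
The equivariance loss of key point 4, under the same assumptions: A is the known forward operator, f the reconstruction network, and T a random transform drawn from the symmetry group; all names are illustrative.

    # Equivariant imaging loss sketch (assumption: PyTorch; A, f, T as above).
    def ei_loss(f, A, T, y, alpha=1.0):
        x_hat = f(y)                  # reconstruction from the real measurement
        x_t = T(x_hat)                # transformed scene, same signal class
        y_virtual = A(x_t)            # "virtual measurement" of the transform
        consistency = ((A(x_hat) - y) ** 2).mean()   # match real measurement
        equivariance = ((f(y_virtual) - x_t) ** 2).mean()
        return consistency + alpha * equivariance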
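
Finally, a minimal LoRA module for the adaptation route in key point 5. A sketch assuming PyTorch; the frozen base layer stands in for one foundation-model weight, and rank controls how many parameters are actually tuned for the RF domain.

    # LoRA adaptation sketch (assumption: PyTorch). The pretrained weight is
    # frozen; only the low-rank factors A and B are trained, and the update
    # B @ A starts at zero, so adaptation is cheap on small RF datasets.
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank=8, scale=1.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False          # freeze foundation weights
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))
            self.scale = scale

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())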

Looking Ahead

This chapter completes Part VI on learned reconstruction methods. The methods of Chapters 20--23 form a spectrum from fully supervised (Chapter 20) to fully unsupervised (this chapter), with increasing independence from training data but also increasing reliance on architectural priors and symmetry assumptions.

Part VII takes a fundamentally different perspective: rather than reconstructing images on a voxel grid, neural scene representations (NeRF, 3D Gaussian splatting, signed distance functions) model the 3D scene as a continuous function, enabling joint estimation of geometry, reflectivity, and material properties from RF measurements.