Chapter Summary
Key Points
1. Deep Image Prior (DIP) uses the CNN architecture as an implicit prior, reconstructing from a single measurement without training data. Spectral bias causes low-frequency signal to be learned before high-frequency noise; early stopping acts as regularisation. The Deep Decoder eliminates the need for early stopping via under-parameterisation.
2. Noise2Noise trains denoisers from noisy-noisy pairs (no clean data), converging to the MMSE estimator because the cross-term vanishes by independence. Noise2Void extends this to a single noisy image by exploiting pixel-independent noise, but fails for correlated noise (common after matched filtering in RF imaging).
3. SURE provides an unbiased MSE estimate without clean targets, using the denoiser's divergence as a correction term computed via a single Monte Carlo probe vector. SURE-trained denoisers match supervised quality for Gaussian noise. GSURE extends to inverse problems but is blind to the null space.
4. Equivariant imaging uses known signal symmetries (rotations, shifts, flips) as self-supervision, constraining the null space of the forward operator without ground truth. The equivariance loss creates "virtual measurements" that provide indirect observations of unmeasured subspace components.
5. Foundation models provide general-purpose priors that can be adapted to RF imaging via LoRA fine-tuning or operator conditioning (RAM). The domain gap between natural images and RF scenes requires careful adaptation; Caire's vision of a simulation-pretrained RF foundation model connects physics-based modelling with data-driven reconstruction.
6. Self-supervised methods form a spectrum of data requirements: DIP (one measurement), Noise2Void (one noisy image), SURE (noisy images with known noise), Noise2Noise (noisy pairs), EI (unpaired measurements + symmetries), foundation models (large pretraining + small adaptation). The choice depends on available data, noise statistics, and computational budget.
Looking Ahead
This chapter completes Part VI on learned reconstruction methods. The methods of Chapters 20--23 form a spectrum from fully supervised (Chapter 20) to fully unsupervised (this chapter), with increasing independence from training data but also increasing reliance on architectural priors and symmetry assumptions.
Part VII takes a fundamentally different perspective: rather than reconstructing images on a voxel grid, neural scene representations (NeRF, 3D Gaussian splatting, signed distance functions) model the 3D scene as a continuous function, enabling joint estimation of geometry, reflectivity, and material properties from RF measurements.