Exercises
ex20-01-mf-decomposition
Easy. Given a sensing matrix $\mathbf{A} \in \mathbb{C}^{M \times N}$ and measurements $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}$, write down the matched-filter image $\hat{\mathbf{x}}_{\mathrm{MF}} = \mathbf{A}^H \mathbf{y}$ and show that it decomposes into a signal term and a noise term. Identify the Gram matrix and characterise the covariance of the noise term.
Apply $\mathbf{A}^H$ to both sides of $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}$.
Identify the Gram matrix $\mathbf{G} = \mathbf{A}^H\mathbf{A}$.
Compute $\operatorname{Cov}(\tilde{\mathbf{n}})$ where $\tilde{\mathbf{n}} = \mathbf{A}^H\mathbf{n}$.
Compute the matched-filter image
$$\hat{\mathbf{x}}_{\mathrm{MF}} = \mathbf{A}^H \mathbf{y} = \mathbf{A}^H\mathbf{A}\,\mathbf{x} + \mathbf{A}^H\mathbf{n} = \mathbf{G}\mathbf{x} + \tilde{\mathbf{n}}.$$
Characterise the noise covariance
For white noise $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I})$: $\operatorname{Cov}(\tilde{\mathbf{n}}) = \mathbb{E}[\mathbf{A}^H\mathbf{n}\mathbf{n}^H\mathbf{A}] = \sigma^2\,\mathbf{A}^H\mathbf{A} = \sigma^2\mathbf{G}$.
The back-projected noise is coloured with the same correlation structure as the PSF (Gram matrix). This is the key difficulty: noise and sidelobe artefacts share the same spatial pattern.
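The decomposition $\hat{\mathbf{x}}_{\mathrm{MF}} = \mathbf{G}\mathbf{x} + \mathbf{A}^H\mathbf{n}$ and the $\sigma^2\mathbf{G}$ noise colouring can be checked numerically; a minimal sketch with a random real-valued sensing matrix (all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, sigma = 64, 32, 0.5

A = rng.standard_normal((M, N)) / np.sqrt(M)   # random sensing matrix
G = A.T @ A                                    # Gram matrix G = A^T A
x = rng.standard_normal(N)

# One realisation: the MF image decomposes exactly as G x + A^T n
n = sigma * rng.standard_normal(M)
x_mf = A.T @ (A @ x + n)
assert np.allclose(x_mf, G @ x + A.T @ n)

# Monte-Carlo covariance of the back-projected noise: should approach sigma^2 G
T = 20000
noise = sigma * rng.standard_normal((T, M))
n_bp = noise @ A                               # each row is (A^T n)^T
cov_emp = n_bp.T @ n_bp / T
rel_err = np.linalg.norm(cov_emp - sigma**2 * G) / np.linalg.norm(sigma**2 * G)
print(f"relative covariance error: {rel_err:.3f}")
```

The empirical covariance of the back-projected noise matches $\sigma^2\mathbf{G}$ to within Monte-Carlo error, confirming that the noise inherits the PSF structure.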
ex20-02-dc-idempotent
Easy. Verify that the hard data-consistency layer $\mathrm{DC}(\mathbf{x}) = \mathbf{x} + \mathbf{A}^H(\mathbf{y} - \mathbf{A}\mathbf{x})$ is idempotent when $\mathbf{A}\mathbf{A}^H = \mathbf{I}$, i.e., $\mathrm{DC}(\mathrm{DC}(\mathbf{x})) = \mathrm{DC}(\mathbf{x})$.
Let $\mathbf{x}' = \mathrm{DC}(\mathbf{x})$ and show $\mathbf{A}\mathbf{x}' = \mathbf{y}$.
Then compute $\mathrm{DC}(\mathbf{x}')$ using the result above.
Show measurement consistency
Apply $\mathbf{A}$ to the DC output: $\mathbf{A}\,\mathrm{DC}(\mathbf{x}) = \mathbf{A}\mathbf{x} + \mathbf{A}\mathbf{A}^H(\mathbf{y} - \mathbf{A}\mathbf{x}) = \mathbf{A}\mathbf{x} + (\mathbf{y} - \mathbf{A}\mathbf{x}) = \mathbf{y}$.
Apply DC a second time
Let $\mathbf{x}' = \mathrm{DC}(\mathbf{x})$. We showed $\mathbf{A}\mathbf{x}' = \mathbf{y}$. Therefore: $\mathrm{DC}(\mathbf{x}') = \mathbf{x}' + \mathbf{A}^H(\mathbf{y} - \mathbf{A}\mathbf{x}') = \mathbf{x}' + \mathbf{A}^H\mathbf{0} = \mathbf{x}'$.
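Both steps can be verified numerically. A minimal sketch using rows of the unitary DFT, so that $\mathbf{A}\mathbf{A}^H = \mathbf{I}$ holds exactly (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 64, 24

# A = S F: M rows of the unitary DFT, so A A^H = I
F = np.fft.fft(np.eye(N), norm="ortho")
rows = rng.choice(N, size=M, replace=False)
A = F[rows]

x_true = rng.standard_normal(N)
y = A @ x_true                      # noiseless measurements

def dc(x):
    """Hard data-consistency layer: x + A^H (y - A x)."""
    return x + A.conj().T @ (y - A @ x)

x0 = rng.standard_normal(N).astype(complex)
x1 = dc(x0)
assert np.allclose(A @ x1, y)       # measurement consistency
assert np.allclose(dc(x1), x1)      # idempotency
```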
ex20-03-loss-estimator
Easy. A post-processing network $f_{\boldsymbol{\theta}}$ is trained with the MSE loss $\mathcal{L}(\boldsymbol{\theta}) = \mathbb{E}\big[(f_{\boldsymbol{\theta}}(y) - c)^2\big]$.
(a) State the optimal network $f^\star$ in the limit of infinite data and network capacity.
(b) For a 1D binary scene $c \in \{-1, +1\}$ with equal probability and measurement $y = c + n$ with $n \sim \mathcal{N}(0, \sigma^2)$, compute $f^\star(y)$.
MSE is minimised pointwise by the conditional mean.
Use Bayes' theorem: $P(c \mid y) \propto p(y \mid c)\,P(c)$.
Optimal MSE network is the posterior mean
For any fixed $y$, the MSE decomposes as $\mathbb{E}\big[(f(y) - c)^2 \mid y\big] = \big(f(y) - \mathbb{E}[c \mid y]\big)^2 + \operatorname{Var}(c \mid y)$. The second term is irreducible, so $f^\star(y) = \mathbb{E}[c \mid y]$.
Compute the posterior mean for binary scene
$f^\star(y) = \mathbb{E}[c \mid y] = P(c{=}1 \mid y) - P(c{=}{-}1 \mid y) = \tanh(y/\sigma^2)$. For $|y| \ll \sigma^2$, $f^\star(y) \approx 0$, demonstrating the blurring effect: the network outputs a value that never occurs in the true scene.
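The $\tanh(y/\sigma^2)$ form can be confirmed by Monte-Carlo estimation of the conditional mean (a sketch; bin edges and sample count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.0
T = 400_000

c = rng.choice([-1.0, 1.0], size=T)        # binary scene, equal probability
y = c + sigma * rng.standard_normal(T)     # measurement y = c + n

# Empirical E[c | y] in narrow bins of y, compared against tanh(y / sigma^2)
bins = np.linspace(-2, 2, 21)
centers = 0.5 * (bins[:-1] + bins[1:])
idx = np.digitize(y, bins) - 1             # values outside [-2, 2] fall outside 0..19
emp = np.array([c[idx == k].mean() for k in range(len(centers))])
max_dev = np.max(np.abs(emp - np.tanh(centers / sigma**2)))
print(f"max deviation from tanh: {max_dev:.3f}")
```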
ex20-04-gram-random
Medium. Let $\mathbf{A} \in \mathbb{R}^{M \times N}$ have i.i.d. entries $A_{mi} \sim \mathcal{N}(0, 1/M)$. Show that:
(a) $\mathbb{E}[G_{ii}] = 1$ for all $i$. (b) $\mathbb{E}[G_{ij}] = 0$ for $i \neq j$. (c) $\operatorname{Var}(G_{ij}) = 1/M$ for $i \neq j$.
Conclude that $\mathbf{G} \approx \mathbf{I}$ for large $M$ and state the implication for MF→U-Net performance.
$G_{ij} = \sum_{m=1}^{M} A_{mi} A_{mj}$. Use independence of the entries.
For $i \neq j$, each summand $A_{mi}A_{mj}$ has zero mean by independence.
Diagonal entries
$G_{ii} = \sum_{m} A_{mi}^2$. Each $A_{mi}^2$ has mean $\operatorname{Var}(A_{mi}) = 1/M$ for every $m$. By linearity, $\mathbb{E}[G_{ii}] = M \cdot (1/M) = 1$.
Off-diagonal mean
For $i \neq j$: $\mathbb{E}[G_{ij}] = \sum_{m} \mathbb{E}[A_{mi}A_{mj}]$. Since $A_{mi}$ and $A_{mj}$ are independent for $i \neq j$: $\mathbb{E}[A_{mi}A_{mj}] = \mathbb{E}[A_{mi}]\,\mathbb{E}[A_{mj}] = 0$. Hence $\mathbb{E}[G_{ij}] = 0$.
Off-diagonal variance
$\operatorname{Var}(G_{ij}) = \sum_{m} \operatorname{Var}(A_{mi}A_{mj}) = M \cdot \mathbb{E}[A_{mi}^2]\,\mathbb{E}[A_{mj}^2] = M \cdot (1/M)^2 = 1/M$.
As $M \to \infty$: off-diagonal entries $\to 0$ in probability, so $\mathbf{G} \to \mathbf{I}$. The MF image satisfies $\hat{\mathbf{x}}_{\mathrm{MF}} = \mathbf{G}\mathbf{x} + \tilde{\mathbf{n}} \approx \mathbf{x} + \tilde{\mathbf{n}}$: the U-Net task reduces to simple denoising.
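These concentration claims are easy to confirm by simulation (a sketch; `gram_stats` is an ad-hoc helper):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 50

def gram_stats(M):
    """Mean diagonal and off-diagonal variance of G = A^T A for A ~ N(0, 1/M)."""
    A = rng.standard_normal((M, N)) / np.sqrt(M)   # i.i.d. entries ~ N(0, 1/M)
    G = A.T @ A
    mask = ~np.eye(N, dtype=bool)
    return np.diag(G).mean(), G[mask].var()

for M in (100, 1000, 10000):
    d, v = gram_stats(M)
    print(f"M={M:5d}  mean diag={d:.3f}  off-diag var * M={v * M:.3f}")
```

The mean diagonal stays at 1 while the off-diagonal variance scales as $1/M$, exactly as derived.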
ex20-05-dc-mri
Medium. In MRI, the sensing operator is $\mathbf{A} = \mathbf{S}\mathbf{F}$ where $\mathbf{F}$ is the DFT matrix and $\mathbf{S} \in \{0,1\}^{M \times N}$ selects $M$ rows.
(a) Show that $\mathbf{A}\mathbf{A}^H = \mathbf{I}_M$ (orthonormal rows). (b) Write the explicit form of the hard DC layer for this operator. (c) Interpret the DC layer as a k-space replacement operation.
Use the fact that $\mathbf{F}\mathbf{F}^H = \mathbf{I}$ for the DFT matrix, with appropriate normalisation.
The DC layer replaces acquired k-space samples while preserving the network prediction at unacquired locations.
Verify orthonormal rows
Normalise so that $\mathbf{F}\mathbf{F}^H = \mathbf{I}$ (unitary DFT). Then: $\mathbf{A}\mathbf{A}^H = \mathbf{S}\mathbf{F}\mathbf{F}^H\mathbf{S}^H = \mathbf{S}\mathbf{S}^H = \mathbf{I}_M$, since each acquired row is selected exactly once.
Write the DC layer
$$\mathrm{DC}(\mathbf{x}) = \mathbf{x} + \mathbf{A}^H(\mathbf{y} - \mathbf{A}\mathbf{x}) = \mathbf{x} + \mathbf{F}^H\mathbf{S}^H(\mathbf{y} - \mathbf{S}\mathbf{F}\mathbf{x}).$$
k-space interpretation
Let $\hat{\mathbf{k}} = \mathbf{F}\mathbf{x}$ (DFT of the network estimate). The DC layer sets $[\mathbf{F}\,\mathrm{DC}(\mathbf{x})]_m = y_m$ at acquired locations and $\hat{k}_m$ elsewhere. Acquired k-space samples are replaced by the measurements; unacquired locations are filled by the network. The inverse DFT then gives the final image.
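The operator form and the replacement interpretation coincide, as a small FFT experiment confirms (names illustrative; the index set `acquired` plays the role of $\mathbf{S}$):

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 64, 20
acquired = np.sort(rng.choice(N, size=M, replace=False))   # sampled k-space rows

x_net = rng.standard_normal(N).astype(complex)             # network estimate (stand-in)
y = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # acquired k-space data

k = np.fft.fft(x_net, norm="ortho")                        # unitary DFT of the estimate

# Operator form: DC(x) = x + F^H S^H (y - S F x)
r = np.zeros(N, dtype=complex)
r[acquired] = y - k[acquired]
dc_operator = x_net + np.fft.ifft(r, norm="ortho")

# Replacement form: overwrite acquired k-space samples, keep the rest
k_replaced = k.copy()
k_replaced[acquired] = y
dc_replace = np.fft.ifft(k_replaced, norm="ortho")

assert np.allclose(dc_operator, dc_replace)
```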
ex20-06-modl-cg
Medium. The MoDL data-consistency step solves $(\mathbf{A}^H\mathbf{A} + \lambda\mathbf{I})\,\mathbf{x}_{k+1} = \mathbf{A}^H\mathbf{y} + \lambda\mathbf{z}_k$.
(a) Show that this is the solution to the regularised least-squares problem $\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 + \lambda\|\mathbf{x} - \mathbf{z}_k\|^2$.
(b) For $\lambda \to 0$, what does the solution approach? For $\lambda \to \infty$, what does it approach?
Differentiate the objective with respect to $\mathbf{x}$ and set the gradient to zero.
For $\lambda \to 0$: data fidelity dominates. For $\lambda \to \infty$: CNN output dominates.
Derive the normal equations
The objective is $J(\mathbf{x}) = \|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 + \lambda\|\mathbf{x} - \mathbf{z}_k\|^2$. Taking the gradient and setting it to zero: $-2\mathbf{A}^H(\mathbf{y} - \mathbf{A}\mathbf{x}) + 2\lambda(\mathbf{x} - \mathbf{z}_k) = \mathbf{0}$. Solving: $\mathbf{x}_{k+1} = (\mathbf{A}^H\mathbf{A} + \lambda\mathbf{I})^{-1}(\mathbf{A}^H\mathbf{y} + \lambda\mathbf{z}_k)$.
Limiting cases
$\lambda \to 0$: $\mathbf{x}_{k+1} \to (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H\mathbf{y} = \mathbf{A}^+\mathbf{y}$ when $\mathbf{A}^H\mathbf{A}$ is invertible (pseudoinverse: pure data fidelity, ignores the denoiser).
$\lambda \to \infty$: divide numerator and denominator by $\lambda$ to see $\mathbf{x}_{k+1} \to \mathbf{z}_k$ (pure denoiser output, ignores the measurements).
The weight $\lambda$ controls the data-vs-prior balance.
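Both limits can be checked by solving the normal equations directly (a minimal sketch; the denoiser output `z` is a random stand-in):

```python
import numpy as np

rng = np.random.default_rng(5)
M, N = 40, 20
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = rng.standard_normal(M)
z = rng.standard_normal(N)                    # denoiser output z_k (stand-in)

def modl_dc(lam):
    """Solve (A^T A + lam*I) x = A^T y + lam*z."""
    return np.linalg.solve(A.T @ A + lam * np.eye(N), A.T @ y + lam * z)

print(np.linalg.norm(modl_dc(1e-8) - np.linalg.pinv(A) @ y))  # ~0: pseudoinverse limit
print(np.linalg.norm(modl_dc(1e8) - z))                        # ~0: denoiser limit
```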
ex20-07-perceptual-pseudo-metric
Medium. Show that the perceptual loss $\mathcal{L}_{\mathrm{perc}}(\mathbf{x}_1, \mathbf{x}_2) = \|\phi(\mathbf{x}_1) - \phi(\mathbf{x}_2)\|^2$, with $\phi$ a VGG feature extractor, is not a true metric on the image space by providing a counterexample where $\mathcal{L}_{\mathrm{perc}}(\mathbf{x}_1, \mathbf{x}_2) = 0$ but $\mathbf{x}_1 \neq \mathbf{x}_2$. Explain the practical implication for RF imaging.
Consider the null space of the VGG feature extractor $\phi$.
Deep networks are not injective: different inputs can produce identical features.
Identify the null space
The VGG feature extractor $\phi$ maps images to feature maps via convolutions and ReLU activations. This mapping is not injective. Add a high-frequency perturbation $\epsilon\mathbf{v}$ with spatial frequency above VGG's sensitivity (e.g., an alternating checkerboard). For small $\epsilon$, the pooling layers in VGG average out this pattern: $\phi(\mathbf{x} + \epsilon\mathbf{v}) \approx \phi(\mathbf{x})$ at all selected layers.
Counterexample
Set $\mathbf{x}_2 = \mathbf{x}_1 + \epsilon\mathbf{v}$, where $\mathbf{v}$ is a high-frequency checkerboard pattern invisible to VGG. Then $\mathcal{L}_{\mathrm{perc}}(\mathbf{x}_1, \mathbf{x}_2) \approx 0$ but $\mathbf{x}_1 \neq \mathbf{x}_2$. The perceptual loss is a pseudo-metric (it violates the identity of indiscernibles).
Implication for RF imaging
This means perceptual loss alone cannot guarantee pixel-level fidelity. High-frequency target signatures (e.g., the precise location of a point reflector to sub-pixel accuracy) may be corrupted without any perceptual-loss penalty. In RF imaging, always combine perceptual loss with a pixel-wise term (MSE or $\ell_1$).
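The argument can be made exact with a toy extractor whose pooling is explicit; here $\phi$ is 2×2 average pooling, standing in for VGG's pooling stages (illustrative only):

```python
import numpy as np

def phi(x):
    """Toy 'perceptual' feature extractor: 2x2 average pooling."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

rng = np.random.default_rng(6)
x1 = rng.standard_normal((8, 8))

# High-frequency checkerboard: zero mean inside every 2x2 pooling window
v = np.indices((8, 8)).sum(axis=0) % 2 * 2.0 - 1.0
x2 = x1 + 0.5 * v                             # visibly different image

perc = np.sum((phi(x1) - phi(x2)) ** 2)       # 'perceptual' loss: ~0
mse = np.mean((x1 - x2) ** 2)                 # pixel loss: clearly nonzero
print(perc, mse)
```

The checkerboard lies in the null space of the pooling, so the perceptual loss vanishes while the pixel MSE does not.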
ex20-08-transfer-bound
Medium. A U-Net $f$ is trained on data from sensing matrix $\mathbf{A}_1$ with Gram matrix $\mathbf{G}_1$. At deployment, the sensing matrix is $\mathbf{A}_2$ with Gram matrix $\mathbf{G}_2$. Derive an upper bound on the deployment reconstruction error $\|f(\hat{\mathbf{x}}_2) - \mathbf{x}\|$ in terms of the training error and $\|\mathbf{G}_1 - \mathbf{G}_2\|$. Assume $f$ is Lipschitz with constant $L_f$.
The MF image from $\mathbf{A}_2$ is $\hat{\mathbf{x}}_2 = \mathbf{G}_2\mathbf{x} + \mathbf{A}_2^H\mathbf{n}$.
Use the triangle inequality to split into transfer error + training error.
Apply the Lipschitz property to bound the transfer error.
Split the error
$$\|f(\hat{\mathbf{x}}_2) - \mathbf{x}\| \le \underbrace{\|f(\hat{\mathbf{x}}_2) - f(\hat{\mathbf{x}}_1)\|}_{\text{transfer error}} + \underbrace{\|f(\hat{\mathbf{x}}_1) - \mathbf{x}\|}_{\text{training error}}.$$
Bound the transfer error
By Lipschitz continuity of $f$: $\|f(\hat{\mathbf{x}}_2) - f(\hat{\mathbf{x}}_1)\| \le L_f\,\|\hat{\mathbf{x}}_2 - \hat{\mathbf{x}}_1\| = L_f\,\|(\mathbf{G}_2 - \mathbf{G}_1)\mathbf{x} + (\mathbf{A}_2 - \mathbf{A}_1)^H\mathbf{n}\|$.
Final bound
$$\|f(\hat{\mathbf{x}}_2) - \mathbf{x}\| \le L_f\big(\|\mathbf{G}_1 - \mathbf{G}_2\|\,\|\mathbf{x}\| + \|\mathbf{A}_1 - \mathbf{A}_2\|\,\|\mathbf{n}\|\big) + \varepsilon_{\mathrm{train}}.$$ The deployment error grows linearly with $\|\mathbf{G}_1 - \mathbf{G}_2\|$: a network trained where $\mathbf{G} \approx \mathbf{I}$ carries no guarantee on a deployment operator with a structured Gram matrix. $\square$
ex20-09-sidelobe-correlation
Hard. Consider a scene with a single strong reflector at position $i$: $\mathbf{x} = \alpha\,\mathbf{e}_i$ with amplitude $\alpha$. The matched-filter image is $\hat{\mathbf{x}}_{\mathrm{MF}} = \alpha\,\mathbf{G}\mathbf{e}_i + \mathbf{A}^H\mathbf{n}$.
(a) Write the expression for the sidelobe at pixel $j \neq i$ in $\hat{\mathbf{x}}_{\mathrm{MF}}$.
(b) Show that the covariance between the back-projected noise at pixel $j$ and at pixel $i$ is $\sigma^2 G_{ji}$, i.e., the noise correlation follows the same pattern as the sidelobes.
(c) Explain why this correlation makes it impossible for the U-Net to distinguish sidelobes from real features at pixel $j$ using $\hat{\mathbf{x}}_{\mathrm{MF}}$ alone.
The sidelobe at pixel $j$ is $\alpha G_{ji}$. The back-projected noise at pixel $j$ is $\tilde{n}_j = \mathbf{a}_j^H\mathbf{n}$, where $\mathbf{a}_j$ is the $j$-th column of $\mathbf{A}$.
Compute $\mathbb{E}[\tilde{n}_j \tilde{n}_i^*]$ using $\mathbb{E}[\mathbf{n}\mathbf{n}^H] = \sigma^2\mathbf{I}$.
A classifier using only $[\hat{\mathbf{x}}_{\mathrm{MF}}]_j$ cannot determine whether the signal at $j$ is a sidelobe or a real target.
Sidelobe at pixel j
The pixel value is $[\hat{\mathbf{x}}_{\mathrm{MF}}]_j = \alpha G_{ji} + \tilde{n}_j$; the sidelobe term $\alpha G_{ji}$ exists even for $\mathbf{n} = \mathbf{0}$.
Covariance between sidelobe and noise
The back-projected noise has covariance $\operatorname{Cov}(\mathbf{A}^H\mathbf{n}) = \sigma^2\mathbf{G}$. The signal at $j$ due to the point target has energy $\alpha^2 G_{ji}^2$. The covariance between the noise at pixel $j$ and the noise at pixel $i$ is $\mathbb{E}[\tilde{n}_j\tilde{n}_i^*] = \sigma^2\,\mathbf{a}_j^H\mathbf{a}_i = \sigma^2 G_{ji}$, which is nonzero whenever $G_{ji} \neq 0$: exactly where the sidelobe $\alpha G_{ji}$ is nonzero.
More precisely, the noise at pixel $j$ is $\tilde{n}_j = \mathbf{a}_j^H\mathbf{n}$. The sidelobe contribution from target $i$ to pixel $j$ is $\alpha G_{ji} = \alpha\,\mathbf{a}_j^H\mathbf{a}_i$. Both are inner products against the same column $\mathbf{a}_j$, so the sidelobe pattern and the noise correlation pattern share one spatial signature.
Why the U-Net cannot distinguish them
Given only $\hat{\mathbf{x}}_{\mathrm{MF}}$, the U-Net observes the sum $\alpha G_{ji} + \tilde{n}_j$ at pixel $j$. These two terms share the same spatial structure because both involve $\mathbf{a}_j$. A feature at pixel $j$ could be: (a) a real target with its own amplitude, or (b) a sidelobe of the strong target at $i$, or correlated noise that merely looks like a target. Without additional information about the sensing geometry and the true scene at $i$, the U-Net cannot differentiate these cases from $\hat{\mathbf{x}}_{\mathrm{MF}}$ alone.
This is the sidelobe corruption problem: structured sensing operators create correlated noise and signal components that are statistically inseparable from a single MF image.
ex20-10-modl-convergence
Hard. Consider the MoDL iteration with a fixed denoiser $D = \operatorname{prox}_{\mu R}$ for a convex regulariser $R$. Show that the MoDL iteration:
$\mathbf{x}_{k+1} = (\mathbf{A}^H\mathbf{A} + \lambda\mathbf{I})^{-1}\big(\mathbf{A}^H\mathbf{y} + \lambda D(\mathbf{x}_k)\big)$
is equivalent to a proximal gradient step, and give conditions under which the iteration converges to the minimiser of the composite objective $\|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 + \lambda\mu\,R(\mathbf{x})$.
The CG solve is the proximal operator of the (scaled) data-fidelity term $f(\mathbf{x}) = \tfrac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2$.
Rewrite the iteration as a proximal-proximal splitting step.
Convergence requires $R$ to be convex and the step size to be chosen appropriately.
Identify the proximal operator
The CG step solves $\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 + \lambda\|\mathbf{x} - \mathbf{z}_k\|^2$ where $\mathbf{z}_k = D(\mathbf{x}_k)$.
This is equivalent to: $\mathbf{x}_{k+1} = \operatorname{prox}_{f/\lambda}\big(\operatorname{prox}_{\mu R}(\mathbf{x}_k)\big)$ with $f(\mathbf{x}) = \tfrac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2$, since $\operatorname{prox}_{f/\lambda}(\mathbf{z}) = \arg\min_{\mathbf{x}} \tfrac{1}{\lambda}f(\mathbf{x}) + \tfrac{1}{2}\|\mathbf{x} - \mathbf{z}\|^2 = \arg\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|^2 + \lambda\|\mathbf{x} - \mathbf{z}\|^2$.
Convergence conditions
This is a proximal-proximal splitting (alternating proximal algorithm). Convergence to the minimiser of the composite objective holds when:
- $R$ is convex and lower semicontinuous, so that $\operatorname{prox}_{\mu R}$ is well defined and firmly nonexpansive.
- The regularisation weight $\lambda$ satisfies $\lambda \ge L$ where $L = \|\mathbf{A}\|^2$ (spectral norm squared), so the implicit data step is not too aggressive.
- The step size $\mu$ is bounded above by twice the smallest nonzero eigenvalue of $\mathbf{A}^H\mathbf{A}$ in the relevant subspace.
ex20-11-unet-receptive-field
Hard. A U-Net for post-processing has $L$ encoder levels, each performing $2\times$ downsampling followed by two $3\times 3$ convolutions. Derive the receptive-field diameter $r_L$. For $L = 4$, what is the maximum sidelobe range that can be suppressed by this network, and what does this imply for long-range sidelobes in SAR?
At level $\ell$, one pixel covers $2^\ell$ pixels at the original resolution.
Each $3\times 3$ convolution at level $\ell$ adds $2 \cdot 2^\ell$ pixels to the receptive field.
Solve the recursion with $r_0 = 5$.
Base case ($\ell = 0$)
At the input resolution, two $3\times 3$ convolutions give $r_0 = 1 + 2 + 2 = 5$ pixels.
Recursive formula
At level $\ell$ (after $\ell$ downsampling steps), each pixel covers $2^\ell$ pixels at the original scale. Two $3\times 3$ convolutions at level $\ell$ therefore add $4 \cdot 2^\ell$ pixels to the receptive field: $r_\ell = r_{\ell-1} + 4 \cdot 2^\ell$.
Solve the recursion
Unrolling: $r_L = 5 + 4\sum_{\ell=1}^{L} 2^\ell = 5 + 4(2^{L+1} - 2) = 2^{L+3} - 3$. For $L = 4$: $r_4 = 4 \cdot 32 - 3 = 125$ pixels. Sidelobes further than about $r_4/2 \approx 62$ pixels from their mainlobe lie outside the receptive field and cannot be suppressed. In SAR, sidelobes can extend across the entire image, so a finite receptive field alone cannot remove them. $\square$
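The recursion is easy to tabulate and check against the closed form (a small sketch):

```python
def receptive_field(L):
    """Receptive field of two 3x3 convs per level with 2x downsampling between levels."""
    r = 5                      # base case: two 3x3 convs at input resolution
    for level in range(1, L + 1):
        r += 4 * 2**level      # two 3x3 convs operating at stride 2**level
    return r

for L in range(5):
    print(L, receptive_field(L))   # matches the closed form 2**(L+3) - 3
```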
ex20-12-physics-channel-benefit
Hard. Consider a linear Gaussian model: $\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ and $\hat{\mathbf{x}}_{\mathrm{MF}} = \mathbf{G}\mathbf{x} + \tilde{\mathbf{n}}$ with $\tilde{\mathbf{n}} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{G})$.
(a) Compute the MMSE estimator and its MSE when only $\hat{\mathbf{x}}_{\mathrm{MF}}$ is known.
(b) Compute the MMSE estimator and its MSE when both $\hat{\mathbf{x}}_{\mathrm{MF}}$ and $\mathbf{G}$ are given.
(c) Show that the informed estimator achieves lower MSE whenever $\mathbf{G} \neq c\,\mathbf{I}$ for any scalar $c$.
For the Gaussian linear model, the MMSE estimator is the Wiener filter.
The conditional covariance given $\mathbf{G}$ uses the Gram-matrix structure.
When the diagonal of $\mathbf{G}$ varies spatially (non-constant diagonal), conditioning on it provides per-pixel SNR information.
MMSE without PSF knowledge
Without knowledge of $\mathbf{G}$, the estimator must assume an average PSF, $\mathbf{G} \approx \mathbf{I}$, giving the same scalar Wiener weight at every pixel: $\hat{x}_i = \hat{x}_{\mathrm{MF},i}/(1 + \sigma^2)$.
MSE is $N\sigma^2/(1 + \sigma^2)$ when $\mathbf{G} = \mathbf{I}$ holds exactly; under mismatch the uniform weight incurs additional error.
MMSE with PSF knowledge
When $\mathbf{G}$ is provided, the per-pixel effective SNR $g_i/\sigma^2$ is known. For a diagonal $\mathbf{G} = \operatorname{diag}(g_1, \ldots, g_N)$ (shift-invariant approximation), the informed Wiener filter decomposes per pixel: $\hat{x}_i = \dfrac{g_i}{g_i^2 + \sigma^2 g_i}\,\hat{x}_{\mathrm{MF},i} = \dfrac{\hat{x}_{\mathrm{MF},i}}{g_i + \sigma^2}$. MSE is $\sum_i \dfrac{\sigma^2}{g_i + \sigma^2}$.
Comparison
For the blind estimator, the Wiener weights are the same for all pixels (assuming a uniform gain). For the informed estimator, pixels with large $g_i$ (high SNR) are relied upon more; pixels with small $g_i$ (low SNR, i.e., the dark side of the beam) are trusted less. When $\mathbf{G}$ has a non-constant diagonal (physically structured operators), the informed estimator achieves strictly lower MSE.
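A per-pixel simulation with a diagonal $\mathbf{G}$ makes the gap concrete (a sketch; the gain range is illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
N, sigma, T = 200, 0.5, 20000

g = rng.uniform(0.2, 2.0, size=N)            # non-constant diagonal of G (per-pixel gain)

x = rng.standard_normal((T, N))              # x ~ N(0, I)
n = np.sqrt(sigma**2 * g) * rng.standard_normal((T, N))
x_mf = g * x + n                             # per-pixel MF model with noise var sigma^2 g_i

blind = x_mf / (1 + sigma**2)                # uniform weight: assumes G = I
informed = x_mf / (g + sigma**2)             # per-pixel Wiener weight using g_i

mse_blind = np.mean((blind - x) ** 2)
mse_informed = np.mean((informed - x) ** 2)
print(f"blind: {mse_blind:.3f}  informed: {mse_informed:.3f}")
```

The informed MSE matches the theoretical $\frac{1}{N}\sum_i \sigma^2/(g_i + \sigma^2)$ and is strictly below the blind estimator's error.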
ex20-13-modl-optimal-lambda
Challenge. In MoDL with a linear denoiser $D(\mathbf{x}) = \mathbf{W}\mathbf{x}$ (e.g., a linear Wiener denoiser), find the optimal step size $\lambda^\star$ that minimises the one-step reconstruction MSE $\mathbb{E}\,\|\mathbf{x}_1 - \mathbf{x}\|^2$,
where $\mathbf{x}_0 = \mathbf{x} + \mathbf{e}$ is a noisy initial estimate with $\mathbf{e} \sim \mathcal{N}(\mathbf{0}, \sigma_e^2\mathbf{I})$. Express $\lambda^\star$ in terms of the eigenvalues of $\mathbf{G}$ and $\mathbf{W}$.
Work in the eigenbasis of $\mathbf{G}$.
For each eigenvalue $\gamma_i$ of $\mathbf{G}$, the one-step MoDL update is scalar.
Minimise the per-eigenmode MSE to find the optimal $\lambda_i^\star$, then argue for a single global $\lambda^\star$.
Diagonalise in the eigenbasis of G
Let $\mathbf{G} = \mathbf{U}\boldsymbol{\Gamma}\mathbf{U}^H$ with eigenvalues $\gamma_i$, and assume $\mathbf{W}$ diagonalises in the same basis. In this basis, the MoDL update is scalar per mode $i$: $\tilde{x}_{1,i} = \dfrac{\gamma_i \tilde{x}_i + \tilde{n}_i + \lambda w_i(\tilde{x}_i + \tilde{e}_i)}{\gamma_i + \lambda}$, where $\tilde{x}_i = [\mathbf{U}^H\mathbf{x}]_i$, $w_i$ is the $i$-th diagonal of $\mathbf{U}^H\mathbf{W}\mathbf{U}$, and $\tilde{n}_i, \tilde{e}_i$ are noise components with variances $\sigma^2\gamma_i$ and $\sigma_e^2$.
Per-mode MSE
The MSE for mode $i$ is: $\mathrm{MSE}_i(\lambda) = \dfrac{\lambda^2 (w_i - 1)^2\,|\tilde{x}_i|^2 + \sigma^2\gamma_i + \lambda^2 w_i^2 \sigma_e^2}{(\gamma_i + \lambda)^2}$.
Optimal lambda
Differentiating with respect to $\lambda$ and setting the derivative to zero gives a mode-specific optimal $\lambda_i^\star$. For a single global $\lambda^\star$, minimise the total MSE $\sum_i \mathrm{MSE}_i(\lambda)$: a scalar optimisation problem that depends on the spectrum $\{\gamma_i\}$ of $\mathbf{G}$, the denoiser gains $\{w_i\}$, and the noise levels $\sigma^2$ and $\sigma_e^2$.
The result confirms that learning $\lambda$ is strictly better than fixing it: the optimal value depends on the spectral content of the current estimate, which changes across MoDL iterations.
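The global $\lambda^\star$ can be found by a one-dimensional search. A sketch assuming the per-mode error model $\mathrm{MSE}_i(\lambda) = \big[\lambda^2(w_i-1)^2|\tilde{x}_i|^2 + \sigma^2\gamma_i + \lambda^2 w_i^2\sigma_e^2\big]/(\gamma_i+\lambda)^2$, with an assumed random spectrum and constant denoiser gains (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
N, sigma, sigma_e = 64, 0.3, 0.5

gamma = rng.uniform(0.1, 2.0, size=N)   # eigenvalues of G (assumed spectrum)
w = np.full(N, 0.8)                     # denoiser gains w_i (assumed)
x2 = np.ones(N)                         # per-mode signal power E|x_i|^2

def total_mse(lam):
    num = lam**2 * (w - 1) ** 2 * x2 + sigma**2 * gamma + lam**2 * w**2 * sigma_e**2
    return float(np.sum(num / (gamma + lam) ** 2))

lams = np.linspace(1e-3, 5.0, 2000)
mses = [total_mse(l) for l in lams]
lam_star = lams[int(np.argmin(mses))]
print(f"lambda* = {lam_star:.3f}, total MSE = {min(mses):.3f}")
```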
ex20-14-gan-hallucination
Challenge. Construct a formal example showing that a GAN-trained reconstruction network can produce a hallucinated target. Consider a two-pixel scene $\mathbf{x} \in \{(1, 0), (0, 1)\}$ with equal probability and a single measurement $y = x_1 + x_2 + n$ with $n \sim \mathcal{N}(0, \sigma^2)$.
(a) Show that the measurement provides no information about which pixel is active. (b) Compare the MSE-trained and GAN-trained network outputs. (c) Explain which is more dangerous for radar target detection and why.
Both scenes produce the same measurement distribution: $y \sim \mathcal{N}(1, \sigma^2)$.
MSE estimator: posterior mean over both modes. GAN: samples one mode at random.
Consider the false-alarm rate of each estimator.
Measurement is uninformative
For both $\mathbf{x} = (1, 0)$ and $\mathbf{x} = (0, 1)$: $y = 1 + n$. The likelihood $p(y \mid \mathbf{x})$ is identical for both scenes. By Bayes' theorem, $P(\mathbf{x} \mid y) = 1/2$ for each.
MSE vs. GAN outputs
- MSE: $f^\star(y) = \mathbb{E}[\mathbf{x} \mid y] = (0.5,\ 0.5)$.
- GAN: samples from $P(\mathbf{x} \mid y)$, outputting $(1, 0)$ or $(0, 1)$ with probability $1/2$ each.
Safety analysis for radar
The GAN output always places a target at exactly one pixel, but it picks the wrong pixel 50% of the time. This constitutes a false positive in target localisation: not a false detection overall, but a false position assignment that could cause a tracker to follow a hallucinated trajectory.
The MSE output is less decisive: a detection threshold above $0.5$ would declare no target at either pixel (the conservative call given the ambiguous data), whereas the GAN would always trigger a detection at one of the two locations (50% wrong).
For radar systems operating under the Neyman-Pearson framework, the MSE estimator with threshold control is safer than the GAN estimator, whose false-alarm rate is uncontrolled.
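A direct simulation of the two estimators makes the asymmetry concrete (a sketch; the threshold value is illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
T = 100_000

# Scene: exactly one of two pixels is active (amplitude 1), with equal probability.
active = rng.integers(0, 2, size=T)

# The measurement y = x1 + x2 + n = 1 + n is identical for both scenes,
# so the posterior over scenes is (1/2, 1/2) regardless of y.

# MSE estimator: outputs the posterior mean (0.5, 0.5) at every pixel.
threshold = 0.9                               # detection threshold (illustrative)
mse_fires = 0.5 >= threshold                  # never triggers a detection

# GAN-style estimator: samples one mode, a sharp target at a random pixel.
gan_pick = rng.integers(0, 2, size=T)
gan_wrong_rate = np.mean(gan_pick != active)  # hallucinated-position rate

print(mse_fires, round(float(gan_wrong_rate), 3))
```

The GAN-style estimator assigns the target to the wrong pixel about half the time, while the thresholded MSE output never raises a confident detection from the ambiguous data.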
ex20-15-geometry-generalisation
Challenge. A physics-informed U-Net takes inputs $(\hat{\mathbf{x}}_{\mathrm{MF}}, \mathbf{G})$ and is trained on a family of MIMO sensing operators $\{\mathbf{A}(\boldsymbol{\alpha}_k)\}_{k=1}^{K}$ parameterised by array geometry $\boldsymbol{\alpha}$ (e.g., antenna positions). Using PAC-Bayes theory, derive a generalisation bound on the expected reconstruction error for a new geometry $\boldsymbol{\alpha}^\ast$ not seen during training.
Express the bound in terms of: the training error, the number of training geometries $K$, the network complexity (number of parameters $P$), and the scene dimension $N$.
Apply the union bound over the $K$ training geometries to extend the single-geometry generalisation bound.
The PAC-Bayes complexity term (the KL divergence between the weight posterior and prior) scales with the number of parameters $P$.
The geometry-specific component of the error depends on $\|\mathbf{G}(\boldsymbol{\alpha}^\ast) - \mathbf{G}(\boldsymbol{\alpha}_{k^\ast})\|$ for the nearest training geometry $\boldsymbol{\alpha}_{k^\ast}$.
Decompose into training and geometry-transfer error
For a new geometry $\boldsymbol{\alpha}^\ast$, find the nearest training geometry $\boldsymbol{\alpha}_{k^\ast} = \arg\min_k \|\mathbf{G}(\boldsymbol{\alpha}^\ast) - \mathbf{G}(\boldsymbol{\alpha}_k)\|$. By the triangle inequality, the deployment error splits into the training error at $\boldsymbol{\alpha}_{k^\ast}$ plus a geometry-transfer term.
PAC-Bayes bound for training error
By the PAC-Bayes theorem, with probability at least $1 - \delta$ over the $n$ training samples per geometry, simultaneously for every training geometry $\boldsymbol{\alpha}_k$ (union bound): $\mathcal{E}(\boldsymbol{\alpha}_k) \le \hat{\mathcal{E}}(\boldsymbol{\alpha}_k) + \sqrt{\dfrac{\mathrm{KL}(Q\,\|\,\Pi) + \ln(2K\sqrt{n}/\delta)}{2n}}$, with $\mathrm{KL}(Q\,\|\,\Pi) = O(P)$.
Combined generalisation bound
Combining the two steps, with $\delta_G = \|\mathbf{G}(\boldsymbol{\alpha}^\ast) - \mathbf{G}(\boldsymbol{\alpha}_{k^\ast})\|$: $\mathcal{E}(\boldsymbol{\alpha}^\ast) \le \hat{\mathcal{E}}(\boldsymbol{\alpha}_{k^\ast}) + \sqrt{\dfrac{O(P) + \ln(2K\sqrt{n}/\delta)}{2n}} + L_f\,\delta_G\,\|\mathbf{x}\|$. Denser sampling of the geometry space $\boldsymbol{\alpha}$ (larger $K$) reduces $\delta_G$ at the cost of a logarithmic increase in the union-bound term, so the bound is governed by the trade-off between geometry coverage ($\delta_G$) and complexity. $\square$