Exercises
ex-ch02-01
Easy: Consider the forward operator given by the matrix $A$.
For each of Hadamard's three conditions (existence, uniqueness, stability), determine whether it holds for the equation $A\mathbf{x} = \mathbf{y}$.
Compute the rank and null space of $A$.
In finite dimensions, stability is automatic: the pseudoinverse (restricted to the range) is always bounded.
Existence
$A$ has full row rank, so $\operatorname{ran}(A)$ is the whole data space. Existence holds for every $\mathbf{y}$.
Uniqueness
By rank-nullity, $\dim \mathcal{N}(A) = n - \operatorname{rank}(A) > 0$, so the null space is nontrivial. Uniqueness fails: any two vectors $\mathbf{x}$ and $\mathbf{x} + \mathbf{z}$ with $\mathbf{z} \in \mathcal{N}(A)$ give the same $\mathbf{y}$.
Stability
On $\operatorname{ran}(A)$, the pseudoinverse $A^\dagger$ is a bounded linear map (any linear map on a finite-dimensional space is bounded). Stability holds.
The moral: in finite dimensions, the Hadamard stability condition is automatically satisfied. Ill-conditioning is a continuous-limit phenomenon captured by the condition number, not by Hadamard's framework itself.
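These checks can be run mechanically with NumPy. A minimal sketch (the exercise's matrix is not reproduced above, so the matrix below is a hypothetical stand-in consistent with the solution: full row rank, nontrivial null space):

```python
# Hypothetical stand-in matrix: surjective (full row rank) but not injective.
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

m, n = A.shape
rank = np.linalg.matrix_rank(A)
print("existence:", rank == m)        # full row rank: solvable for every y
print("uniqueness:", rank == n)       # would require a trivial null space

# null space basis: right singular vectors beyond the rank
_, s, Vt = np.linalg.svd(A)
print("null space basis:\n", Vt[rank:])

# stability: the pseudoinverse is a finite matrix, hence bounded
print("||A^+||_2 =", np.linalg.norm(np.linalg.pinv(A), 2))
```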
ex-ch02-02
Easy: An operator has singular values $\sigma_k$ and left singular vectors $\mathbf{u}_k$. For which of the following data $\mathbf{y}$ does the Picard condition hold?
(a)
(b)
(c)
The Picard condition requires $\sum_k \sigma_k^{-2}\, |\langle \mathbf{y}, \mathbf{u}_k\rangle|^2 < \infty$.
Check case (a)
The Picard series converges, so the condition holds.
Check case (b)
The Picard series diverges, so the condition fails.
Check case (c)
The Picard series converges, so the condition holds. (The alternating sign does not affect convergence.)
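Since the coefficient sequences are not reproduced above, the following sketch illustrates the check with assumed decays ($\sigma_k = 1/k$ and three sample sequences); bounded, flattening partial sums indicate a convergent Picard series.

```python
# Assumed data: sigma_k = 1/k and three illustrative coefficient decays.
import numpy as np

k = np.arange(1, 10_001)
sigma = 1.0 / k
cases = {"fast decay k^-2": k**-2.0,
         "slow decay k^-1": k**-1.0,
         "alternating (-1)^k k^-2": (-1.0)**k * k**-2.0}
for label, yk in cases.items():
    partial = np.cumsum((np.abs(yk) / sigma)**2)
    print(f"{label:26s} partial sums: "
          f"{partial[99]:.3f} {partial[999]:.3f} {partial[9999]:.3f}")
```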
ex-ch02-03
Easy: Compute the Tikhonov regularised solution $\mathbf{x}_\alpha$ for the problem $A\mathbf{x} = \mathbf{y}$, where $A$ is a matrix with singular values $\sigma_1 = 3$ and $\sigma_2 = 0.01$, for $\alpha = 0.01$. Compare with the naive inverse $A^{-1}\mathbf{y}$ and explain the difference.
Use the normal equation $(A^\top A + \alpha I)\,\mathbf{x}_\alpha = A^\top \mathbf{y}$.
The Tikhonov filter for the $k$-th component is $\sigma_k^2/(\sigma_k^2 + \alpha)$.
Naive inverse
The naive inverse applies $1/\sigma_k$ to each data component: the second component is amplified by $1/\sigma_2 = 1/0.01 = 100$, so any noise in that component is magnified a hundredfold.
Tikhonov solution
For $\sigma_1 = 3$: the filter is $\sigma_1^2/(\sigma_1^2 + \alpha) = 9/9.01 \approx 1$, so the first component is essentially untouched. For $\sigma_2 = 0.01$: the filter is $\sigma_2^2/(\sigma_2^2 + \alpha) = 0.0001/0.0101 \approx 0.01$, so the second component is strongly damped instead of amplified by $100$. Tikhonov trades a small bias in the well-conditioned component for a large gain in stability on the ill-conditioned one.
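A minimal numerical check, assuming (consistently with the singular values above) $A = \operatorname{diag}(3, 0.01)$ and data $\mathbf{y} = (1, 1)^\top$:

```python
import numpy as np

A = np.diag([3.0, 0.01])
y = np.array([1.0, 1.0])
alpha = 0.01

x_naive = np.linalg.solve(A, y)   # [0.333..., 100.0]: 2nd component blown up
x_tik = np.linalg.solve(A.T @ A + alpha * np.eye(2), A.T @ y)
print("naive:   ", x_naive)
print("tikhonov:", x_tik)         # [~0.333, ~0.99]: 2nd component stabilised

s = np.array([3.0, 0.01])
print("filters: ", s**2 / (s**2 + alpha))   # [~0.9989, ~0.0099]
```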
ex-ch02-04
Medium: An operator has singular values $\sigma_k = k^{-1}$ for $k = 1, 2, \ldots$. The noisy data has exact coefficients $\langle \mathbf{y}^\dagger, \mathbf{u}_k\rangle = k^{-2}$ and noise coefficients $n_k$ with $|n_k| \leq \delta = 10^{-2}$ for all $k$.
(a) Find the optimal truncation level $K^*$ for TSVD.
(b) Find the optimal Tikhonov parameter $\alpha^*$.
(c) Compare the two minimum errors.
For TSVD, the squared error is $\sum_{k > K} |x_k|^2 + \delta^2 \sum_{k \leq K} \sigma_k^{-2}$ (bias plus variance).
The solution coefficients are $x_k = \langle \mathbf{y}^\dagger, \mathbf{u}_k\rangle / \sigma_k$.
Identify solution coefficients
$x_k = \langle \mathbf{y}^\dagger, \mathbf{u}_k\rangle / \sigma_k = k^{-2} \cdot k = k^{-1}$.
TSVD error and optimal $K$
Bias$^2 = \sum_{k > K} k^{-2} \approx 1/K$ and variance$^2 = \delta^2 \sum_{k \leq K} k^2 \approx 10^{-4} K^3/3$. Minimising $E(K) \approx 1/K + 10^{-4} K^3/3$ by setting $E'(K) = 0$ gives $1/K^2 \approx 10^{-4} K^2$, i.e. $K^4 \approx 10^4$ and $K^* \approx 10$.
Tikhonov optimal $\alpha$ and comparison
The optimal Tikhonov cutoff occurs at $\sigma_k^2 \approx \alpha$, corresponding to $k \approx K^* = 10$ and $\sigma_{K^*} = 0.1$, so $\alpha^* \approx \sigma_{K^*}^2 = 10^{-2}$.
Numerically, the TSVD error is slightly smaller than Tikhonov's because TSVD has infinite qualification while Tikhonov's qualification saturates at $\mu_0 = 2$. For this problem, with solution coefficients $x_k = k^{-1} = \sigma_k$ (smoothness of order roughly $\mu = 1$), both methods are near-optimal and the difference is modest.
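The optimisation in (a)–(c) can be verified numerically. A sketch under the model above ($\sigma_k = k^{-1}$, $x_k = k^{-1}$, worst-case noise $|n_k| = \delta = 10^{-2}$, truncated at 2000 modes):

```python
import numpy as np

kmax = 2000
k = np.arange(1, kmax + 1)
sigma, x, delta = 1.0 / k, 1.0 / k, 1e-2

def tsvd_err2(K):      # squared error: variance (k <= K) + bias (k > K)
    return np.sum((delta / sigma[:K])**2) + np.sum(x[K:]**2)

def tik_err2(alpha):   # Tikhonov filter f = s^2 / (s^2 + alpha)
    f = sigma**2 / (sigma**2 + alpha)
    return np.sum(((f - 1.0) * x)**2) + np.sum((f * delta / sigma)**2)

Ks = np.arange(1, 100)
K_star = Ks[np.argmin([tsvd_err2(K) for K in Ks])]
alphas = np.logspace(-5, 0, 300)
a_star = alphas[np.argmin([tik_err2(a) for a in alphas])]
print(f"TSVD:     K* = {K_star}, error = {np.sqrt(tsvd_err2(K_star)):.4f}")
print(f"Tikhonov: alpha* = {a_star:.3g}, error = {np.sqrt(tik_err2(a_star)):.4f}")
```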
ex-ch02-05
Medium: For the discrete Tikhonov problem with SVD $A = U \Sigma V^\top$, show that the discrepancy function $\varphi(\alpha) = \|A\mathbf{x}_\alpha - \mathbf{y}^\delta\|^2$ is monotonically increasing in $\alpha$.
Use this to prove that the equation $\varphi(\alpha) = \delta^2$ has a unique solution whenever $0 < \delta < \|\mathbf{y}^\delta\|$.
Express $\varphi(\alpha)$ explicitly in terms of the SVD components.
Differentiate each term with respect to $\alpha$ and show positivity.
Derive the residual formula
$\varphi(\alpha) = \sum_k \left(\frac{\alpha}{\sigma_k^2 + \alpha}\right)^2 |\langle \mathbf{y}^\delta, \mathbf{u}_k\rangle|^2.$
Monotonicity
Each term has derivative
$\frac{\mathrm{d}}{\mathrm{d}\alpha}\left[\frac{\alpha^2}{(\sigma_k^2+\alpha)^2}\right] |\langle \mathbf{y}^\delta, \mathbf{u}_k\rangle|^2 = \frac{2\alpha\,\sigma_k^2}{(\sigma_k^2+\alpha)^3}\, |\langle \mathbf{y}^\delta, \mathbf{u}_k\rangle|^2 \geq 0,$ with equality only if $\alpha = 0$ or $\langle \mathbf{y}^\delta, \mathbf{u}_k\rangle = 0$. Since $\alpha > 0$ (and not all coefficients vanish), we have $\varphi'(\alpha) > 0$.
Existence and uniqueness
$\varphi(\alpha) \to 0$ as $\alpha \to 0$ (exact data case, $A$ of full rank) and $\varphi(\alpha) \to \|\mathbf{y}^\delta\|^2$ as $\alpha \to \infty$. By the intermediate value theorem and strict monotonicity, the equation $\varphi(\alpha) = \delta^2$ has exactly one solution whenever $0 < \delta < \|\mathbf{y}^\delta\|$.
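A sketch verifying the monotonicity and solving $\varphi(\alpha) = \delta^2$ by bisection, on a random square (hence, almost surely, full-rank) test problem:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 10))
x_true = rng.standard_normal(10)
e = rng.standard_normal(10)
delta = 0.1
e *= delta / np.linalg.norm(e)          # noise scaled so that ||e|| = delta
y = A @ x_true + e

U, s, _ = np.linalg.svd(A)
c = U.T @ y                              # coefficients <y, u_k>

def phi(alpha):
    return float(np.sum((alpha / (s**2 + alpha))**2 * c**2))

vals = [phi(a) for a in np.logspace(-8, 8, 100)]
assert all(b >= a for a, b in zip(vals, vals[1:]))   # monotone increasing

lo, hi = 1e-12, 1e12                     # bisection for phi(alpha) = delta^2
for _ in range(200):
    mid = np.sqrt(lo * hi)
    lo, hi = (mid, hi) if phi(mid) < delta**2 else (lo, mid)
print(f"discrepancy-principle alpha ~ {mid:.4g}")
```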
ex-ch02-06
Medium: Prove that the Landweber filter function after $N$ iterations with step size $\tau \leq 1/\sigma_1^2$ is $f_N(\sigma) = 1 - (1 - \tau\sigma^2)^N$,
and show that $f_N(\sigma)/\sigma \leq \sqrt{N\tau}$ for all $\sigma > 0$.
Write the iteration in the SVD basis and use the recurrence relation.
For the bound, use $1 - (1 - x)^N \leq Nx$ for $0 \leq x \leq 1$.
Derive the filter
In the SVD basis, the $k$-th component $c_n = \langle \mathbf{x}^{(n)}, \mathbf{v}_k\rangle$ satisfies the recurrence (starting from $c_0 = 0$): $c_{n+1} = (1 - \tau\sigma_k^2)\, c_n + \tau\sigma_k \langle \mathbf{y}, \mathbf{u}_k\rangle,$
where $\tau$ is the step size. Summing the geometric series gives $c_N = \frac{1 - (1 - \tau\sigma_k^2)^N}{\sigma_k}\, \langle \mathbf{y}, \mathbf{u}_k\rangle$, confirming the filter $f_N(\sigma) = 1 - (1 - \tau\sigma^2)^N$.
Bound the filter magnitude
We bound $f_N(\sigma)/\sigma$. Using $1 - (1 - x)^N \leq Nx$ for $0 \leq x \leq 1$ (with $x = \tau\sigma^2$), together with $f_N(\sigma) \leq 1$: $f_N(\sigma)/\sigma \leq \min(N\tau\sigma,\ 1/\sigma)$. Maximising over $\sigma$ gives the maximum at $\sigma = 1/\sqrt{N\tau}$, yielding $f_N(\sigma)/\sigma \leq \sqrt{N\tau}$. A tighter analysis using calculus improves the constant in front of $\sqrt{N\tau}$.
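Both the closed-form filter and the bound can be checked on a small diagonal problem (a sketch; the singular values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
s = np.array([3.0, 1.0, 0.3, 0.01])        # illustrative singular values
A = np.diag(s)
y = rng.standard_normal(4)
tau = 1.0 / s.max()**2                     # step size with tau * sigma_1^2 <= 1
N = 50

x = np.zeros(4)
for _ in range(N):
    x = x + tau * A.T @ (y - A @ x)        # Landweber iteration

f = 1.0 - (1.0 - tau * s**2)**N            # closed-form filter
assert np.allclose(x, f / s * y)           # x_N = f_N(sigma)/sigma * <y, u_k>

grid = np.linspace(1e-6, s.max(), 100_000)
g = (1.0 - (1.0 - tau * grid**2)**N) / grid
print(f"max f_N(s)/s = {g.max():.4f}  <=  sqrt(N*tau) = {np.sqrt(N*tau):.4f}")
```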
ex-ch02-07
Medium: For the integral operator on $L^2$ with singular values $\sigma_k$, determine which of the following functions satisfy a source condition of order $\mu$:
(a)
(b)
(c)
A source condition of order $\mu$ means $x = (A^*A)^{\mu/2} w$ with $w \in \mathcal{X}$, so $\langle x, v_k\rangle = \sigma_k^\mu \langle w, v_k\rangle$.
Check whether $(\langle x, v_k\rangle / \sigma_k^\mu)_k$ is in $\ell^2$.
Identify the required decay
We need $(\langle x, v_k\rangle / \sigma_k^\mu)_k$ to be in $\ell^2$, i.e., $\sum_k \sigma_k^{-2\mu} |\langle x, v_k\rangle|^2 < \infty$.
Check all cases
(a) The series $\sum_k \sigma_k^{-2\mu} |\langle x, v_k\rangle|^2$ converges. Satisfies the source condition.
(b) The series diverges. Does not satisfy the source condition.
(c) The series converges. Satisfies the source condition. (In fact, case (c) satisfies a higher-order source condition as well.)
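Since the specific functions are not reproduced above, here is a hedged numerical version of the check with assumed coefficients ($\sigma_k = 1/k$, $\mu = 1$, three sample decays); bounded partial sums signal that the source condition holds.

```python
# Assumed: sigma_k = 1/k, mu = 1, and three illustrative coefficient decays.
import numpy as np

k = np.arange(1, 10_001)
sigma, mu = 1.0 / k, 1.0
for label, xk in [("x_k = k^-2", k**-2.0),
                  ("x_k = k^-1", k**-1.0),
                  ("x_k = k^-3", k**-3.0)]:
    partial = np.cumsum(sigma**(-2.0 * mu) * np.abs(xk)**2)
    print(f"{label:12s} partial sums: "
          f"{partial[99]:.2f} {partial[999]:.2f} {partial[9999]:.2f}")
```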
ex-ch02-08
Medium: Show that the LASSO solution $\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \tfrac{1}{2}\|A\mathbf{x} - \mathbf{y}\|^2 + \lambda \|\mathbf{x}\|_1$ satisfies: if $|\mathbf{a}_i^\top(\mathbf{y} - A\hat{\mathbf{x}})| < \lambda$, then $\hat{x}_i = 0$.
Use this to argue that the threshold $\lambda = \sigma\sqrt{2\log n}$ (universal threshold) suppresses all noise-only components with high probability when the noise is i.i.d. Gaussian with standard deviation $\sigma$.
Start from the KKT condition $\mathbf{a}_i^\top(\mathbf{y} - A\hat{\mathbf{x}}) = \lambda s_i$ with $s_i \in \partial |\hat{x}_i|$.
The maximum absolute value of $n$ standard Gaussians concentrates near $\sqrt{2\log n}$.
KKT conditions imply sparsity
The KKT condition at component $i$ is $\mathbf{a}_i^\top(\mathbf{y} - A\hat{\mathbf{x}}) = \lambda s_i$, where $s_i \in \partial |\hat{x}_i|$. If $\hat{x}_i \neq 0$, then $s_i = \operatorname{sign}(\hat{x}_i)$ and $|\mathbf{a}_i^\top(\mathbf{y} - A\hat{\mathbf{x}})| = \lambda$.
Conversely, if $|\mathbf{a}_i^\top(\mathbf{y} - A\hat{\mathbf{x}})| < \lambda$, we cannot have $\hat{x}_i \neq 0$ (which would require $|s_i| = 1$ and hence a correlation of exactly $\lambda$). Hence $\hat{x}_i = 0$.
Universal threshold argument
The noise contribution to $A^\top \boldsymbol{\varepsilon}$ has each component approximately $\mathcal{N}(0, \sigma^2)$ (for near-isometric $A$).
By the maximum of $n$ i.i.d. Gaussians: $\mathbb{P}\big(\max_i |z_i| > \sigma\sqrt{2\log n}\big) \to 0$ as $n \to \infty$.
Choosing $\lambda = \sigma\sqrt{2\log n}$ therefore ensures all noise-only components are suppressed with high probability.
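A quick simulation of the universal threshold (a sketch with $n = 10^4$ pure-noise components):

```python
import numpy as np

rng = np.random.default_rng(0)
n, noise_sigma, trials = 10_000, 1.0, 200
lam = noise_sigma * np.sqrt(2 * np.log(n))     # universal threshold

survived = [np.any(np.abs(noise_sigma * rng.standard_normal(n)) > lam)
            for _ in range(trials)]
print(f"lambda = {lam:.3f}")
print("fraction of trials with a surviving noise component:",
      np.mean(survived))   # small, and -> 0 as n grows
```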
ex-ch02-09
Hard: Derive the curvature formula for the L-curve. For Tikhonov regularization, let $\rho(\alpha) = \|A\mathbf{x}_\alpha - \mathbf{y}^\delta\|^2$ and $\eta(\alpha) = \|\mathbf{x}_\alpha\|^2$.
Compute $\rho'(\alpha)$ and $\eta'(\alpha)$ in terms of the SVD components, and write the curvature $\kappa(\alpha)$ of the curve $(\log\rho, \log\eta)$ in closed form.
Define $\hat\rho = \log \rho$ and $\hat\eta = \log \eta$.
Use the chain rule to relate the derivatives of $(\hat\rho, \hat\eta)$ to those of $(\rho, \eta)$.
Derivative of residual norm squared
$\rho'(\alpha) = 2\alpha \sum_k \frac{\sigma_k^2}{(\sigma_k^2 + \alpha)^3}\, |\langle \mathbf{y}^\delta, \mathbf{u}_k\rangle|^2.$
Derivative of solution norm squared
$\eta'(\alpha) = -2 \sum_k \frac{\sigma_k^2}{(\sigma_k^2 + \alpha)^3}\, |\langle \mathbf{y}^\delta, \mathbf{u}_k\rangle|^2, \qquad \text{so that } \rho'(\alpha) = -\alpha\, \eta'(\alpha).$
Log-scale derivatives and curvature
With $\hat\rho = \log\rho$ and $\hat\eta = \log\eta$, we have $\hat\rho' = \rho'/\rho$ and $\hat\eta' = \eta'/\eta$, and the plane-curve curvature is $\kappa(\alpha) = \frac{\hat\rho'\,\hat\eta'' - \hat\rho''\,\hat\eta'}{\big((\hat\rho')^2 + (\hat\eta')^2\big)^{3/2}};$ substituting $\rho' = -\alpha\eta'$ expresses everything through the SVD sums above. $\blacksquare$
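The closed form can be cross-checked numerically: evaluate $\rho, \eta$ from the SVD sums and differentiate in $\log\alpha$ by finite differences (a sketch on a random test problem):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
x_true = rng.standard_normal(50)
e = rng.standard_normal(50)
e *= 0.05 / np.linalg.norm(e)
y = A @ x_true + e

U, s, _ = np.linalg.svd(A)
c = U.T @ y

def log_rho_eta(alpha):
    rho = np.sum((alpha / (s**2 + alpha))**2 * c**2)
    eta = np.sum((s / (s**2 + alpha))**2 * c**2)
    return np.log(rho), np.log(eta)

t = np.linspace(np.log(1e-6), np.log(1e4), 400)      # t = log(alpha)
r, h = np.array([log_rho_eta(np.exp(ti)) for ti in t]).T
dr, dh = np.gradient(r, t), np.gradient(h, t)
d2r, d2h = np.gradient(dr, t), np.gradient(dh, t)
kappa = (dr * d2h - d2r * dh) / (dr**2 + dh**2)**1.5
print(f"L-curve corner at alpha ~ {np.exp(t[np.argmax(kappa)]):.3g}")
```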
ex-ch02-10
Hard: Prove that Tikhonov regularization has qualification $\mu_0 = 2$ by showing:
(a) For $\mu \leq 2$, the optimal rate $O(\delta^{\mu/(\mu+1)})$ is achieved with $\alpha \sim \delta^{2/(\mu+1)}$.
(b) For $\mu > 2$, the rate cannot exceed $O(\delta^{2/3})$ regardless of the choice of $\alpha$.
For the bias: bound $\sup_{\sigma > 0} \frac{\alpha\, \sigma^\mu}{\sigma^2 + \alpha}$ and show it equals $O(\alpha^{\mu/2})$ for $\mu \leq 2$.
For saturation: show the bias is $\Theta(\alpha)$ (not $O(\alpha^{\mu/2})$) when $\mu > 2$.
Bias bound for general $\mu$
The bias for the $k$-th mode is $\frac{\alpha}{\sigma_k^2 + \alpha}\, \sigma_k^\mu\, |w_k|$. We need $\sup_{\sigma > 0} \frac{\alpha\, \sigma^\mu}{\sigma^2 + \alpha}$.
For $\mu < 2$: the maximum occurs at $\sigma^2 = \frac{\mu}{2 - \mu}\,\alpha$ (or, for $\mu = 2$, as the supremum $\sigma \to \infty$), giving $\sup_\sigma \frac{\alpha \sigma^\mu}{\sigma^2 + \alpha} = C_\mu\, \alpha^{\mu/2}$.
For $\mu > 2$: $\sigma \mapsto \frac{\alpha \sigma^\mu}{\sigma^2 + \alpha}$ is monotone increasing, so on a bounded spectrum $\sigma \leq \sigma_1$ the supremum is $\frac{\alpha \sigma_1^\mu}{\sigma_1^2 + \alpha} = \Theta(\alpha)$, and the bias is $\Theta(\alpha)$.
Rate for $\mu \leq 2$
With bias $O(\alpha^{\mu/2})$ and variance $O(\delta/\sqrt{\alpha})$ (from $\sup_\sigma \frac{\sigma}{\sigma^2 + \alpha} = \frac{1}{2\sqrt{\alpha}}$), optimising $\alpha^{\mu/2} \sim \delta/\sqrt{\alpha}$ gives $\alpha \sim \delta^{2/(\mu+1)}$ and rate $O(\delta^{\mu/(\mu+1)})$.
Saturation for $\mu > 2$
With bias $\Theta(\alpha)$ and variance $O(\delta/\sqrt{\alpha})$, optimising gives $\alpha \sim \delta^{2/3}$ and rate $O(\delta^{2/3})$. More carefully (working with squared errors): bias$^2 = \Theta(\alpha^2)$, variance$^2 = O(\delta^2/\alpha)$. Optimising $\alpha^2 \sim \delta^2/\alpha$ gives $\alpha \sim \delta^{2/3}$ and squared error $O(\delta^{4/3})$, which in the norm gives $O(\delta^{2/3})$ (the exponent $\mu_0/(\mu_0 + 1)$ with $\mu_0 = 2$).
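The saturation effect is easy to observe numerically. The sketch below assumes $\sigma_k = 1/k$ and a source element $w_k = k^{-0.51}$ (just inside $\ell^2$), uses the worst-case variance bound $\delta \sup_k f_k/\sigma_k$, and fits the rate exponent from two noise levels:

```python
import numpy as np

k = np.arange(1, 5001)
sigma = 1.0 / k
w = k**-0.51                       # source element, barely in l^2

def best_err(mu, delta):
    x = sigma**mu * w              # source condition of order mu
    errs = []
    for a in np.logspace(-10, 1, 250):
        f = sigma**2 / (sigma**2 + a)
        bias = np.linalg.norm((f - 1.0) * x)
        var = delta * np.max(f / sigma)    # worst-case noise, ||e|| <= delta
        errs.append(bias + var)
    return min(errs)

for mu in (1.0, 2.0, 4.0):
    d1, d2 = 1e-4, 1e-6
    rate = np.log(best_err(mu, d1) / best_err(mu, d2)) / np.log(d1 / d2)
    theory = min(mu, 2.0) / (min(mu, 2.0) + 1.0)
    print(f"mu = {mu}: observed exponent ~ {rate:.2f}, theory {theory:.2f}")
```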
ex-ch02-11
Hard: The SURE functional for Tikhonov regularization is $\mathrm{SURE}(\alpha) = \|A\mathbf{x}_\alpha - \mathbf{y}^\delta\|^2 - n\sigma^2 + 2\sigma^2 \operatorname{tr}\big(A(A^\top A + \alpha I)^{-1} A^\top\big).$
(a) Show that $\mathbb{E}[\mathrm{SURE}(\alpha)] = \mathbb{E}\,\|A\mathbf{x}_\alpha - A\mathbf{x}^\dagger\|^2$, i.e., SURE is an unbiased estimate of the prediction error.
(b) Find the $\alpha$ that minimises SURE for the simple case of a single component ($n = 1$, $A = 1$) with $y = x + \varepsilon$ and $\varepsilon \sim \mathcal{N}(0, \sigma^2)$.
Stein's lemma: for $\varepsilon \sim \mathcal{N}(0, \sigma^2)$ and differentiable $h$: $\mathbb{E}[\varepsilon\, h(\varepsilon)] = \sigma^2\, \mathbb{E}[h'(\varepsilon)]$.
The trace term $\operatorname{tr}\big(A(A^\top A + \alpha I)^{-1} A^\top\big) = \sum_k \sigma_k^2/(\sigma_k^2 + \alpha)$ is the effective number of degrees of freedom.
Apply Stein's identity
Write the estimator as $\hat{\mathbf{y}} = H_\alpha \mathbf{y}^\delta$, where $H_\alpha = A(A^\top A + \alpha I)^{-1} A^\top$. The residual is $(I - H_\alpha)\mathbf{y}^\delta$.
By Stein's lemma, $\mathbb{E}\,\langle \boldsymbol{\varepsilon}, H_\alpha \mathbf{y}^\delta\rangle = \sigma^2 \operatorname{tr}(H_\alpha)$.
Rearranging recovers the SURE formula. The expectation of SURE equals the prediction MSE.
Scalar case optimisation
For $A = 1$, $x_\alpha = y/(1 + \alpha)$: $\mathrm{SURE}(\alpha) = \dfrac{\alpha^2}{(1+\alpha)^2}\, y^2 - \sigma^2 + \dfrac{2\sigma^2}{1+\alpha}$.
Setting the derivative to zero and solving gives
$\alpha^* = \dfrac{\sigma^2}{y^2 - \sigma^2}$ when $y^2 > \sigma^2$ (the data exceeds the noise). For $y^2 \leq \sigma^2$ (pure noise), $\alpha^* = \infty$ and the solution is set to zero.
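A numerical confirmation of the scalar minimiser (illustrative values $\sigma^2 = 1$, $y = 2.5$):

```python
import numpy as np

sigma2, y = 1.0, 2.5                 # noise variance and data, y^2 > sigma^2
alpha = np.logspace(-4, 4, 400_000)
sure = (alpha / (1 + alpha))**2 * y**2 - sigma2 + 2 * sigma2 / (1 + alpha)
print("numeric argmin:", alpha[np.argmin(sure)])
print("closed form:   ", sigma2 / (y**2 - sigma2))   # 0.1905...
```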
ex-ch02-12
Hard: Consider the Born iterative method for 2D microwave tomography. The true permittivity contrast is $\chi(\mathbf{r})$. The forward operator at the $n$-th iteration is the linearised map $(A_n\, \delta\chi)(\mathbf{r}_m) = k^2 \int_D G(\mathbf{r}_m, \mathbf{r}')\, E_n(\mathbf{r}')\, \delta\chi(\mathbf{r}')\, \mathrm{d}\mathbf{r}', \quad m = 1, \dots, M,$ where $G$ is the background Green's function, $E_n$ the total field at iteration $n$, and $\mathbf{r}_m$ the receiver positions.
(a) Write the adjoint $A_n^*$ for $A_n$.
(b) Show that the IRGNM normal equation at step $n$ is equivalent to a Tikhonov regularised linear system with matrix entries that can be computed as integrals involving $G$ and $E_n$.
(c) Compare the cost of applying the adjoint with that of the forward map over the $M$ receivers, and explain why working with forward/adjoint pairs is cheaper than forming the full Jacobian.
The adjoint of multiplication by $E_n$ is multiplication by $\overline{E_n}$.
The adjoint of the Green's function integral in $A_n$ is a backpropagation operation.
Derive the adjoint
For a residual vector $\mathbf{r} = (r_1, \dots, r_M)$ at the receivers: $(A_n^* \mathbf{r})(\mathbf{r}') = k^2\, \overline{E_n(\mathbf{r}')}\, \sum_{m=1}^{M} \overline{G(\mathbf{r}_m, \mathbf{r}')}\; r_m.$
This is a backpropagation: the field recorded at the receivers is backpropagated into the domain via the conjugate Green's function, then weighted by $\overline{E_n}$.
Normal equation structure
The IRGNM normal equation is $(A_n^* A_n + \alpha_n I)\, \delta\chi = A_n^* \mathbf{r}_n$, with $\mathbf{r}_n$ the current data residual.
The entries of the system matrix $A_n^* A_n$ are volume integrals of $\overline{E_n}\, E_n$ against the kernel $\sum_m \overline{G(\mathbf{r}_m, \cdot)}\, G(\mathbf{r}_m, \cdot)$, i.e., a sum over receivers of outer products of the forward Green's functions.
Computational advantage of adjoint
The forward computation evaluates the scattered field at $M$ receiver positions, each requiring $O(N)$ work (one Green's function integral over the $N$ domain points): total $O(MN)$.
The adjoint maps a vector of length $M$ to a function on $D$ of size $N$: total $O(MN)$ as well, i.e. the same cost.
The key is that one forward/adjoint pair costs $O(MN)$, whereas forming the full Jacobian (or the normal matrix $A_n^* A_n$) explicitly costs a factor of order $N$ more. This is the basis of the efficient IRGNM implementation.
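In code, the forward/adjoint pair is best validated with a dot-product test, $\langle \mathbf{r}, A_n\,\delta\chi\rangle = \langle A_n^*\mathbf{r}, \delta\chi\rangle$. A sketch for a hypothetical discretisation (random stand-ins for $G$ and $E_n$):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, k2, dV = 8, 200, 1.0, 1e-3        # receivers, pixels, k^2, cell volume
G = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
E = rng.standard_normal(N) + 1j * rng.standard_normal(N)

def forward(dchi):           # O(M*N): one Green's integral per receiver
    return k2 * dV * (G @ (E * dchi))

def adjoint(r):              # O(M*N): backpropagate, then weight by conj(E)
    return k2 * dV * np.conj(E) * (np.conj(G).T @ r)

dchi = rng.standard_normal(N) + 1j * rng.standard_normal(N)
r = rng.standard_normal(M) + 1j * rng.standard_normal(M)
lhs = np.vdot(r, forward(dchi))         # <r, A dchi>
rhs = np.vdot(adjoint(r), dchi)         # <A* r, dchi>
print("relative adjoint-test error:", abs(lhs - rhs) / abs(lhs))
```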
ex-ch02-13
Challenge: Consider the linear inverse problem $\mathbf{y}^\delta = A\mathbf{x} + \boldsymbol{\varepsilon}$ with Gaussian noise $\boldsymbol{\varepsilon} \sim \mathcal{N}(0, \sigma^2 I)$ and Gaussian prior $\mathbf{x} \sim \mathcal{N}(\mathbf{x}_0, \Gamma)$, where the prior covariance $\Gamma$ has eigenfunctions $\mathbf{v}_k$ (aligned with the SVD of $A$) and eigenvalues $\gamma_k^2$.
(a) Derive the MAP estimate and show it is a generalised Tikhonov solution with component-dependent regularisation parameter $\alpha_k = \sigma^2/\gamma_k^2$.
(b) The posterior covariance is $\Gamma_{\mathrm{post}} = \big(\sigma^{-2} A^\top A + \Gamma^{-1}\big)^{-1}$. Compute its eigenvalues and interpret: which components of $\mathbf{x}$ are data-determined and which are prior-determined?
(c) For a flat prior ($\gamma_k^2 \to \infty$ for all $k$), show that the MAP estimate reduces to the pseudoinverse solution $A^\dagger \mathbf{y}^\delta$.
The posterior is Gaussian with mean $\mathbf{x}_{\mathrm{MAP}}$ and covariance $\Gamma_{\mathrm{post}}$.
Component $k$ is data-determined when $\sigma_k^2 \gamma_k^2 \gg \sigma^2$.
MAP estimate
The negative log-posterior is $\frac{1}{2\sigma^2}\|A\mathbf{x} - \mathbf{y}^\delta\|^2 + \frac{1}{2}(\mathbf{x} - \mathbf{x}_0)^\top \Gamma^{-1} (\mathbf{x} - \mathbf{x}_0) + \text{const}.$
In the SVD basis, the $k$-th component satisfies: $x_k = \frac{\sigma_k \langle \mathbf{y}^\delta, \mathbf{u}_k\rangle + \alpha_k\, x_{0,k}}{\sigma_k^2 + \alpha_k}, \qquad \alpha_k = \frac{\sigma^2}{\gamma_k^2}.$
This is a weighted average of the data-driven and prior-mean estimates.
Posterior covariance eigenvalues
The $k$-th eigenvalue of $\Gamma_{\mathrm{post}}$ is $\lambda_k = \Big(\frac{\sigma_k^2}{\sigma^2} + \frac{1}{\gamma_k^2}\Big)^{-1} = \frac{\sigma^2 \gamma_k^2}{\sigma_k^2 \gamma_k^2 + \sigma^2}.$
When $\sigma_k^2 \gamma_k^2 \gg \sigma^2$ (strong signal): $\lambda_k \approx \sigma^2/\sigma_k^2$, i.e. data-determined, with uncertainty set by the noise.
When $\sigma_k^2 \gamma_k^2 \ll \sigma^2$ (weak signal): $\lambda_k \approx \gamma_k^2$, i.e. prior-determined; the data provides no information.
Flat prior limit
As $\gamma_k^2 \to \infty$, the regularisation parameter $\alpha_k = \sigma^2/\gamma_k^2 \to 0$. The MAP estimate becomes $x_k = \langle \mathbf{y}^\delta, \mathbf{u}_k\rangle / \sigma_k$, recovering the pseudoinverse.
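A component-wise sketch of (a)–(c), with assumed values $\sigma_k = 1/k$, $\gamma_k^2 = 1$, noise $\sigma = 0.1$, and hypothetical data coefficients:

```python
import numpy as np

k = np.arange(1, 16)
s = 1.0 / k                        # singular values sigma_k
gamma2 = np.ones_like(s)           # prior variances gamma_k^2
noise2 = 0.1**2                    # noise variance sigma^2
alpha_k = noise2 / gamma2          # component-wise regularisation parameter
yk = 0.5 * s                       # hypothetical data coefficients <y, u_k>
x0 = 0.0                           # prior mean

x_map = (s * yk + alpha_k * x0) / (s**2 + alpha_k)
lam = noise2 * gamma2 / (s**2 * gamma2 + noise2)    # posterior variances
for i in range(len(k)):
    regime = "data" if s[i]**2 * gamma2[i] > noise2 else "prior"
    print(f"k={k[i]:2d}  x_map={x_map[i]:.4f}  post.var={lam[i]:.4f}  "
          f"({regime}-determined)")
```

The printed transition (around $\sigma_k^2 \gamma_k^2 = \sigma^2$, i.e. $k = 10$ here) separates the data-determined from the prior-determined components.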
ex-ch02-14
Challenge: Consider the 2D total variation denoising problem (ROF model): $\min_u\ \tfrac{1}{2}\|u - f\|_{L^2}^2 + \lambda\, \mathrm{TV}(u).$
For the 1D case with a single step edge of height $h$ at position $t$ ($u^\dagger = 0$ for $x < t$, $u^\dagger = h$ for $x \geq t$):
(a) Show that the ROF solution is piecewise constant with a single jump at $t$, and that the jump height is a soft thresholding of the edge height $h$.
(b) What is the minimum edge height that survives TV regularisation?
(c) Explain why this edge-preservation property makes TV superior to Tikhonov for piecewise-constant scenes in radar imaging.
The TV of a step function with a single jump of height $h$ is $|h|$.
Compute the ROF functional for plateau values $a$ and $b$ around the step at position $t$ and minimise over $(a, b)$.
Reduce to scalar problem
For a single step of height $h$ at position $t$, any piecewise-constant solution taking value $a$ on $[0, t)$ and $b$ on $[t, 1]$ has $\mathrm{TV}(u) = |b - a|$.
The ROF functional becomes $J(a, b) = \tfrac{t}{2}\,a^2 + \tfrac{1-t}{2}\,(b - h)^2 + \lambda\, |b - a|$.
The two plateaus couple only through the jump term $\lambda|b - a|$. Stationarity for $b > a \geq 0$ gives $a = \lambda/t$ (pulled up from $0$ by the jump penalty) and $b = h - \lambda/(1 - t)$.
Solve the scalar problem
The jump height $s = b - a$ is a soft thresholding of $h$ with threshold $\lambda\big(\tfrac{1}{t} + \tfrac{1}{1-t}\big)$: the solution is $s^* = \max\!\big(h - \lambda(\tfrac{1}{t} + \tfrac{1}{1-t}),\ 0\big)$.
Therefore the solution keeps a (reduced) jump at $t$ when $h > \lambda\big(\tfrac{1}{t} + \tfrac{1}{1-t}\big)$ and collapses to a constant otherwise.
Edge preservation and radar interpretation
The minimum surviving edge height is $h_{\min} = \lambda\big(\tfrac{1}{t} + \tfrac{1}{1-t}\big)$. Edges with $h > h_{\min}$ are preserved (with height reduced by $h_{\min}$); weaker features are suppressed.
For Tikhonov regularisation, the step would be blurred over many pixels: the edge is not preserved but spread out. For radar scenes with discrete targets or building edges, this blurring destroys the spatial precision needed for target localisation. TV preserves the edge location exactly.
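The two-plateau reduction can be checked by brute force: minimise $J(a, b)$ on a grid and compare the recovered jump with the soft-threshold prediction (illustrative values $t = 1/2$, $\lambda = 0.05$):

```python
import numpy as np

t, lam = 0.5, 0.05
thresh = lam * (1.0 / t + 1.0 / (1.0 - t))      # minimum surviving edge height

for h in (0.1, 0.3, 1.0):                       # below and above the threshold
    grid = np.linspace(-0.5, 1.5, 801)
    A, B = np.meshgrid(grid, grid, indexing="ij")   # A = left, B = right plateau
    J = 0.5 * t * A**2 + 0.5 * (1 - t) * (B - h)**2 + lam * np.abs(B - A)
    i, j = np.unravel_index(np.argmin(J), J.shape)
    print(f"h = {h}: jump = {grid[j] - grid[i]:.3f}, "
          f"predicted = {max(h - thresh, 0.0):.3f}")
```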
ex-ch02-15
Challenge: The Inverse Crime. Consider a simulated microwave tomography experiment where you:
- Generate synthetic data using a finite-element solver on a grid.
- Add noise with noise level $\delta$.
- Reconstruct using IRGNM with the same grid and solver.
(a) Explain why this simulation setup constitutes an "inverse crime" and why the resulting reconstruction error will be misleadingly small.
(b) Propose two ways to avoid the inverse crime while keeping computation manageable.
(c) For the corrected simulation, describe how the choice of inner discretisation (forward model) vs. outer discretisation (reconstruction grid) affects the regularization parameter and the achievable reconstruction quality.
Consider what information is implicitly shared between data generation and reconstruction when the same numerical solver is used.
Think about the discretization error vs. the noise level.
The crime and its consequences
Using the same grid and solver means the synthetic data lies exactly in the range of the discrete forward operator $F_h$. The reconstruction algorithm can therefore find a parameter $\chi_h$ such that $F_h(\chi_h)$ matches the (noise-free) data exactly (to machine precision), regardless of whether this corresponds to a physically correct solution.
The inverse crime inflates performance: discretization errors (which would appear in reality when the forward model is imperfect) are absent, giving a reconstruction quality that far exceeds what any real system could achieve. Published error metrics from such simulations are misleading.
Remedies
- Different grids: Generate data on a fine grid and reconstruct on a coarser grid. The grid mismatch introduces a model error comparable to measurement noise.
- Different solvers: Use FEM for data generation and a simpler Born series for reconstruction. The model mismatch reflects the real-world imperfection of the forward model used in practice.
- Measured calibration data: Use a calibrated measurement setup rather than a pure simulation.
Grid effects on regularization
With a fine forward model (high-accuracy data generation) and a coarser reconstruction grid of spacing $h$, the mismatch error is $O(h^p)$ for a $p$-th order solver. If this dominates the measurement noise $\delta$, the effective noise level seen by the reconstruction algorithm is $\delta_{\mathrm{eff}} \approx \delta + C h^p$, which requires a larger regularisation parameter than the measurement noise alone would suggest.
In practice, the regularization parameter $\alpha$ must be tuned to balance both measurement noise and model error, not just noise.
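A 1D toy version of the whole experiment (a sketch, with an integration operator standing in for the tomography solver and all values illustrative): data generated with the same discretisation as the reconstruction looks spuriously accurate compared with data generated on a finer grid and sampled down.

```python
import numpy as np

def forward_matrix(n):                     # cumulative (integration) operator
    return np.tril(np.ones((n, n))) / n

def reconstruct(y, n, alpha=1e-5):         # Tikhonov normal equations
    A = forward_matrix(n)
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

rng = np.random.default_rng(0)
n, factor = 32, 4
x = lambda s: np.sin(2 * np.pi * s)        # true solution
coarse = (np.arange(n) + 0.5) / n
fine = (np.arange(n * factor) + 0.5) / (n * factor)

y_crime = forward_matrix(n) @ x(coarse)                 # same grid and solver
y_fair = (forward_matrix(n * factor) @ x(fine))[factor - 1 :: factor]
noise = 1e-5 * rng.standard_normal(n)

for label, y in (("inverse crime ", y_crime), ("fine-grid data", y_fair)):
    xr = reconstruct(y + noise, n)
    err = np.linalg.norm(xr - x(coarse)) / np.linalg.norm(x(coarse))
    print(f"{label}: relative error {err:.4f}")
```

The extra error in the second line is pure discretisation mismatch; it plays the role of the model error $O(h^p)$ discussed above.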