Nonlinear Inverse Problems

Beyond Linearity — When the Forward Model Is Nonlinear

So far we have treated the forward operator $\mathcal{A}$ as linear. Many RF imaging problems, however, involve nonlinear forward models:

  • Microwave tomography: The scattered field depends nonlinearly on the permittivity distribution (multiple scattering beyond the Born approximation).
  • Full-waveform inversion: The wave equation maps medium parameters to recorded waveforms nonlinearly via multiple reflections.
  • Phase retrieval: Only $|y|^2$ is measured (intensity-only), and the phase is lost — a fundamentally nonlinear problem.
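The phase loss is easy to demonstrate numerically: a global phase factor cancels in the measured intensity, so two distinct inputs produce identical data. In this small NumPy check the DFT stands in for an arbitrary linear measurement operator (an illustrative choice, not a specific RF model):

```python
import numpy as np

# Intensity-only measurements |Fx|^2 cannot distinguish x from
# x * exp(i*phi): a unimodular global phase factor cancels in the
# modulus. The DFT stands in for an arbitrary linear measurement map.
x = np.array([1.0 + 2.0j, 0.5 - 1.0j, 2.0 + 0.0j])
x_shifted = x * np.exp(1j * 0.7)          # same object, different phase

intensity = np.abs(np.fft.fft(x)) ** 2
intensity_shifted = np.abs(np.fft.fft(x_shifted)) ** 2

print(np.allclose(intensity, intensity_shifted))  # True: phase is lost
```

Any algorithm recovering $x$ from $|y|^2$ must therefore break this ambiguity with side information, which is what makes the problem fundamentally nonlinear.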

The linearisation-based approach — compute the Fréchet derivative, apply regularised linear inversion, update — is the workhorse of practical nonlinear imaging. The Born iterative and distorted-Born iterative methods are its radar-specific instances.

This section is intentionally brief: most of our forward models in Parts II–III use the linearized (Born) approximation, and the full nonlinear theory is treated in Chapters 5–6 when we derive the scattering equations.

Definition:

Nonlinear Inverse Problems

A nonlinear inverse problem seeks $x$ from the equation

$$\mathcal{F}(x) = y,$$

where $\mathcal{F} \colon \mathcal{D}(\mathcal{F}) \subseteq \mathcal{X} \to \mathcal{Y}$ is a nonlinear operator. The Fréchet derivative $\mathcal{F}'(x)$ at a point $x$ is the bounded linear operator satisfying

$$\mathcal{F}(x + h) = \mathcal{F}(x) + \mathcal{F}'(x)\,h + o(\|h\|).$$

$\mathcal{F}'(x)$ is the Jacobian of the forward operator and plays the role of $\mathcal{A}$ in the linear theory.
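The $o(\|h\|)$ remainder in this definition can be checked numerically. The sketch below uses a toy elementwise-square operator (chosen for illustration only, not one of the RF forward models) whose Fréchet derivative is known in closed form; the ratio of linearisation error to $\|h\|$ shrinks linearly with the step size, confirming the higher-order remainder:

```python
import numpy as np

# Toy nonlinear operator F(x) = x^2 (elementwise); its Frechet
# derivative at x is the linear map h -> 2*x*h. The remainder
# F(x+h) - F(x) - F'(x)h equals h^2 elementwise, so its norm shrinks
# like ||h||^2 -- faster than ||h||, i.e. the o(||h||) of the definition.
F = lambda x: x ** 2
F_prime = lambda x, h: 2.0 * x * h

x = np.array([1.0, 2.0, 3.0])
h = np.array([0.1, -0.2, 0.05])
ratios = []
for t in [1.0, 0.1, 0.01]:
    remainder = F(x + t * h) - F(x) - F_prime(x, t * h)
    ratios.append(np.linalg.norm(remainder) / np.linalg.norm(t * h))
    print(t, ratios[-1])   # ratio decreases by 10x per step-size decade
```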

Nonlinear ill-posedness adds to the linear difficulties:

  • Non-uniqueness from nonlinearity (multiple solutions, local minima).
  • Non-convexity of the residual functional.
  • Dependence of the linearization on the current iterate.

Definition:

The Gauss–Newton Method and IRGNM

The Gauss–Newton method linearises $\mathcal{F}$ at the current iterate $x_n$ and solves the resulting linear inverse problem:

$$x_{n+1} = x_n + \Delta x_n,$$

where $\Delta x_n$ solves the linearised problem

$$\mathcal{F}'(x_n)\,\Delta x = y^\delta - \mathcal{F}(x_n).$$

Since $\mathcal{F}'(x_n)$ is typically compact, the linearised problem is itself ill-posed and must be regularised. Using Tikhonov regularisation on each linearised step gives the iteratively regularised Gauss–Newton method (IRGNM):

$$\Delta x_n = \arg\min_{\Delta x} \left\{ \|\mathcal{F}'(x_n)\Delta x - (y^\delta - \mathcal{F}(x_n))\|^2 + \alpha_n \|\Delta x + x_n - x_0\|^2 \right\},$$

where $\alpha_n$ is a decreasing sequence of regularisation parameters and $x_0$ is an initial guess (e.g., the matched-filter image).

The IRGNM is the standard method for nonlinear ill-posed problems. The decreasing $\alpha_n$ allows progressively finer resolution as the iterate approaches the solution. A typical choice is $\alpha_n = \alpha_0 q^n$ with $q \in (0,1)$ and $\alpha_0$ chosen via the discrepancy principle at the first iteration.
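Setting the gradient of the Tikhonov functional to zero gives the closed-form update $(\mathcal{F}'(x_n)^*\mathcal{F}'(x_n) + \alpha_n I)\,\Delta x_n = \mathcal{F}'(x_n)^*(y^\delta - \mathcal{F}(x_n)) + \alpha_n (x_0 - x_n)$. A minimal NumPy sketch on a two-parameter toy model (the forward map, Jacobian, and parameter choices below are invented for illustration) shows the iteration in action:

```python
import numpy as np

# Toy nonlinear forward model F: R^2 -> R^2 with a known Jacobian
# (illustrative only; a real RF problem would use a PDE solver here).
def F(x):
    return np.array([x[0]**2 + x[1], x[1]**2 + x[0]])

def jac(x):
    return np.array([[2.0 * x[0], 1.0],
                     [1.0, 2.0 * x[1]]])

x_true = np.array([1.0, 2.0])
y_delta = F(x_true)              # noise-free data for illustration
x0 = np.array([0.8, 1.5])        # initial guess (e.g. a matched-filter image)
alpha0, q = 1.0, 0.5             # decreasing regularisation alpha_n = alpha0*q^n

x = x0.copy()
for n in range(20):
    J, r = jac(x), y_delta - F(x)
    alpha = alpha0 * q**n
    # Tikhonov-regularised linearised step (normal equations):
    # (J^T J + alpha I) dx = J^T r + alpha (x0 - x)
    dx = np.linalg.solve(J.T @ J + alpha * np.eye(2),
                         J.T @ r + alpha * (x0 - x))
    x = x + dx

print(x)  # close to x_true = [1, 2]
```

The decreasing $\alpha_n$ is what distinguishes the IRGNM from plain Gauss–Newton: early iterations are heavily damped toward $x_0$, later ones resolve fine detail.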


Definition:

The Levenberg–Marquardt Method

The Levenberg–Marquardt method is closely related to Gauss–Newton but uses the discrepancy principle to select $\alpha_n$ adaptively at each step:

Choose $\alpha_n > 0$ such that

$$\|\mathcal{F}'(x_n)(x_{n+1} - x_n) - (y^\delta - \mathcal{F}(x_n))\| = \tau_n \|y^\delta - \mathcal{F}(x_n)\|,$$

where $\tau_n < 1$. This ensures that each step reduces the linearised residual by a fixed fraction. The update is:

$$x_{n+1} = x_n + (\mathcal{F}'(x_n)^*\mathcal{F}'(x_n) + \alpha_n I)^{-1} \mathcal{F}'(x_n)^*(y^\delta - \mathcal{F}(x_n)).$$

Levenberg–Marquardt interpolates between Gauss–Newton ($\alpha_n \to 0$) and gradient descent ($\alpha_n \to \infty$). Far from the solution (large residual), $\alpha_n$ is large and the step is short and robust. Near the solution (small residual), $\alpha_n$ is small and the method converges rapidly.
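The interpolation between the two regimes can be sketched in code. For simplicity this uses the classical accept/reject heuristic for $\alpha_n$ (double on failure, halve on success) rather than the discrepancy-based rule above; the toy forward model is invented for illustration:

```python
import numpy as np

# Toy nonlinear forward model and Jacobian (illustrative only).
def F(x):
    return np.array([x[0]**2 + x[1], x[1]**2 + x[0]])

def jac(x):
    return np.array([[2.0 * x[0], 1.0],
                     [1.0, 2.0 * x[1]]])

y = F(np.array([1.0, 2.0]))     # noise-free data
x = np.array([3.0, -1.0])       # deliberately poor initial guess
alpha = 10.0                    # start heavily damped (gradient-descent-like)

for _ in range(50):
    J, r = jac(x), y - F(x)
    step = np.linalg.solve(J.T @ J + alpha * np.eye(2), J.T @ r)
    if np.linalg.norm(y - F(x + step)) < np.linalg.norm(r):
        x, alpha = x + step, alpha * 0.5   # success: trust Gauss-Newton more
    else:
        alpha *= 2.0                       # failure: shorten and bend the step

# Residual is driven to ~0; note F(x) = y has several real solutions,
# and LM converges to one near the starting point, not necessarily x_true.
print(x, np.linalg.norm(y - F(x)))
```

The accept/reject logic is exactly the "large residual, large $\alpha_n$; small residual, small $\alpha_n$" behaviour described above, realised without solving for $\alpha_n$ explicitly.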

Theorem: Convergence of the IRGNM

Let $\mathcal{F}$ be Fréchet differentiable with a Lipschitz-continuous derivative in a ball $B_\rho(x^\dagger)$. Assume a nonlinear source condition:

$$x_0 - x^\dagger = (\mathcal{F}'(x^\dagger)^* \mathcal{F}'(x^\dagger))^{\mu/2} w, \qquad \|w\| \leq E,$$

for some $\mu \in (0, 1]$. If $\alpha_n = \alpha_0 q^n$ with appropriate $\alpha_0, q$ and the iteration is stopped by the discrepancy principle at $n^*(\delta)$, then

$$\|x_{n^*} - x^\dagger\| = O\bigl(\delta^{2\mu/(2\mu+1)}\bigr) \quad \text{as } \delta \to 0.$$

The convergence rate matches the linear theory. The nonlinearity is handled by the fact that the linearisation error is of higher order: $\|\mathcal{F}(x_n) - \mathcal{F}(x^\dagger) - \mathcal{F}'(x^\dagger)(x_n - x^\dagger)\| = o(\|x_n - x^\dagger\|)$, which becomes negligible as $x_n \to x^\dagger$.


Example: Born Iterative Method for Microwave Tomography

In microwave tomography, the forward model maps the contrast function $\chi(\mathbf{r}) = \varepsilon_r(\mathbf{r}) - 1$ (relative permittivity minus one) to the scattered field $E^s$ via the Lippmann–Schwinger integral equation:

$$E^s(\mathbf{r}) = k_0^2 \int_\Omega G_0(\mathbf{r}, \mathbf{r}')\, \chi(\mathbf{r}')\, E(\mathbf{r}')\,d\mathbf{r}',$$

where $G_0$ is the free-space Green's function and $E = E^i + E^s$ is the total field. The nonlinearity arises because $E$ depends on $\chi$ through the same equation.

Describe the Born iterative method (BIM) as a special case of IRGNM.
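One way to sketch the BIM structure: alternate (i) a forward solve of the discretised Lippmann–Schwinger equation for the total field $E$ at the current contrast, and (ii) a Tikhonov-regularised linear inversion for $\chi$ with $E$ held fixed, so that each step is a Born-type linear problem. In the toy 1-D sketch below, the smoothing kernel standing in for $k_0^2 G_0$, the grid, the contrast values, and the choice to observe data on the grid itself (rather than at external receivers) are all invented for illustration:

```python
import numpy as np

# Schematic Born iterative method on a discretised 1-D scalar analogue
# of the Lippmann-Schwinger equation. G0 is a stand-in smoothing kernel,
# not a physical Green's function.
N = 30
idx = np.arange(N)
G0 = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 5.0) / N
E_inc = np.ones(N)                     # incident field on the grid
chi_true = np.zeros(N)
chi_true[10:15] = 0.3                  # true contrast (weak scatterer)

def total_field(chi):
    # Solve E = E_inc + G0 @ (chi * E), i.e. (I - G0 diag(chi)) E = E_inc
    return np.linalg.solve(np.eye(N) - G0 * chi, E_inc)

y = G0 @ (chi_true * total_field(chi_true))   # "measured" scattered field

chi, alpha = np.zeros(N), 1e-9         # tiny alpha: the data are noise-free
for _ in range(10):
    E = total_field(chi)               # (i) forward solve at current contrast
    A = G0 * E                         # linearised map chi -> scattered field
    # (ii) Tikhonov-regularised linear inversion with E held fixed
    chi = np.linalg.solve(A.T @ A + alpha * np.eye(N), A.T @ y)

print(np.max(np.abs(chi - chi_true)))  # small reconstruction error
```

Freezing $E$ makes step (ii) the regularised linearised problem of a Gauss–Newton step with an approximate Jacobian, which is the sense in which BIM fits the IRGNM template.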

⚠️ Engineering Note

Computational Cost of Nonlinear Imaging

Each IRGNM iteration requires: (1) A forward solve: compute $\mathcal{F}(x_n)$ by solving Maxwell's equations (e.g., FDTD or FEM) — $O(N^3)$ for 3D problems. (2) A Jacobian solve: compute $\mathcal{F}'(x_n)$ and apply it — another $O(N^3)$ solve per measurement. (3) A regularised linear solve: the Tikhonov normal equation at the current linearisation point.

For a $100^3$ grid (a modest 3D imaging domain), a single forward solve takes seconds to minutes on current hardware. The IRGNM typically needs 10–50 outer iterations, each with multiple inner CG iterations.

This cost is why most operational RF imaging systems use the Born approximation (single linearisation, no iteration) or limit iteration to 3–5 steps. Full nonlinear inversion is feasible only offline (post-processing) or with GPU-accelerated solvers and reduced-order models.

Common Mistake: Nonlinear Imaging Can Converge to a Local Minimum

Mistake:

Initialising the IRGNM from a zero starting point $x_0 = 0$ (no prior information) for a strongly nonlinear problem, expecting global convergence.

Correction:

The IRGNM convergence theorem (Convergence of the IRGNM) assumes that the initial guess $x_0$ is sufficiently close to $x^\dagger$ (within the ball $B_\rho(x^\dagger)$). For strongly scattering media, $\rho$ can be very small and $x_0 = 0$ may not satisfy this condition.

Practical mitigation strategies:

  1. Use the matched-filter (backprojection) image as $x_0$ — it is often close enough to the true solution for the linearisation to be valid.

  2. Perform a frequency continuation sweep: start at low frequencies (large $\lambda$, mild nonlinearity, large convergence basin) and progressively increase frequency.

  3. Use a multi-scale approach: solve a coarsely discretised problem first, then refine the grid and warm-start from the coarse solution.

Methods for Nonlinear Inverse Problems

| Method | Key Idea | Convergence Rate | Computational Cost | Best For |
| --- | --- | --- | --- | --- |
| Born approximation (single) | Linearise once, no iteration | Not convergent (one step) | One forward solve + one linear solve | Weak scattering; real-time |
| Born iterative (BIM) | Iterate linearisation with free-space Green's function | Geometric (slow; weak scattering) | $O(N_{\text{iter}})$ forward solves | Moderate contrast |
| Distorted-Born iterative (DBIM) | Iterate with background Green's function | Faster than BIM | More expensive per step (Jacobian with background) | Higher contrast |
| IRGNM (Tikhonov inner) | Tikhonov at each linearisation | $O(\delta^{2\mu/(2\mu+1)})$ (optimal) | $O(N_{\text{iter}})$ forward + adjoint solves | General nonlinear ill-posed problems |
| Levenberg–Marquardt | Adaptive $\alpha_n$ via discrepancy | Same as IRGNM | Similar to IRGNM | Robust to poor initialisation |

Key Takeaway

Nonlinear inverse problems arise whenever the forward model involves multiple scattering, nonlinear constitutive relations, or intensity-only measurements. The iteratively regularised Gauss–Newton method (IRGNM) is the standard approach: linearise at each step and apply Tikhonov regularisation to the linearised problem. Convergence rates match the linear theory under appropriate source conditions — the nonlinearity becomes negligible near the solution. In RF imaging, the Born iterative method is the specific instantiation of IRGNM for electromagnetic scattering, using the Lippmann–Schwinger equation as the forward model. Computational cost limits full nonlinear inversion to offline processing or GPU-accelerated implementations.