Preconditioning the Sensing Operator

Why Preconditioning Accelerates Reconstruction

Iterative reconstruction algorithms (gradient descent, ISTA, ADMM, OAMP) converge at a rate determined by the condition number of $\mathbf{A}^{H}\mathbf{A}$. When $\kappa(\mathbf{A}) \gg 1$ --- typical for RF imaging with incomplete angular coverage --- convergence can be painfully slow: the number of iterations scales linearly with $\kappa$ for first-order methods and as $\sqrt{\kappa}$ for accelerated methods. Preconditioning transforms the problem so that the effective condition number is closer to 1, dramatically reducing the iteration count. The Kronecker structure of $\mathbf{A}$ makes certain preconditioners computationally cheap.
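The linear-in-$\kappa$ scaling can be checked on a toy quadratic. The sketch below (the diagonal test problem, spectrum, and tolerance are illustrative assumptions, not from the text) runs fixed-step gradient descent on quadratics with condition numbers 10 and 100 and counts iterations to a fixed tolerance:

```python
import numpy as np

def gd_iters(kappa, tol=1e-6, maxit=200000):
    """Iterations for fixed-step gradient descent on the quadratic
    f(x) = 0.5 * x^T diag(d) x with spectrum spread over [1, kappa]."""
    d = np.linspace(1.0, kappa, 50)          # eigenvalues of the Hessian
    step = 2.0 / (d.min() + d.max())         # optimal fixed step size
    x = np.ones_like(d)
    for t in range(1, maxit + 1):
        x = x - step * d * x                 # gradient step on each mode
        if np.linalg.norm(x) <= tol * np.sqrt(d.size):
            return t
    return maxit

# Iteration count grows roughly linearly with the condition number:
print(gd_iters(10), gd_iters(100))  # the second count is roughly 10x the first
```

The ratio of the two counts tracks the ratio of condition numbers, which is exactly what preconditioning attacks.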

Definition: Preconditioner

A preconditioner for the normal equations $\mathbf{A}^{H}\mathbf{A}\mathbf{c} = \mathbf{A}^{H}\mathbf{y}$ is an invertible matrix $\mathbf{P}$ such that the preconditioned system

$$\mathbf{P}^{-1}\mathbf{A}^{H}\mathbf{A}\mathbf{c} = \mathbf{P}^{-1}\mathbf{A}^{H}\mathbf{y}$$

has a smaller effective condition number: $\kappa(\mathbf{P}^{-1}\mathbf{A}^{H}\mathbf{A}) \ll \kappa(\mathbf{A}^{H}\mathbf{A})$.

An ideal preconditioner approximates $\mathbf{A}^{H}\mathbf{A}$ while being cheap to apply: $\mathbf{P} \approx \mathbf{A}^{H}\mathbf{A}$, but computing $\mathbf{P}^{-1}\mathbf{v}$ costs only $O(N)$ or $O(N \log N)$.

Definition: Diagonal (Jacobi) Preconditioner

The simplest preconditioner is the diagonal (Jacobi) preconditioner:

$$\mathbf{P}_{\text{diag}} = \text{diag}(\mathbf{A}^{H}\mathbf{A}) = \text{diag}\bigl(\|\mathbf{A}_{1}\|^2, \ldots, \|\mathbf{A}_{N}\|^2\bigr)$$

where $\mathbf{A}_{q}$ is the $q$-th column of $\mathbf{A}$. This equalizes the column norms, correcting for the non-uniform illumination pattern.

For a Kronecker-structured matrix, the diagonal of $\mathbf{A}^{H}\mathbf{A}$ is itself a Kronecker product:

$$\text{diag}(\mathbf{A}^{H}\mathbf{A}) = \text{diag}(\mathbf{A}_{3}^{H}\mathbf{A}_{3}) \otimes \text{diag}(\mathbf{A}_{2}^{H}\mathbf{A}_{2}) \otimes \text{diag}(\mathbf{A}_{1}^{H}\mathbf{A}_{1})$$

requiring storage for only $n_1 + n_2 + n_3$ values.
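This identity is easy to verify numerically. A minimal sketch (the factor sizes below are made-up small values; the full operator is formed only to check the result):

```python
import numpy as np

rng = np.random.default_rng(0)
# Small hypothetical factor sizes, just to check the identity numerically
A1 = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
A2 = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
A3 = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))

A = np.kron(A3, np.kron(A2, A1))   # full 120 x 24 operator, formed only for the check

d_full = np.diag(A.conj().T @ A)   # diagonal of A^H A, the expensive way
d_kron = np.kron(np.diag(A3.conj().T @ A3),
                 np.kron(np.diag(A2.conj().T @ A2),
                         np.diag(A1.conj().T @ A1)))

print(np.allclose(d_full, d_kron))  # True: n1 + n2 + n3 stored values suffice
```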

In Caire's framework (Eq. 24--25), the backpropagation operator $\hat{\mathbf{c}}^{\text{BP}} = \mathbf{A}^{H}\mathbf{D}^{-1}\mathbf{y}$ uses a diagonal matrix $\mathbf{D}$ that accounts for path loss and antenna gain variations. This is precisely the diagonal preconditioner, applied in the measurement domain.

Theorem: Kronecker Preconditioning via Factor Inversion

For $\mathbf{A} = \mathbf{A}_{3} \otimes \mathbf{A}_{2} \otimes \mathbf{A}_{1}$, define the Kronecker preconditioner:

$$\mathbf{P}_{\text{Kron}} = (\mathbf{A}_{3}^{H}\mathbf{A}_{3} + \alpha_3 \mathbf{I}) \otimes (\mathbf{A}_{2}^{H}\mathbf{A}_{2} + \alpha_2 \mathbf{I}) \otimes (\mathbf{A}_{1}^{H}\mathbf{A}_{1} + \alpha_1 \mathbf{I})$$

where $\alpha_k > 0$ are regularization parameters. Then:

  1. $\mathbf{P}_{\text{Kron}}^{-1}$ can be applied by inverting each factor separately: forming the three factor inverses costs $O(n_1^3 + n_2^3 + n_3^3)$ once, after which each application is a cheap Kronecker matvec.

  2. The effective condition number is

    $$\kappa(\mathbf{P}_{\text{Kron}}^{-1}\mathbf{A}^{H}\mathbf{A}) = \prod_{k=1}^{3} \frac{\sigma_{\min}^2(\mathbf{A}_{k}) + \alpha_k}{\sigma_{\max}^2(\mathbf{A}_{k}) + \alpha_k} \cdot \frac{\sigma_{\max}^2(\mathbf{A}_{k})}{\sigma_{\min}^2(\mathbf{A}_{k})}.$$

  3. With $\alpha_k = 0$, $\mathbf{P}_{\text{Kron}}^{-1}\mathbf{A}^{H}\mathbf{A} = \mathbf{I}$ (perfect preconditioning), but this requires each factor to be invertible.

Since $\mathbf{A}^{H}\mathbf{A}$ factors as a Kronecker product, we can invert it factor by factor instead of inverting the full $N \times N$ matrix. The regularization parameters $\alpha_k$ handle the case when individual factors are rank-deficient, at the cost of imperfect preconditioning.
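Point 3 can be checked directly. A quick numerical sketch (square, well-conditioned factors are an assumption here, so that $\alpha_k = 0$ is admissible; two factors are used for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)
# Square, near-identity factors: invertible and well conditioned
A1 = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
A2 = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
A = np.kron(A2, A1)

# Kronecker preconditioner with alpha = 0: invert each small Gram matrix
P_inv = np.kron(np.linalg.inv(A2.T @ A2), np.linalg.inv(A1.T @ A1))

M = P_inv @ (A.T @ A)               # preconditioned normal matrix
print(np.allclose(M, np.eye(12)))   # True: perfect preconditioning
```

Only the 3x3 and 4x4 Gram matrices are inverted; the 12x12 matrices appear here solely for verification.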

Example: Preconditioned CG for Tikhonov Reconstruction

A 2D imaging system with $\mathbf{A} = \mathbf{A}_{\text{ang}} \otimes \mathbf{A}_{f}$ has $\kappa(\mathbf{A}) = 200$. The Tikhonov solution

$$\hat{\mathbf{c}} = (\mathbf{A}^{H}\mathbf{A} + \lambda\mathbf{I})^{-1}\mathbf{A}^{H}\mathbf{y}$$

is computed by conjugate gradient (CG) on the normal equations.

(a) How many CG iterations are needed without preconditioning? (b) How many with the Kronecker preconditioner, given $\kappa(\mathbf{A}_{\text{ang}}) = 20$ and $\kappa(\mathbf{A}_{f}) = 10$?
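A sketch of the answer, under the assumption that $\lambda$ is small compared to the spectrum of $\mathbf{A}^{H}\mathbf{A}$:

(a) CG on the normal equations sees $\kappa(\mathbf{A}^{H}\mathbf{A}) = \kappa(\mathbf{A})^2 = 200^2 = 4 \times 10^4$, so the iteration count scales as

$$\sqrt{\kappa(\mathbf{A}^{H}\mathbf{A})}\,\log(1/\epsilon) = 200\,\log(1/\epsilon),$$

i.e. on the order of a few hundred iterations.

(b) The factor condition numbers are consistent with $\kappa(\mathbf{A}) = \kappa(\mathbf{A}_{\text{ang}})\,\kappa(\mathbf{A}_{f}) = 20 \times 10 = 200$. With invertible factors and $\alpha_k = 0$, the Kronecker preconditioner gives $\kappa_{\text{prec}} = 1$ and CG converges in a handful of iterations; with small $\alpha_k > 0$, $\kappa_{\text{prec}}$ stays close to 1 and roughly $\sqrt{\kappa_{\text{prec}}}\,\log(1/\epsilon)$ iterations suffice, i.e. tens at most.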

Preconditioned CG with Kronecker Structure

Complexity: each iteration costs $O(n_1^2 n_2 n_3 + n_1 n_2^2 n_3 + n_1 n_2 n_3^2)$ for the Kronecker matvec, plus the same order for the preconditioner application. Total iterations: $O(\sqrt{\kappa_{\text{prec}}}\,\log(1/\epsilon))$.
Input: factor matrices $\mathbf{A}_{1}, \mathbf{A}_{2}, \mathbf{A}_{3}$;
observations $\mathbf{y}$; Tikhonov parameter $\lambda$;
preconditioner regularization $\alpha_1, \alpha_2, \alpha_3$;
tolerance $\epsilon$.
Precompute:
$\mathbf{P}_k^{-1} = (\mathbf{A}_{k}^{H}\mathbf{A}_{k} + \alpha_k\mathbf{I})^{-1}$ for $k = 1,2,3$
(three small eigendecompositions).
Initialize:
$\mathbf{c}_{0} = \mathbf{0}$,
$\mathbf{r}_0 = \mathbf{A}^{H}\mathbf{y}$ (Kronecker matvec),
$\mathbf{z}_0 = (\mathbf{P}_3^{-1} \otimes \mathbf{P}_2^{-1} \otimes \mathbf{P}_1^{-1})\mathbf{r}_0$ (Kronecker matvec),
$\mathbf{d}_0 = \mathbf{z}_0$.
for $t = 0, 1, 2, \ldots$:
    $\mathbf{q}_t = (\mathbf{A}^{H}\mathbf{A} + \lambda\mathbf{I})\mathbf{d}_t$ (Kronecker matvec)
    $\beta_t = \mathbf{r}_t^H\mathbf{z}_t / \mathbf{d}_t^H\mathbf{q}_t$
    $\mathbf{c}_{t+1} = \mathbf{c}_{t} + \beta_t \mathbf{d}_t$
    $\mathbf{r}_{t+1} = \mathbf{r}_t - \beta_t \mathbf{q}_t$
    if $\|\mathbf{r}_{t+1}\| < \epsilon$: return $\mathbf{c}_{t+1}$
    $\mathbf{z}_{t+1} = \mathbf{P}^{-1}\mathbf{r}_{t+1}$ (Kronecker matvec)
    $\gamma_t = \mathbf{r}_{t+1}^H\mathbf{z}_{t+1} / \mathbf{r}_t^H\mathbf{z}_t$
    $\mathbf{d}_{t+1} = \mathbf{z}_{t+1} + \gamma_t\mathbf{d}_t$

The preconditioner application $\mathbf{P}^{-1}\mathbf{r}$ is itself a Kronecker matvec: reshape $\mathbf{r}$ into a tensor and multiply each mode by $\mathbf{P}_k^{-1}$. No large matrix is ever formed.

Preconditioning Effect on Convergence

Compare the convergence of unpreconditioned and Kronecker-preconditioned conjugate gradient for a Tikhonov reconstruction problem. Observe how the number of iterations drops dramatically with preconditioning, especially for ill-conditioned systems.

⚠️ Engineering Note

k-Space Density Compensation (Reweighting)

When the k-space sampling is non-uniform (typical for multi-static systems), the backprojection image $\mathbf{A}^{H}\mathbf{y}$ overemphasizes densely sampled regions. Density compensation reweights the measurements before backprojection:

$$\hat{\mathbf{c}}^{\text{DC}} = \mathbf{A}^{H}\mathbf{W}\mathbf{y}$$

where $\mathbf{W} = \text{diag}(w_1, \ldots, w_M)$ and $w_m$ is inversely proportional to the local k-space sampling density around the $m$-th measurement.

This is equivalent to a left preconditioner that approximately whitens the normal equations. For Kronecker-structured $\mathbf{A}$, the weights factor as $\mathbf{W} = \mathbf{W}_3 \otimes \mathbf{W}_2 \otimes \mathbf{W}_1$ when the sampling density is separable.
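A minimal sketch of the separable case (the per-axis densities and operator sizes below are made-up illustrative values):

```python
import numpy as np

# Hypothetical per-axis sampling densities (denser sampling -> smaller weight)
dens_f = np.array([4.0, 2.0, 1.0, 1.0])      # frequency axis
dens_ang = np.array([3.0, 1.0, 1.0])         # angle axis
W_f = np.diag(1.0 / dens_f)                  # w_m proportional to 1 / local density
W_ang = np.diag(1.0 / dens_ang)

rng = np.random.default_rng(3)
A_f = rng.standard_normal((4, 3))
A_ang = rng.standard_normal((3, 2))
A = np.kron(A_ang, A_f)                      # 12 x 6 forward operator
y = rng.standard_normal(12)

# Separable density => Kronecker-factored weights
W = np.kron(W_ang, W_f)
c_dc = A.T @ (W @ y)                         # density-compensated backprojection
print(c_dc.shape)  # (6,)
```

Because each $\mathbf{W}_k$ is diagonal, applying $\mathbf{W}$ reduces to an elementwise product with the Kronecker product of the per-axis weight vectors; the dense matrices above exist only for clarity.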

Quick Check

Which statement about Kronecker preconditioning is correct?

The preconditioner application costs $O(N^3)$, where $N$ is the number of voxels.

The Kronecker preconditioner inverts each factor separately, costing $O(n_1^3 + n_2^3 + n_3^3)$.

Preconditioning increases the condition number to improve convergence.

Diagonal preconditioning is always sufficient for RF imaging.

Key Takeaway

Preconditioning reduces the effective condition number and accelerates iterative reconstruction. The Kronecker structure enables a powerful preconditioner $\mathbf{P}_{\text{Kron}}$ that inverts each factor separately at cost $O(n_k^3)$ instead of $O(N^3)$. For a typical system, this reduces CG iterations from thousands to tens. k-Space density compensation provides an alternative view as measurement-domain reweighting.