Computational Complexity and Kronecker Exploitation

Making Reconstruction Algorithms Practical

The theoretical framework of Parts IV--VI develops a rich menu of reconstruction algorithms: matched filtering, Tikhonov regularization, ISTA/FISTA, ADMM, OAMP, and learned unrolled networks. Whether any of these can run in real time on a practical imaging system depends entirely on how efficiently they exploit the Kronecker structure of $\mathbf{A}$. This section provides the complexity analysis for each major algorithm class, showing exactly where the Kronecker factorization enters and what savings it provides.

Definition: Matched Filter (Backprojection) Complexity

The matched filter estimate is

$$\hat{\mathbf{c}}^{\text{MF}} = \mathbf{A}^{H}\mathbf{y}.$$

| Method | Cost |
| --- | --- |
| Naive | $O(MN)$ |
| Kronecker | $O(n_1 m_1 m_2 m_3 + n_1 n_2 m_2 m_3 + n_1 n_2 n_3 m_3)$ |
| Kronecker + FFT | $O(N \log N)$ (uniform grids) |

The adjoint $\mathbf{A}^{H}$ has the same Kronecker structure as $\mathbf{A}$ (with conjugate-transposed factors), so the sequential mode-product algorithm applies directly.
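To make the sequential mode-product algorithm concrete, here is a minimal NumPy sketch (the function names are illustrative, not taken from the CommIT codebase). It applies $\mathbf{A} = \mathbf{A}_{3} \otimes \mathbf{A}_{2} \otimes \mathbf{A}_{1}$ and its adjoint without ever forming the full matrix:

```python
import numpy as np

def kron_matvec(factors, x):
    """Apply (A3 kron A2 kron A1) to x via sequential mode products.

    factors = [A1, A2, A3], with A_k of shape (m_k, n_k); x has length
    n1*n2*n3. The full M x N Kronecker matrix is never formed.
    """
    # Column-major (Fortran) reshape matches the convention that mode 1 is
    # the fastest-varying index in vec(.) when A = A3 kron A2 kron A1.
    X = x.reshape([A.shape[1] for A in factors], order="F")
    for k, A in enumerate(factors):
        # Mode-k product: contract A's columns with axis k of X, then move
        # the new axis (size m_k) back into position k.
        X = np.moveaxis(np.tensordot(A, X, axes=(1, k)), 0, k)
    return X.reshape(-1, order="F")

def kron_adjoint_matvec(factors, y):
    """Apply A^H = A3^H kron A2^H kron A1^H: same algorithm, conjugated factors."""
    return kron_matvec([A.conj().T for A in factors], y)

# Sanity check against the explicit Kronecker matrix on a tiny problem.
A1, A2, A3 = (np.random.randn(3, 2) for _ in range(3))
x = np.random.randn(2 * 2 * 2)
assert np.allclose(kron_matvec([A1, A2, A3], x),
                   np.kron(A3, np.kron(A2, A1)) @ x)
```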

Theorem: LMMSE via Kronecker Factorization

The LMMSE estimator for the model $\mathbf{y} = \mathbf{A}\mathbf{c} + \mathbf{w}$ with prior $\mathbf{c} \sim \mathcal{CN}(\mathbf{0}, \sigma^2_{c}\mathbf{I})$ and noise $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I})$ is

$$\hat{\mathbf{c}}^{\text{LMMSE}} = (\mathbf{A}^{H}\mathbf{A} + \alpha\mathbf{I})^{-1}\mathbf{A}^{H}\mathbf{y},$$

where $\alpha = \sigma^2/\sigma^2_{c}$. When $\mathbf{A} = \mathbf{A}_{3} \otimes \mathbf{A}_{2} \otimes \mathbf{A}_{1}$, one might hope that

$$(\mathbf{A}^{H}\mathbf{A} + \alpha\mathbf{I})^{-1} = \bigotimes_{k=1}^{3} (\mathbf{A}_{k}^{H}\mathbf{A}_{k} + \alpha_k\mathbf{I})^{-1},$$

but this holds only in the trivial case $\alpha = 0$: for $\alpha > 0$ the identity shift generates cross terms that break the factorization (see the Common Mistake below). Instead, the inverse is applied exactly in the joint eigenbasis. Diagonalizing each factor Gram matrix, $\mathbf{A}_{k}^{H}\mathbf{A}_{k} = \mathbf{V}_k\boldsymbol{\Lambda}_k\mathbf{V}_k^H$, gives

$$(\mathbf{A}^{H}\mathbf{A} + \alpha\mathbf{I})^{-1} = (\mathbf{V}_3 \otimes \mathbf{V}_2 \otimes \mathbf{V}_1)\,(\boldsymbol{\Lambda}_3 \otimes \boldsymbol{\Lambda}_2 \otimes \boldsymbol{\Lambda}_1 + \alpha\mathbf{I})^{-1}\,(\mathbf{V}_3 \otimes \mathbf{V}_2 \otimes \mathbf{V}_1)^{H},$$

where the middle factor is diagonal with entries $1/(\lambda_i \mu_j \nu_l + \alpha)$.

Complexity: $O(n_1^3 + n_2^3 + n_3^3)$ for the precomputation (three eigendecompositions), then $O(N^{4/3})$ per application.

The point is that we never need to form or invert the full $N \times N$ Gram matrix. Each factor is a small matrix (typically $n_k \leq 64$), so its eigendecomposition is nearly instantaneous. This is what makes the LMMSE step in OAMP (Ch 17.3) computationally feasible for RF imaging.
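As a sketch of how the factored inversion looks in code (continuing the NumPy conventions above; the helper names are illustrative):

```python
import numpy as np

def lmmse_setup(factors, alpha):
    """Precompute the factored eigendecomposition of A^H A + alpha*I.

    factors = [A1, A2, A3]. Cost: O(n1^3 + n2^3 + n3^3). Returns the
    eigenvector matrices V_k and the diagonal of the inverse as a tensor.
    """
    Vs, lams = [], []
    for A in factors:
        lam, V = np.linalg.eigh(A.conj().T @ A)  # A_k^H A_k = V_k Lam_k V_k^H
        Vs.append(V)
        lams.append(lam)
    # Eigenvalues of Lam3 kron Lam2 kron Lam1 are all products lam_i mu_j nu_l.
    L = lams[0][:, None, None] * lams[1][None, :, None] * lams[2][None, None, :]
    return Vs, 1.0 / (L + alpha)

def lmmse_apply(Vs, D, z):
    """Apply (A^H A + alpha*I)^{-1} to z in the joint eigenbasis; O(N^{4/3})."""
    X = z.reshape(D.shape, order="F")
    for k, V in enumerate(Vs):   # rotate into the joint eigenbasis
        X = np.moveaxis(np.tensordot(V.conj().T, X, axes=(1, k)), 0, k)
    X = D * X                    # entry-wise shrinkage 1/(lam*mu*nu + alpha)
    for k, V in enumerate(Vs):   # rotate back
        X = np.moveaxis(np.tensordot(V, X, axes=(1, k)), 0, k)
    return X.reshape(-1, order="F")
```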

Example: OAMP Complexity with Kronecker Structure

An OAMP algorithm (Ch 17) for a 3D imaging problem with $n_1 = n_2 = n_3 = 32$ voxels per dimension runs for $T = 10$ iterations. Each iteration requires one LMMSE step and one denoiser application. Compare the total computational cost with and without Kronecker exploitation.
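A rough flop count (treating all $O(\cdot)$ constants as one; exact constants depend on the implementation) makes the gap concrete. With $N = 32^3 = 32{,}768$:

$$\text{naive: } N^3 = 2^{45} \approx 3.5 \times 10^{13} \text{ flops for the Gram inversion alone;} \qquad \text{Kronecker: } \underbrace{3 \cdot 32^3}_{\approx 10^5} + \underbrace{T \cdot N^{4/3} = 10 \cdot 2^{20}}_{\approx 10^7} \text{ flops.}$$

That is roughly six orders of magnitude, before the denoiser's $O(N)$ cost (identical in both cases) even enters.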

Computational Complexity of Reconstruction Methods

| Algorithm | Naive Complexity | Kronecker Complexity | With FFT |
| --- | --- | --- | --- |
| Matched filter $\mathbf{A}^{H}\mathbf{y}$ | $O(MN)$ | $O(N^{4/3})$ | $O(N\log N)$ |
| Tikhonov (direct solve) | $O(N^3)$ | $O(n_1^3 + n_2^3 + n_3^3)$ | N/A |
| CG per iteration | $O(MN)$ | $O(N^{4/3})$ | $O(N\log N)$ |
| ISTA/FISTA per iteration | $O(MN)$ | $O(N^{4/3})$ | $O(N\log N)$ |
| ADMM per iteration | $O(MN + N^3)$ | $O(N^{4/3} + \sum_k n_k^3)$ | $O(N\log N + \sum_k n_k^3)$ |
| OAMP LMMSE step | $O(N^3)$ | $O(N^{4/3})$ | $O(N\log N)$ |
| OAMP denoiser (CNN) | $O(N)$ | $O(N)$ | $O(N)$ |
CommIT Contribution (2024)

Multi-View RF Imaging Framework and Kronecker Structure

A. Rezaei, L. Manzoni, G. Caire β€” CommIT Group, TU Berlin (internal note)

The CommIT group's RF imaging simulator implements the full Kronecker-structured forward model and its adjoint as GPU-accelerated tensor operations. The key architectural decisions are:

  1. Sensing operator as a function, not a matrix: The operator $\mathbf{A}$ is never stored explicitly. Forward ($\mathbf{A}\mathbf{c}$) and adjoint ($\mathbf{A}^{H}\mathbf{y}$) operations are implemented as sequential mode products on the reflectivity tensor (see the sketch after this list).

  2. Factor precomputation: The three factor matrices and their eigendecompositions are precomputed once per imaging geometry, enabling instant LMMSE steps during OAMP iterations.

  3. Multi-view fusion: Each Tx-Rx pair produces a per-pair sensing operator $\mathbf{A}_{i,j}$ (Caire's Eq. 52--55). The Kronecker structure applies to each pair independently, and the per-pair reconstructions are fused into a global image.
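A minimal sketch of decision 1, assuming SciPy and the kron_matvec helpers above (the actual simulator uses GPU tensor operations; this CPU version only illustrates the matrix-free interface):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

def make_sensing_operator(factors, dtype=np.complex128):
    """Expose the Kronecker forward model as a matrix-free linear operator.

    Any solver that needs only matvec/rmatvec (CG, LSQR, ISTA, ...) can use
    this; the M x N matrix is never materialized. For multi-view imaging,
    each per-pair operator A_{i,j} would get one such wrapper.
    """
    M = int(np.prod([A.shape[0] for A in factors]))
    N = int(np.prod([A.shape[1] for A in factors]))
    return LinearOperator(
        (M, N),
        matvec=lambda c: kron_matvec(factors, c),           # y = A c
        rmatvec=lambda y: kron_adjoint_matvec(factors, y),  # c_hat = A^H y
        dtype=dtype,
    )
```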

This framework enables real-time imaging at millimeter-wave frequencies with standard GPU hardware, bridging the gap between theoretical models and practical deployable systems.


Kronecker vs. Full Matrix-Vector Product Timing

Compare wall-clock time for Kronecker-exploiting matvec versus the naive full matrix-vector product as the problem size grows. Observe the dramatic speedup that makes iterative reconstruction practical for large-scale 3D imaging.
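For readers without access to the interactive demo, a hypothetical NumPy benchmark along the same lines (reusing kron_matvec from above; sizes are kept small because the dense matrix for $n = 32$ alone would occupy several gigabytes, which is itself part of the point):

```python
import time
import numpy as np

for n in (4, 8, 16):                 # square factors: M = N = n^3
    factors = [np.random.randn(n, n) for _ in range(3)]
    x = np.random.randn(n**3)
    A_full = np.kron(factors[2], np.kron(factors[1], factors[0]))

    t0 = time.perf_counter(); y_full = A_full @ x     # naive O(n^6) matvec
    t_naive = time.perf_counter() - t0

    t0 = time.perf_counter(); y_kron = kron_matvec(factors, x)  # O(n^4)
    t_kron = time.perf_counter() - t0

    assert np.allclose(y_full, y_kron)
    print(f"n={n:2d}  N={n**3:5d}  naive {t_naive:.2e}s  Kronecker {t_kron:.2e}s")
```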


Common Mistake: Inexact Kronecker Factorization in LMMSE

Mistake:

Assuming that $(\mathbf{A}^{H}\mathbf{A} + \alpha\mathbf{I})^{-1}$ always factors as a Kronecker product. In fact it factors exactly only in the trivial case $\alpha = 0$ (or for degenerate factors, e.g. Gram matrices that are scaled identities).

Correction:

For general $\alpha > 0$, the exact inverse does not factor as a Kronecker product: $(\mathbf{B} \otimes \mathbf{C} + \alpha\mathbf{I})^{-1} \neq (\mathbf{B} + \alpha_1\mathbf{I})^{-1} \otimes (\mathbf{C} + \alpha_2\mathbf{I})^{-1}$ in general. The workaround is the joint eigendecomposition: diagonalize each factor Gram matrix, then apply the inverse in the joint eigenbasis with entry-wise regularization $1/(\lambda_i\mu_j\nu_l + \alpha)$. This costs $O(N^{4/3})$ per application and is exact.

Mode-$k$ product

The multiplication of a tensor $\boldsymbol{\mathcal{X}} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$ by a matrix $\mathbf{M} \in \mathbb{R}^{m \times n_k}$ along its $k$-th mode, producing a tensor of size $n_1 \times \cdots \times m \times \cdots \times n_K$. The mode-$k$ product is the fundamental operation for exploiting Kronecker structure in matrix-vector products.

Related: {{Ref:Gloss Kronecker Product}}
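In NumPy terms, a mode-$k$ product is one tensordot plus an axis move (0-indexed modes; the helper name is illustrative):

```python
import numpy as np

def mode_k_product(X, M, k):
    """Multiply tensor X by matrix M (shape m x n_k) along mode k (0-indexed)."""
    return np.moveaxis(np.tensordot(M, X, axes=(1, k)), 0, k)

# A (2, 3, 4) tensor times a 5 x 3 matrix along mode 1 yields shape (2, 5, 4).
X = np.arange(24.0).reshape(2, 3, 4)
M = np.ones((5, 3))
assert mode_k_product(X, M, 1).shape == (2, 5, 4)
```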

Point-spread function (PSF)

The imaging system's response to a point scatterer. In the discrete setting, the PSF is encoded in the columns of the Gram matrix $\mathbf{G} = \mathbf{A}^{H}\mathbf{A}$: the $q$-th column is the image of a unit point scatterer at voxel $q$.

Related: {{Ref:Def Gram Psf}}
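With the matrix-free helpers sketched earlier, a PSF column can be probed without ever forming $\mathbf{G}$ (hypothetical helper name):

```python
import numpy as np

def psf_at_voxel(factors, q, N):
    """q-th column of G = A^H A: the image of a unit scatterer at voxel q."""
    e_q = np.zeros(N, dtype=complex)
    e_q[q] = 1.0
    return kron_adjoint_matvec(factors, kron_matvec(factors, e_q))
```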

Quick Check

For balanced Kronecker factors ($m_k \approx n_k \approx n$, three factors), what is the naive matvec complexity vs. the Kronecker matvec complexity?

Naive: $O(n^3)$, Kronecker: $O(n^2)$

Naive: $O(n^6)$, Kronecker: $O(n^4)$

Naive: $O(n^6)$, Kronecker: $O(n^3\log n)$

Both are $O(n^6)$

Key Takeaway

The Kronecker structure of $\mathbf{A}$ enters every reconstruction algorithm through the matvec $\mathbf{A}\mathbf{c}$ and the LMMSE solve $(\mathbf{A}^{H}\mathbf{A} + \alpha\mathbf{I})^{-1}$. For balanced factors, the matvec drops from $O(n^6)$ to $O(n^4)$ (or $O(n^3 \log n)$ with FFT), and the LMMSE solve drops from $O(n^9)$ to $O(n^3)$ precomputation plus $O(n^4)$ per application via the factored eigendecomposition. This transforms OAMP from minutes to milliseconds, making model-based iterative reconstruction practical for real-time RF imaging. The CommIT simulator implements this architecture.