Ferkans — Interactive Telecom Tutor

LASSO and Basis Pursuit for RF Imaging

This section connects the LASSO and Basis Pursuit recovery programs to practical RF image reconstruction. The algorithm derivations (ISTA/FISTA, ADMM) live in RFI Ch 04 and Telecom Ch 03; here we focus on the imaging-specific aspects: setting up the optimization for the RF sensing matrix $\mathbf{A}$, choosing the regularization parameter $\lambda$, handling complex-valued reflectivity, and debiasing the solution on the estimated support.

Definition:
Basis Pursuit for Noiseless RF Imaging

When the noise level is negligible (high $\text{SNR}$), solve the Basis Pursuit problem:

$\hat{\mathbf{c}}_{\text{BP}} = \arg\min_{\mathbf{c}} \|\mathbf{c}\|_1 \quad \text{s.t.} \quad \mathbf{A}\mathbf{c} = \mathbf{y}.$

This is a linear program (for real signals) or a second-order cone program (SOCP, for complex signals), solvable by interior-point methods at roughly $O(N^3)$ cost per iteration for a dense $\mathbf{A}$.

When to use BP:

Known exact forward model (no model mismatch).
Very high $\text{SNR}$ ($> 40$ dB).
Moderate problem size ($N \lesssim 10^4$).

For large-scale imaging, BP is too expensive; the LASSO solved by FISTA or ADMM is preferred.
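For real-valued signals, the linear-program form of Basis Pursuit can be sketched with a generic LP solver. The example below is a minimal illustration, not a production solver: it assumes a small i.i.d. Gaussian matrix as a stand-in for the RF sensing matrix and uses SciPy's `linprog`; the function name `basis_pursuit_real` and the toy scene are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def basis_pursuit_real(A, y):
    """Solve min ||c||_1 s.t. A c = y for real A and y as a linear program."""
    n = A.shape[1]
    # Split c = u - v with u, v >= 0; at the optimum ||c||_1 = sum(u) + sum(v).
    cost = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


# Toy stand-in for the sensing matrix (assumption: i.i.d. Gaussian, not an RF model).
rng = np.random.default_rng(0)
M, N, K = 40, 100, 5
A = rng.standard_normal((M, N)) / np.sqrt(M)
c_true = np.zeros(N)
c_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = A @ c_true  # noiseless measurements

c_bp = basis_pursuit_real(A, y)
print("constraint residual:", np.linalg.norm(A @ c_bp - y))
print("l1 norms (BP, truth):", np.abs(c_bp).sum(), np.abs(c_true).sum())
```

The variable split doubles the problem size to $2N$ unknowns, which is one reason the LP route becomes impractical at image scale.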

Definition:
LASSO and BPDN for Noisy RF Imaging

BPDN (constrained form):

$\hat{\mathbf{c}} = \arg\min_{\mathbf{c}} \|\mathbf{c}\|_1 \quad \text{s.t.} \quad \|\mathbf{A}\mathbf{c} - \mathbf{y}\|_2 \leq \epsilon,$

where $\epsilon$ is the noise budget ($\epsilon \approx \sqrt{M}\sigma$, with $\sigma$ the per-measurement noise standard deviation).

LASSO (penalized form):

$\hat{\mathbf{c}} = \arg\min_{\mathbf{c}} \frac{1}{2}\|\mathbf{A}\mathbf{c} - \mathbf{y}\|_2^2 + \lambda\|\mathbf{c}\|_1.$

Equivalence: For every $\epsilon > 0$, there exists a $\lambda > 0$ such that the BPDN and LASSO solutions coincide (and vice versa). The mapping $\epsilon \leftrightarrow \lambda$ depends on the data ($\mathbf{A}$ and $\mathbf{y}$) and is generally not available in closed form.

For RF imaging: The LASSO is almost always preferred because:

Efficient first-order solvers (FISTA, ADMM) are available.
$\lambda$ is easier to tune than $\epsilon$.
Same theoretical guarantees (§The MIMO Sensing Matrix).

Theorem: Complex Soft-Thresholding for RF Imaging

RF measurements are complex-valued. The proximal operator of $\lambda\|\cdot\|_1$ for complex vectors is the complex soft-thresholding:

$\mathcal{S}_{\lambda}(z) = \begin{cases} z\,(1 - \lambda/|z|) & |z| > \lambda, \\ 0 & |z| \leq \lambda. \end{cases}$

This shrinks the magnitude and preserves the phase.

Phase preservation is essential for coherent RF imaging where phase carries spatial information. Unlike real soft-thresholding (which simply shifts toward zero), complex soft-thresholding shrinks the complex magnitude while keeping the argument unchanged.
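The operator is a one-liner in array code. The following NumPy sketch (the function name `complex_soft_threshold` is chosen here for illustration) applies it entrywise and guards the division at $z = 0$:

```python
import numpy as np


def complex_soft_threshold(z, lam):
    """Shrink |z| by lam (flooring at zero) while leaving arg(z) unchanged."""
    z = np.asarray(z, dtype=complex)
    mag = np.abs(z)
    shrunk = np.maximum(mag - lam, 0.0)
    # Entries with mag == 0 map to 0 anyway; divide by 1 there to avoid 0/0.
    safe_mag = np.where(mag > 0, mag, 1.0)
    return z * (shrunk / safe_mag)


z = np.array([3 + 4j, 0.3 - 0.4j, 0.0, -2j])
out = complex_soft_threshold(z, lam=1.0)
print(out)
```

With $\lambda = 1$, the entry $3 + 4j$ (magnitude 5) becomes $2.4 + 3.2j$ (magnitude 4, same phase), the entry with magnitude 0.5 is set to zero, and $-2j$ becomes $-1j$.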

Proof

Wirtinger gradient of the data fidelity

$\nabla_{\mathbf{c}} f = \mathbf{A}^{H}(\mathbf{A}\mathbf{c} - \mathbf{y})$ — identical to the real case via Wirtinger calculus.

Proximal operator derivation

The proximal operator of $\lambda\|\cdot\|_1$ for complex $z$ is $\text{prox}_{\lambda|\cdot|}(z)$. Writing $z = |z|e^{j\phi}$, the minimizer of $\frac{1}{2}|u - z|^2 + \lambda|u|$ has $u = e^{j\phi}\max(|z| - \lambda, 0)$, yielding the stated formula.

Phase preservation

Since $\mathcal{S}_{\lambda}(z) = z(1 - \lambda/|z|)$ for $|z| > \lambda$, the phase $\arg(\mathcal{S}_{\lambda}(z)) = \arg(z)$ is preserved exactly.
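Combining the gradient $\mathbf{A}^{H}(\mathbf{A}\mathbf{c} - \mathbf{y})$ with complex soft-thresholding gives a proximal-gradient iteration for the complex LASSO. The sketch below is a minimal FISTA loop under stated assumptions: a dense random complex matrix stands in for the RF sensing matrix, the step size is $1/L$ with $L = \|\mathbf{A}\|_2^2$, and the names `soft`, `fista_lasso`, and `objective` are hypothetical. It is meant to show how the pieces fit together, not to replace the derivations in the algorithm chapters.

```python
import numpy as np


def soft(z, t):
    """Complex soft-thresholding: shrink the magnitude by t, keep the phase."""
    mag = np.abs(z)
    return z * (np.maximum(mag - t, 0.0) / np.where(mag > 0, mag, 1.0))


def objective(A, y, c, lam):
    return 0.5 * np.linalg.norm(A @ c - y) ** 2 + lam * np.abs(c).sum()


def fista_lasso(A, y, lam, n_iter=200):
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    c = np.zeros(A.shape[1], dtype=complex)
    w, t = c.copy(), 1.0
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ w - y)          # gradient at the extrapolated point
        c_new = soft(w - grad / L, lam / L)      # proximal (thresholding) step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        w = c_new + ((t - 1.0) / t_new) * (c_new - c)  # Nesterov extrapolation
        c, t = c_new, t_new
    return c


# Toy scene (assumption): 5 unit-magnitude scatterers with random phases.
rng = np.random.default_rng(1)
M, N, K = 64, 128, 5
A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
support = rng.choice(N, K, replace=False)
c_true = np.zeros(N, dtype=complex)
c_true[support] = np.exp(1j * rng.uniform(0, 2 * np.pi, K))
sigma = 0.01
noise = sigma * (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
y = A @ c_true + noise

lam = 0.05
c_hat = fista_lasso(A, y, lam)
print("objective at zero:", objective(A, y, np.zeros(N, dtype=complex), lam))
print("objective at c_hat:", objective(A, y, c_hat, lam))
```

For a structured sensing matrix, the two dense products with `A` and `A.conj().T` would be replaced by fast operators; the rest of the loop is unchanged.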

Definition:
Regularization Parameter Selection for RF Imaging

The choice of $\lambda$ critically affects reconstruction quality. Three principled selection strategies:

1. Discrepancy principle (known $\sigma$): Choose $\lambda$ such that $\|\mathbf{A}\hat{\mathbf{c}} - \mathbf{y}\|_2 = \sqrt{M}\sigma$. This matches the residual to the expected noise level.

2. Cross-validation (unknown $\sigma$): Hold out a random subset of measurements $\mathcal{I}$, solve on $\mathcal{I}^c$, and evaluate the prediction error on $\mathcal{I}$. Repeat for $K = 5$ folds and select the $\lambda$ minimizing the average prediction error.

3. SURE (Stein's Unbiased Risk Estimate): For the LASSO with Gaussian noise, $\text{SURE}(\lambda) = \|\mathbf{A}\hat{\mathbf{c}} - \mathbf{y}\|_2^2 - M\sigma^2 + 2\sigma^2\,\text{df}(\lambda)$, where $\text{df}(\lambda) = |\text{supp}(\hat{\mathbf{c}})|$ is the number of nonzero entries.

For radar imaging: The discrepancy principle is most reliable because the noise level $\sigma$ is typically known from the receiver noise figure.
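One way to apply the discrepancy principle in practice is to bisect on $\lambda$, using the fact that the LASSO residual norm grows with $\lambda$ and that the solution is identically zero for $\lambda \geq \|\mathbf{A}^{H}\mathbf{y}\|_\infty$. The sketch below makes several assumptions that are not from the text: a dense random complex matrix replaces the RF sensing matrix, $\sigma^2$ is taken as the variance per complex measurement, and the helper names (`soft`, `lasso_fista`, `discrepancy_lambda`) are hypothetical.

```python
import numpy as np


def soft(z, t):
    mag = np.abs(z)
    return z * (np.maximum(mag - t, 0.0) / np.where(mag > 0, mag, 1.0))


def lasso_fista(A, y, lam, n_iter=300):
    """Compact FISTA solver for the complex LASSO (fixed iteration count)."""
    L = np.linalg.norm(A, 2) ** 2
    c = np.zeros(A.shape[1], dtype=complex)
    w, t = c.copy(), 1.0
    for _ in range(n_iter):
        c_new = soft(w - A.conj().T @ (A @ w - y) / L, lam / L)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        w = c_new + ((t - 1.0) / t_new) * (c_new - c)
        c, t = c_new, t_new
    return c


def discrepancy_lambda(A, y, sigma, n_bisect=30):
    """Bisect on lambda until ||A c_hat - y||_2 matches sqrt(M) * sigma."""
    target = np.sqrt(A.shape[0]) * sigma
    lo, hi = 0.0, np.max(np.abs(A.conj().T @ y))  # at hi the solution is zero
    for _ in range(n_bisect):
        lam = 0.5 * (lo + hi)
        resid = np.linalg.norm(A @ lasso_fista(A, y, lam) - y)
        if resid > target:
            hi = lam  # residual too large: regularize less
        else:
            lo = lam  # residual below the noise level: regularize more
    return 0.5 * (lo + hi)


# Toy scene (assumption): 5 unit-magnitude scatterers, noise std sigma per measurement.
rng = np.random.default_rng(2)
M, N, K = 64, 128, 5
A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
c_true = np.zeros(N, dtype=complex)
c_true[rng.choice(N, K, replace=False)] = np.exp(1j * rng.uniform(0, 2 * np.pi, K))
sigma = 0.05
y = A @ c_true + sigma * (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

lam_dp = discrepancy_lambda(A, y, sigma)
resid_dp = np.linalg.norm(A @ lasso_fista(A, y, lam_dp) - y)
target = np.sqrt(M) * sigma
print("lambda:", lam_dp, "residual:", resid_dp, "target:", target)
```

Each bisection step costs one full LASSO solve, so in a larger problem it would be natural to warm-start the solver from the previous solution.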


Definition:
Debiasing on the Estimated Support

The LASSO solution $\hat{\mathbf{c}}_{\text{LASSO}}$ is biased: the $\ell_1$ penalty shrinks all nonzero amplitudes toward zero.

Debiasing procedure:

1. Estimate the support $\hat{S} = \{n : |\hat{c}_n| > \epsilon_{\text{thr}}\}$, where $\epsilon_{\text{thr}}$ is a small amplitude threshold.
2. Solve the least-squares problem restricted to $\hat{S}$:

$\hat{\mathbf{c}}_{\text{debias}} = \arg\min_{\mathbf{c}:\,\text{supp}(\mathbf{c}) \subseteq \hat{S}} \|\mathbf{A}\mathbf{c} - \mathbf{y}\|_2^2.$

This is equivalent to $\hat{\mathbf{c}}_{\hat{S}} = \mathbf{A}_{\hat{S}}^\dagger\mathbf{y}$, with zeros outside $\hat{S}$, where $\mathbf{A}_{\hat{S}}$ contains only the columns of $\mathbf{A}$ indexed by $\hat{S}$.

When to debias: Always for amplitude-critical applications (radar cross-section estimation, quantitative imaging). Not needed for pure detection (support recovery).
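The debiasing procedure amounts to a thresholding step followed by a small least-squares solve. In the sketch below, the function name `debias` is hypothetical, and the "LASSO output" is mimicked by shrinking the true amplitudes by 0.1, an assumption made only to keep the example short and self-contained.

```python
import numpy as np


def debias(A, y, c_lasso, thr):
    """Least-squares refit of y on the columns of A selected by |c_lasso| > thr."""
    support = np.flatnonzero(np.abs(c_lasso) > thr)
    c_db = np.zeros_like(c_lasso, dtype=complex)
    if support.size:
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        c_db[support] = sol
    return c_db, support


# Toy scene (assumption): dense random complex matrix, 5 unit-magnitude scatterers.
rng = np.random.default_rng(3)
M, N, K = 64, 128, 5
A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
c_true = np.zeros(N, dtype=complex)
c_true[rng.choice(N, K, replace=False)] = np.exp(1j * rng.uniform(0, 2 * np.pi, K))
sigma = 0.01
y = A @ c_true + sigma * (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

# Stand-in for a LASSO estimate: correct support, magnitudes shrunk from 1.0 to 0.9.
c_lasso = 0.9 * c_true

c_db, support = debias(A, y, c_lasso, thr=1e-3)
err_lasso = np.linalg.norm(c_lasso - c_true)
err_db = np.linalg.norm(c_db - c_true)
print("error before debiasing:", err_lasso)
print("error after debiasing: ", err_db)
```

The refit is only as good as the support estimate: a false alarm in $\hat{S}$ receives a least-squares amplitude too, so the threshold should be chosen with the detection requirements in mind.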

LASSO Reconstruction of Point Scatterers

Demonstrates LASSO reconstruction for an RF imaging scene with point scatterers. Vary $\lambda$ to observe the bias-variance tradeoff: too large kills weak scatterers (over-regularization); too small retains noise (under-regularization).

Left: True scene and LASSO reconstruction. Right: Support recovery and amplitude error vs. $\lambda$.

Parameters: $\lambda = 0.05$, SNR = 20 dB, number of scatterers = 8.

Example: LASSO Reconstruction for Compressed SAR

Setup: Stripmap SAR with 50% random frequency subsampling. $N = 128 \times 128 = 16384$ pixels, $M = 8192$ measurements. Scene: 25 point scatterers + 5 extended targets (buildings).

Results:

Method	NMSE (dB)	SSIM	Time (s)
Matched filter	$-6.3$	0.42	0.01
LASSO (FISTA, 100 iter)	$-18.7$	0.89	2.1
LASSO (ADMM, 40 iter)	$-19.1$	0.90	2.5
BP (SOCP)	$-19.3$	0.91	180

FISTA and ADMM achieve nearly the same quality as BP at roughly 70 to 90 $\times$ lower computational cost (2.1 s and 2.5 s vs. 180 s). The matched filter provides a quick preview but with significant sidelobe artifacts.

Solution

Key observation

FISTA and ADMM reach near-BP quality at a fraction of the cost because each iteration only requires two matrix-vector products ($\mathbf{A}\mathbf{c}$ and $\mathbf{A}^{H}\mathbf{r}$), which exploit Kronecker structure for $O(N_x N_y \log N_f)$ cost.
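The benefit of Kronecker structure can be illustrated with a generic two-factor model $\mathbf{A} = \mathbf{A}_x \otimes \mathbf{A}_y$, which is an assumption of this sketch rather than the specific SAR operator of the example. Both products needed by the solver reduce to small matrix multiplications on the reshaped image, so the full matrix is never formed. The factors here are dense; FFT-based factors would reduce the cost further. The names `kron_matvec` and `kron_rmatvec` are hypothetical.

```python
import numpy as np


def kron_matvec(Ax, Ay, c):
    """Compute (Ax kron Ay) @ c without forming the Kronecker product."""
    C = c.reshape(Ax.shape[1], Ay.shape[1])  # row-major image of the scene
    return (Ax @ C @ Ay.T).ravel()


def kron_rmatvec(Ax, Ay, r):
    """Compute (Ax kron Ay)^H @ r without forming the Kronecker product."""
    R = r.reshape(Ax.shape[0], Ay.shape[0])  # row-major measurement array
    return (Ax.conj().T @ R @ Ay.conj()).ravel()


# Small random complex factors so the explicit Kronecker product fits in memory.
rng = np.random.default_rng(4)
Ax = rng.standard_normal((6, 8)) + 1j * rng.standard_normal((6, 8))
Ay = rng.standard_normal((5, 7)) + 1j * rng.standard_normal((5, 7))
c = rng.standard_normal(8 * 7) + 1j * rng.standard_normal(8 * 7)
r = rng.standard_normal(6 * 5) + 1j * rng.standard_normal(6 * 5)

A_full = np.kron(Ax, Ay)  # reference only; avoided in practice
fwd_err = np.linalg.norm(kron_matvec(Ax, Ay, c) - A_full @ c)
adj_err = np.linalg.norm(kron_rmatvec(Ax, Ay, r) - A_full.conj().T @ r)
print("forward mismatch:", fwd_err)
print("adjoint mismatch:", adj_err)
```

The reshape convention matters: NumPy's `kron` pairs with row-major (`C`-order) flattening, which is why the forward product is `Ax @ C @ Ay.T` rather than the column-major form often quoted in textbooks.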

Debiasing effect

Applying debiasing to the FISTA solution improves NMSE from $-18.7$ to $-20.1$ dB by removing the $\ell_1$ shrinkage bias.

FISTA vs. ADMM vs. Coordinate Descent for Imaging

The imaging-specific sensing matrix $\mathbf{A}$ has structure (Kronecker, partial DFT) that makes some solvers more natural:

Solver	Cost/iter	Best when
FISTA	$2 \times$ matvec	$\mathbf{A}$ has fast matvec (Kronecker/FFT)
ADMM	Linear system + prox	Composite penalties ( $\ell_1$ + TV)
Coordinate descent	$O(M)$ per coordinate	$\mathbf{A}$ stored explicitly, moderate $N$

For RF imaging with Kronecker $\mathbf{A}$, FISTA is typically fastest for the pure LASSO. ADMM dominates for composite penalties because FISTA would require an inner proximal loop. Coordinate descent is rarely used because the matvec with $\mathbf{A}$ does not decompose coordinate-wise.

Common Mistake: $\lambda$ Selection Pitfalls in RF Imaging

Mistake:

Setting $\lambda$ by visual inspection or by using the universal threshold $\lambda = \sigma\sqrt{2\log N}$ without accounting for the sensing matrix structure. The universal threshold assumes an orthonormal $\mathbf{A}$, but the RF sensing matrix has non-uniform column norms and high coherence, so the effective threshold is different.

Correction:

Use the discrepancy principle when $\sigma$ is known (standard in radar). If $\sigma$ is unknown, use 5-fold cross-validation. Always validate the chosen $\lambda$ by inspecting the residual $\mathbf{A}\hat{\mathbf{c}} - \mathbf{y}$ for systematic structure: a residual that still contains scene-like structure indicates over-regularization (or model mismatch), whereas a residual well below the expected noise level indicates under-regularization.

Choosing the Sparsifying Basis for RF Imaging

The LASSO assumes sparsity in some domain. For RF imaging:

Basis $\boldsymbol{\Psi}$	Sparse for	LASSO becomes
Identity	Point scatterers	$\min\frac{1}{2}\|\mathbf{A}\mathbf{c}-\mathbf{y}\|_2^2 + \lambda\|\mathbf{c}\|_1$
Wavelet	Natural scenes with edges	$\min\frac{1}{2}\|\mathbf{A}\boldsymbol{\Psi}\boldsymbol{\theta}-\mathbf{y}\|_2^2 + \lambda\|\boldsymbol{\theta}\|_1$
DCT	Smooth scenes	Same form with DCT $\boldsymbol{\Psi}$
Learned dictionary	Complex scenes	Requires dictionary learning

Practical advice:

Start with the identity (canonical sparsity) for scenes dominated by point scatterers.
Use wavelets for scenes with extended targets.
Use TV (analysis sparsity, §Total Variation Reconstruction) for piecewise-constant scenes.
For mixed scenes, combine $\ell_1$ + TV via ADMM.

Quick Check

After solving the LASSO for an RF imaging problem, you obtain $\hat{\mathbf{c}}_{\text{LASSO}}$. Why might the estimated scatterer amplitudes be systematically too small?

The $\ell_1$ penalty shrinks all nonzero entries toward zero (shrinkage bias).

The sensing matrix $\mathbf{A}$ has too few rows.

The noise level is too high.

The grid resolution is too coarse.

Correction:

The $\ell_1$ penalty shrinks all nonzero entries toward zero (shrinkage bias).

The $\ell_1$ penalty introduces a bias: every nonzero amplitude is reduced by approximately $\lambda$. Debiasing by least-squares on the detected support removes this effect.
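The size of the shrinkage can be made explicit in the idealized case of orthonormal columns, $\mathbf{A}^{H}\mathbf{A} = \mathbf{I}$. This is an assumption made here for illustration only; it does not hold for a typical RF sensing matrix. In that case the LASSO decouples across coordinates, and with $\mathbf{z} = \mathbf{A}^{H}\mathbf{y}$ the solution is given entrywise by complex soft-thresholding:

$\hat{c}_n = \mathcal{S}_{\lambda}(z_n), \qquad |\hat{c}_n| = |z_n| - \lambda \quad \text{for } |z_n| > \lambda.$

For instance, with $\lambda = 0.05$ and $|z_n| = 1$, the LASSO amplitude is $0.95$, a 5% underestimate that the least-squares refit on the support removes. For a coherent sensing matrix the reduction is no longer exactly $\lambda$ per entry, which is why the statement above is only approximate.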

Historical Note: From Statistics to Radar Imaging

1996–2006

The LASSO was introduced by Robert Tibshirani in 1996 as a statistical regularization method for linear regression. Its adoption in signal processing and imaging was catalyzed by the compressed sensing revolution (Candes, Romberg, Tao, 2006; Donoho, 2006), which provided the theoretical guarantees showing that $\ell_1$ minimization could exactly recover sparse signals from incomplete measurements. Cetin and Karl (2001) were among the first to apply sparsity-promoting regularization to SAR imaging, demonstrating dramatic improvements over matched filter reconstruction.


LASSO

Least Absolute Shrinkage and Selection Operator: the penalized least-squares problem $\min \frac{1}{2}\|\mathbf{A}\mathbf{c} - \mathbf{y}\|_2^2 + \lambda\|\mathbf{c}\|_1$. Promotes sparsity through the $\ell_1$ penalty.

Related: Basis Pursuit, FISTA

Basis Pursuit

The $\ell_1$ minimization problem under equality constraints: $\min \|\mathbf{c}\|_1$ s.t. $\mathbf{A}\mathbf{c} = \mathbf{y}$. It is the limit of the LASSO as $\lambda \to 0^{+}$ in the noiseless case.

Related: LASSO

FISTA

Fast Iterative Shrinkage-Thresholding Algorithm: Nesterov-accelerated proximal gradient for the LASSO, achieving an $O(1/t^2)$ convergence rate in objective value vs. $O(1/t)$ for ISTA.

Related: LASSO

Debiasing

Post-processing step that removes the $\ell_1$ shrinkage bias by solving least-squares restricted to the estimated support.

Related: LASSO

Discrepancy Principle

Regularization parameter selection rule that chooses $\lambda$ so the residual norm matches the expected noise level: $\|\mathbf{A}\hat{\mathbf{c}} - \mathbf{y}\|_2 \approx \sqrt{M}\sigma$.

Key Takeaway

Basis Pursuit is the gold standard for noiseless recovery but too expensive ($O(N^3)$) for large-scale imaging. The LASSO solved by FISTA or ADMM achieves near-BP quality at orders-of-magnitude lower cost. Complex soft-thresholding preserves phase, essential for coherent RF imaging. The discrepancy principle is the recommended strategy for choosing $\lambda$ in radar imaging where the noise level is known. Always debias on the detected support when amplitude accuracy matters.

Why This Matters: LASSO in Wireless Channel Estimation

The same LASSO formulation used for RF image reconstruction appears in mmWave and sub-THz channel estimation (Telecom Ch 35), where the channel is sparse in the angle-delay domain. The sensing matrix is a partial DFT (pilot tones), and the LASSO recovers the channel taps. The debiasing step is equally important there to obtain accurate channel gain estimates for beamforming.

See full treatment in Chapter 35
