Compressed Sensing with Angular Sparsity

Can We Beat the Ο„pβ‰₯N\tau_p \geq N Barrier?

The least-squares lower bound Ο„pβ‰₯N\tau_p \geq N holds for a general channel with NNtN N_t unknowns. But most practical RIS channels are not general β€” they are sparse in the angular domain. In mmWave and sub-THz, the BS-RIS and RIS-UE links consist of a few dominant specular paths (LoS + maybe one or two reflections); there are far fewer "significant" angles than there are RIS elements. If we exploit this sparsity, compressed sensing lets us reduce the pilot count from Ο„p=O(N)\tau_p = \mathcal{O}(N) to Ο„p=O(Llog⁑N)\tau_p = \mathcal{O}(L \log N) β€” an exponential-in-NN reduction when Lβ‰ͺNL \ll N.

Definition:

Angular-Domain Sparse Cascaded Channel

Let AN∈CNΓ—N\mathbf{A}_N \in \mathbb{C}^{N \times N} and ANt∈CNtΓ—Nt\mathbf{A}_{N_t} \in \mathbb{C}^{N_t \times N_t} be DFT (or overcomplete) dictionaries for the RIS and BS apertures. Under the sparse-scattering model, the cascaded channel admits the factorization

G=AN X ANtH,\mathbf{G} = \mathbf{A}_N\, \mathbf{X}\, \mathbf{A}_{N_t}^H,

where the sparse coefficient matrix X∈CNΓ—Nt\mathbf{X} \in \mathbb{C}^{N \times N_t} has at most LL nonzero entries β€” one per (BS angle, RIS angle) scattering pair. Concretely, if the BS-RIS channel has L1L_1 paths and the RIS-UE channel has L2L_2 paths, then X\mathbf{X} has at most L=L1L2L = L_1 L_2 nonzeros.

The sparsity level LL is typically ≀10\leq 10 at mmWave and ≀4\leq 4 at sub-THz, compared with Nβ‰₯256N \geq 256. The savings from exploiting sparsity are substantial.

The dictionary choice matters: a DFT dictionary assumes grid-matched angles (on-grid). In practice, angles are off-grid, and an overcomplete or parametric dictionary is preferred. Methods like ANM (atomic-norm minimization) handle the off-grid case directly; see Section 4.4 of Wang et al. (2020) for details.

,

Theorem: Pilot-Count Scaling for Sparse Recovery

Suppose G\mathbf{G} has the sparse representation of Def. 4.5 with sparsity LL. Suppose the RIS pilot configurations Ο•(t)\boldsymbol{\phi}^{(t)} are drawn independently from a symmetric, unit-modulus distribution (e.g., uniform on the complex unit circle). Then with probability β‰₯1βˆ’Nβˆ’c\geq 1 - N^{-c} (for some constant c>0c > 0), the β„“1\ell_1-regularized estimator

X^=arg⁑min⁑Xβ€²βˆ₯Xβ€²βˆ₯1s.t.βˆ₯Yβˆ’G(Xβ€²)Ξ¦stackβˆ₯F≀ϡ\hat{\mathbf{X}} = \arg\min_{\mathbf{X}'} \|\mathbf{X}'\|_1 \quad \text{s.t.} \quad \|\mathbf{Y} - \mathbf{G}(\mathbf{X}') \boldsymbol{\Phi}^{\text{stack}}\|_F \leq \epsilon

recovers X\mathbf{X} exactly (up to noise) whenever

Ο„pβ‰₯C Llog⁑(N)Nt,\tau_p \geq C\, \frac{L \log(N)}{N_t},

for a universal constant CC depending on the RIP constant.

Compressed-sensing theory says: with a random measurement matrix satisfying the restricted isometry property (RIP) and sparsity level LL, successful recovery requires m=O(Llog⁑(N/L))m = \mathcal{O}(L \log(N/L)) measurements. For the RIS problem, the "measurement matrix" is built from random RIS configurations, and each pilot slot provides NtN_t measurements. The pilot length scales as Ο„p=O(Llog⁑N/Nt)\tau_p = \mathcal{O}(L \log N / N_t), dramatically less than the naive Ο„p=N\tau_p = N.

Compressed-Sensing RIS Channel Estimation

Complexity: O(Ο„pNtβ‹…NNt)O(\tau_p N_t \cdot N N_t) per iteration; O(log⁑(1/Ο΅))O(\log(1/\epsilon)) iterations to Ο΅\epsilon-precision
Input: Pilot observations Y∈CNtΓ—Ο„p\mathbf{Y} \in \mathbb{C}^{N_t \times \tau_p},
RIS configurations Ξ¦stack\boldsymbol{\Phi}^{\text{stack}}, angular dictionaries
AN,ANt\mathbf{A}_N, \mathbf{A}_{N_t}, regularization Ξ»>0\lambda > 0.
Output: estimated cascaded channel G^\hat{\mathbf{G}}.
1. Build the sensing matrix ACS=(Ξ¦stack)TANβŠ—ANtH\mathbf{A}_{\text{CS}} = (\boldsymbol{\Phi}^{\text{stack}})^T \mathbf{A}_N \otimes \mathbf{A}_{N_t}^H
of size Ο„pNtΓ—NNt\tau_p N_t \times N N_t.
2. Vectorize observations: y=vec(Y)\mathbf{y} = \text{vec}(\mathbf{Y}).
3. Solve the LASSO:
x^=arg⁑min⁑x12βˆ₯yβˆ’ACSxβˆ₯2+Ξ»βˆ₯xβˆ₯1\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \frac{1}{2}\|\mathbf{y} - \mathbf{A}_{\text{CS}}\mathbf{x}\|^2 + \lambda \|\mathbf{x}\|_1
using FISTA, ISTA, or OMP.
4. Reshape x^\hat{\mathbf{x}} to X^∈CNΓ—Nt\hat{\mathbf{X}} \in \mathbb{C}^{N \times N_t}.
5. return G^=ANX^ANtH\hat{\mathbf{G}} = \mathbf{A}_N \hat{\mathbf{X}} \mathbf{A}_{N_t}^H.

The bottleneck is the LASSO solve, which is O(NNt)\mathcal{O}(N N_t) per iteration. For very large NN, the Kronecker structure of ACS\mathbf{A}_{\text{CS}} allows O(NNtlog⁑N)\mathcal{O}(N N_t \log N) per iteration via FFT-based matrix-vector products. Greedy methods like OMP are often faster in practice for small LL.

Example: Overhead Savings at N=256N = 256, L=4L = 4

An N=256N = 256-element RIS operates in a mmWave scenario with L=4L = 4 dominant scatterers. BS has Nt=16N_t = 16 antennas. Compare the pilot overhead of DFT codebook vs. compressed sensing.

CS Accuracy vs. Pilot Count

Sweep pilot count Ο„p\tau_p and measure the LASSO estimation NMSE. At Ο„p∼Llog⁑N/Nt\tau_p \sim L \log N / N_t the NMSE drops sharply (the "phase transition"). Below this threshold, recovery is impossible; above it, NMSE is limited by noise only.

Parameters
128
4
15

Common Mistake: On-Grid Dictionaries Are Inaccurate

Mistake:

"Use the NN-point DFT as the angular dictionary β€” it covers all angles."

Correction:

The DFT dictionary samples NN angles on a uniform grid. Real scatterer angles are continuous and almost never fall exactly on the grid. An off-grid angle ΞΈ\theta spills its energy into multiple neighbouring dictionary atoms, breaking sparsity and degrading the LASSO recovery. Solutions: (i) overcomplete dictionary with Ξ³N\gamma N angles (Ξ³=2,4\gamma = 2, 4); (ii) atomic-norm minimization (ANM) which works in the continuous angular domain directly; (iii) Newtonian refinement after a coarse on-grid solve. Without one of these, expect ~3 dB of unnecessary MSE.

Why CS Favors Certain RIS Hardware

Compressed sensing shines when the RIS can realize random unit-modulus configurations β€” ideally i.i.d. uniform on the complex unit circle. This is easy for continuous-phase (varactor + DAC) hardware. For 1-bit RIS, the random pattern is Bernoulli Β±1\pm 1, which is still good for CS (it has the RIP with high probability) but slightly worse constants. For 2-3 bit RIS, the effective codebook is small but the random sampling still produces a well-conditioned sensing matrix. In general, higher phase resolution helps CS more than it helps DFT β€” a second-order argument in favor of continuous-phase hardware for CSI-limited scenarios.

πŸŽ“CommIT Contribution(2022)

Hierarchical Codebook Search for Array-Fed RIS

G. Caire, I. Atzeni β€” IEEE Trans. Wireless Commun. (preprint)

In the array-fed RIS architecture (Chapter 11), the BS-RIS channel H1\mathbf{H}_1 is a short-range near-field link, while the RIS-UE channel h2\mathbf{h}_2 is long-range and potentially sparse. Caire and collaborators develop a hierarchical codebook search that exploits this asymmetry: the BS-RIS channel is estimated once via a high-resolution pilot design (because it is slowly varying), while the UE-specific RIS-UE channels are estimated at the coherence rate using compressed-sensing on a reduced subspace carved out of H1\mathbf{H}_1's eigenspace. The scheme achieves near-DFT accuracy with ∼O(KLlog⁑N)\sim \mathcal{O}(K L \log N) pilots for KK users β€” an order-of-magnitude saving for large K,NK, N. This is the channel-estimation foundation of the multi-user array-fed RIS architecture.

array-fed-riscompressed-sensingmulti-user