Ferkans — Interactive Telecom Tutor

Can We Beat the $\tau_p \geq N$ Barrier?

The least-squares lower bound $\tau_p \geq N$ holds for a general channel with $N N_t$ unknowns. But most practical RIS channels are not general — they are sparse in the angular domain. In mmWave and sub-THz, the BS-RIS and RIS-UE links consist of a few dominant specular paths (LoS + maybe one or two reflections); there are far fewer "significant" angles than there are RIS elements. If we exploit this sparsity, compressed sensing lets us reduce the pilot count from $\tau_p = \mathcal{O}(N)$ to $\tau_p = \mathcal{O}(L \log N)$ — an exponential-in- $N$ reduction when $L \ll N$ .

Definition: Angular-Domain Sparse Cascaded Channel

Let $\mathbf{A}_N \in \mathbb{C}^{N \times N}$ and $\mathbf{A}_{N_t} \in \mathbb{C}^{N_t \times N_t}$ be DFT (or overcomplete) dictionaries for the RIS and BS apertures. Under the sparse-scattering model, the cascaded channel admits the factorization

$\mathbf{G} = \mathbf{A}_N\, \mathbf{X}\, \mathbf{A}_{N_t}^H,$

where the sparse coefficient matrix $\mathbf{X} \in \mathbb{C}^{N \times N_t}$ has at most $L$ nonzero entries — one per (BS angle, RIS angle) scattering pair. Concretely, if the BS-RIS channel has $L_1$ paths and the RIS-UE channel has $L_2$ paths, then $\mathbf{X}$ has at most $L = L_1 L_2$ nonzeros.

The sparsity level $L$ is typically $\leq 10$ at mmWave and $\leq 4$ at sub-THz, compared with $N \geq 256$, so $L/N$ is at most a few percent and the potential pilot savings from exploiting sparsity are substantial.

The dictionary choice matters: a DFT dictionary assumes grid-matched angles (on-grid). In practice, angles are off-grid, and an overcomplete or parametric dictionary is preferred. Methods like ANM (atomic-norm minimization) handle the off-grid case directly; see Section 4.4 of Wang et al. (2020) for details.
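The factorization in the definition above can be sketched numerically. The following is a minimal illustration, assuming unitary on-grid DFT dictionaries and i.i.d. complex Gaussian path gains; the helper names `dft_dictionary` and `sparse_cascaded_channel` and the sizes are hypothetical choices, not part of the original text.

```python
import numpy as np

def dft_dictionary(n):
    """Unitary n-point DFT matrix; column k is the steering vector of grid angle k/n."""
    idx = np.arange(n)
    return np.exp(2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def sparse_cascaded_channel(N, Nt, L, rng):
    """Draw X with exactly L nonzeros and return (G, X) with G = A_N X A_Nt^H."""
    A_N, A_Nt = dft_dictionary(N), dft_dictionary(Nt)
    X = np.zeros((N, Nt), dtype=complex)
    # L distinct (RIS angle, BS angle) pairs chosen uniformly at random (assumption)
    flat_idx = rng.choice(N * Nt, size=L, replace=False)
    rows, cols = np.unravel_index(flat_idx, (N, Nt))
    # complex Gaussian path gains (assumption)
    X[rows, cols] = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
    G = A_N @ X @ A_Nt.conj().T
    return G, X

rng = np.random.default_rng(0)
N, Nt, L = 64, 8, 4
G, X = sparse_cascaded_channel(N, Nt, L, rng)

# With unitary dictionaries the angular-domain coefficients are recovered exactly.
X_back = dft_dictionary(N).conj().T @ G @ dft_dictionary(Nt)
print("nonzeros in X:", np.count_nonzero(X))
print("round trip ok:", np.allclose(X_back, X))
```

Although $\mathbf{G}$ is a dense $N \times N_t$ matrix in the element domain, it is fully described by the $L$ nonzero entries of $\mathbf{X}$ and their positions, which is what the estimator exploits.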


Theorem: Pilot-Count Scaling for Sparse Recovery

Suppose $\mathbf{G}$ has the sparse representation of Def. 4.5 with sparsity $L$ . Suppose the RIS pilot configurations $\boldsymbol{\phi}^{(t)}$ are drawn independently from a symmetric, unit-modulus distribution (e.g., uniform on the complex unit circle). Then with probability $\geq 1 - N^{-c}$ (for some constant $c > 0$ ), the $\ell_1$ -regularized estimator

$\hat{\mathbf{X}} = \arg\min_{\mathbf{X}'} \|\mathbf{X}'\|_1 \quad \text{s.t.} \quad \|\mathbf{Y} - \mathbf{G}(\mathbf{X}') \boldsymbol{\Phi}^{\text{stack}}\|_F \leq \epsilon$

recovers $\mathbf{X}$ exactly (up to noise) whenever

$\tau_p \geq C\, \frac{L \log(N)}{N_t},$

for a constant $C$ that depends only on the RIP constant.

Compressed-sensing theory says: with a random measurement matrix satisfying the restricted isometry property (RIP) and sparsity level $L$ , successful recovery requires $m = \mathcal{O}(L \log(N/L))$ measurements. For the RIS problem, the "measurement matrix" is built from random RIS configurations, and each pilot slot provides $N_t$ measurements. The pilot length scales as $\tau_p = \mathcal{O}(L \log N / N_t)$ , dramatically less than the naive $\tau_p = N$ .

Proof

Restricted isometry of random RIS pilots

Random unit-modulus $\boldsymbol{\phi}^{(t)}$ drawn from the uniform distribution on the complex torus satisfy the RIP with constant $\delta_L$ when $\tau_p = \Omega(L \log(N/L))$ — this is a standard result in CS for random bounded designs (see Candes and Tao 2005, Foucart and Rauhut 2013).

LASSO recovery guarantee

Under RIP with $\delta_{2L} < \sqrt{2} - 1$ , $\ell_1$ minimization recovers the sparse vector $\mathbf{X}$ from $\mathbf{Y} = \text{vec}(\mathbf{G})\boldsymbol{\Phi}^{\text{stack}} + \mathbf{W}$ with error $\|\hat{\mathbf{X}} - \mathbf{X}\| \leq C \epsilon$ (Candes 2008).

Dividing by parallel BS streams

Each pilot slot gives $N_t$ independent measurement equations (one per BS antenna). So the effective measurement count is $m = \tau_p N_t$ , requiring $\tau_p N_t \geq C L \log(N)$ , i.e., $\tau_p \geq C L \log(N)/N_t$ . $\blacksquare$

Compressed-Sensing RIS Channel Estimation

Complexity: $\mathcal{O}(\tau_p N_t \cdot N N_t)$ per iteration; $\mathcal{O}(\log(1/\epsilon))$ iterations to $\epsilon$-precision.

Input: Pilot observations $\mathbf{Y} \in \mathbb{C}^{N_t \times \tau_p}$, RIS configurations $\boldsymbol{\Phi}^{\text{stack}}$, angular dictionaries $\mathbf{A}_N, \mathbf{A}_{N_t}$, regularization $\lambda > 0$.

Output: estimated cascaded channel $\hat{\mathbf{G}}$.

1. Build the sensing matrix $\mathbf{A}_{\text{CS}} = \big((\boldsymbol{\Phi}^{\text{stack}})^T \mathbf{A}_N\big) \otimes \mathbf{A}_{N_t}^H$ of size $\tau_p N_t \times N N_t$.

2. Vectorize observations: $\mathbf{y} = \text{vec}(\mathbf{Y})$.

3. Solve the LASSO: $\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \frac{1}{2}\|\mathbf{y} - \mathbf{A}_{\text{CS}}\mathbf{x}\|^2 + \lambda \|\mathbf{x}\|_1$ using FISTA, ISTA, or OMP.

4. Reshape $\hat{\mathbf{x}}$ to $\hat{\mathbf{X}} \in \mathbb{C}^{N \times N_t}$.

5. return $\hat{\mathbf{G}} = \mathbf{A}_N \hat{\mathbf{X}} \mathbf{A}_{N_t}^H$.
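The five steps above can be sketched end to end on a small synthetic problem. This is a minimal illustration, not a reference implementation: it assumes unitary DFT dictionaries, continuous random phases, a simulated measurement $\mathbf{y} = \mathbf{A}_{\text{CS}}\mathbf{x} + \mathbf{w}$, a row-major ordering of $\mathbf{x}$ chosen to match the Kronecker product, and a heuristic $\lambda$. The sizes and the helper names `dft_dictionary`, `soft_threshold`, and `fista_lasso` are hypothetical.

```python
import numpy as np

def dft_dictionary(n):
    idx = np.arange(n)
    return np.exp(2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def soft_threshold(z, t):
    """Complex soft-thresholding: shrink magnitudes by t, keep phases."""
    mag = np.abs(z)
    return z * np.maximum(1.0 - t / np.maximum(mag, 1e-12), 0.0)

def fista_lasso(A, y, lam, n_iter=500):
    """Minimize 0.5*||y - A x||^2 + lam*||x||_1 with FISTA."""
    lip = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    v, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ v - y)
        x_new = soft_threshold(v - grad / lip, lam / lip)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        v = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(1)
N, Nt, tau_p, L = 64, 4, 32, 4                   # toy sizes (assumption)
A_N, A_Nt = dft_dictionary(N), dft_dictionary(Nt)

# Ground truth: L unit-magnitude coefficients at random angular positions.
X = np.zeros((N, Nt), dtype=complex)
rows, cols = np.unravel_index(rng.choice(N * Nt, L, replace=False), (N, Nt))
X[rows, cols] = np.exp(2j * np.pi * rng.random(L))

# Step 1: sensing matrix from random unit-modulus RIS configurations.
Phi = np.exp(2j * np.pi * rng.random((N, tau_p)))
A_cs = np.kron(Phi.T @ A_N, A_Nt.conj().T)       # (tau_p*Nt) x (N*Nt)

# Step 2: vectorized observations, simulated here with light noise.
x_true = X.ravel()                               # row-major ordering (assumption)
noise = 0.01 * (rng.standard_normal(tau_p * Nt) + 1j * rng.standard_normal(tau_p * Nt))
y = A_cs @ x_true + noise

# Step 3: LASSO via FISTA, with lambda set as a fraction of max |A^H y| (heuristic).
lam = 0.02 * np.max(np.abs(A_cs.conj().T @ y))
x_hat = fista_lasso(A_cs, y, lam)

# Steps 4-5: reshape and map back to the element domain.
X_hat = x_hat.reshape(N, Nt)
G_hat = A_N @ X_hat @ A_Nt.conj().T
G = A_N @ X @ A_Nt.conj().T
nmse = np.linalg.norm(G_hat - G) ** 2 / np.linalg.norm(G) ** 2
print(f"tau_p = {tau_p} (vs N = {N}), NMSE = {nmse:.4f}")
```

The $\ell_1$ penalty biases the recovered magnitudes slightly toward zero; a common follow-up, not shown here, is a least-squares refit on the detected support.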

The bottleneck is the LASSO solve: with a dense $\mathbf{A}_{\text{CS}}$, the matrix-vector products cost $\mathcal{O}(\tau_p N_t \cdot N N_t)$ per iteration. For very large $N$, the Kronecker structure of $\mathbf{A}_{\text{CS}}$ allows $\mathcal{O}(N N_t \log N)$ per iteration via FFT-based matrix-vector products. Greedy methods like OMP are often faster in practice for small $L$.
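To illustrate how the Kronecker structure avoids forming $\mathbf{A}_{\text{CS}}$ explicitly, the sketch below applies the identity $(\mathbf{B} \otimes \mathbf{C})\,\text{vec}(\mathbf{Z}) = \text{vec}(\mathbf{C}\mathbf{Z}\mathbf{B}^T)$ with column-major vectorization. Here `B` and `C` are random stand-ins for $(\boldsymbol{\Phi}^{\text{stack}})^T \mathbf{A}_N$ and $\mathbf{A}_{N_t}^H$, and `kron_matvec` is a hypothetical helper name. It uses plain matrix products; replacing the DFT factors with FFT calls would be a further step not shown here.

```python
import numpy as np

rng = np.random.default_rng(2)
N, Nt, tau_p = 32, 4, 8                          # toy sizes (assumption)

# Random stand-ins for the two Kronecker factors.
B = rng.standard_normal((tau_p, N)) + 1j * rng.standard_normal((tau_p, N))
C = rng.standard_normal((Nt, Nt)) + 1j * rng.standard_normal((Nt, Nt))
x = rng.standard_normal(N * Nt) + 1j * rng.standard_normal(N * Nt)

def kron_matvec(B, C, x):
    """Compute (B kron C) @ x without forming the Kronecker product.

    Uses (B kron C) vec(Z) = vec(C Z B^T) with column-major vec,
    where Z has shape (C.shape[1], B.shape[1]).
    """
    Z = x.reshape(C.shape[1], B.shape[1], order="F")
    return (C @ Z @ B.T).ravel(order="F")

dense = np.kron(B, C) @ x                        # cost ~ tau_p*Nt * N*Nt
fast = kron_matvec(B, C, x)                      # cost ~ Nt*Nt*N + Nt*N*tau_p
print("match:", np.allclose(dense, fast))
```

The same reshaping trick applies to the adjoint product $\mathbf{A}_{\text{CS}}^H \mathbf{r}$ needed in each gradient step, using $\mathbf{C}^H$ and the complex conjugate of $\mathbf{B}$.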

Example: Overhead Savings at $N = 256$ , $L = 4$

An $N = 256$ -element RIS operates in a mmWave scenario with $L = 4$ dominant scatterers. BS has $N_t = 16$ antennas. Compare the pilot overhead of DFT codebook vs. compressed sensing.

Solution

DFT codebook overhead

$\tau_p^{\text{DFT}} = N = 256$ pilot slots. In a $T = 500$ symbol coherence block, $256/500 = 51.2\%$ of the block is pilots.

CS overhead

With $C \approx 2$ (conservative constant), the logarithm taken base 2, and $\tau_p \geq C L \log_2(N) / N_t$: $\tau_p^{\text{CS}} = 2 \cdot 4 \cdot \log_2(256) / 16 = 8 \cdot 8 / 16 = 4$ slots. In practice, one needs a safety factor; typical deployments use $\tau_p \sim 4 L \log_2 N$ without the $N_t$ denominator, giving $\tau_p = 4 \cdot 4 \cdot 8 = 128$. Even so, this is a $2\times$ saving over DFT.

Tradeoff

CS overhead is smaller ( $\sim 128$ vs. $256$ ), but solving the LASSO at inference time requires perhaps $\sim 10\,\text{ms}$ of compute — acceptable offline but tight for real-time update. DFT trades pilot time for compute simplicity; CS trades compute for pilot efficiency.
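The arithmetic in this example can be reproduced with a few lines of Python. The sketch assumes a base-2 logarithm (which is what yields $\log(256) = 8$) and takes the constants $C = 2$ and the safety factor 4 from the example; `cs_pilot_bound` is a hypothetical helper name.

```python
import math

def cs_pilot_bound(L, N, Nt, C=2.0):
    """Pilot slots from tau_p >= C * L * log2(N) / Nt, rounded up to an integer."""
    return math.ceil(C * L * math.log2(N) / Nt)

N, L, Nt, T = 256, 4, 16, 500

tau_dft = N                                      # one slot per RIS element
tau_cs_bound = cs_pilot_bound(L, N, Nt)          # theoretical scaling with C = 2
tau_cs_practical = math.ceil(4 * L * math.log2(N))   # safety factor 4, no Nt division

for name, tau in [("DFT", tau_dft), ("CS bound", tau_cs_bound), ("CS practical", tau_cs_practical)]:
    print(f"{name:13s} tau_p = {tau:3d}  overhead = {100 * tau / T:.1f}% of T = {T}")
```

Changing `Nt` or the safety factor shows how sensitive the headline saving is to these two choices.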

CS Accuracy vs. Pilot Count

Sweep the pilot count $\tau_p$ and measure the NMSE of the LASSO estimate. Around $\tau_p \sim L \log N / N_t$ the NMSE drops sharply (the "phase transition"). Below this threshold, sparse recovery fails with high probability; above it, the NMSE is limited by noise only.

Parameters: RIS elements $N = 128$; sparsity $L = 4$; pilot SNR 15 dB.
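A rough offline version of this sweep can be sketched as follows. It uses the parameters listed above ($N = 128$, $L = 4$, 15 dB), but makes several simplifying assumptions that are not from the original: a single measurement stream ($N_t = 1$), on-grid sparsity, a random unit-modulus sensing matrix, and OMP with known $L$ as a stand-in for the LASSO solver. The function names `omp` and `nmse_at` are hypothetical.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: pick k atoms greedily, refit by least squares."""
    residual, support = y.copy(), []
    for _ in range(k):
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0                  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x

def nmse_at(tau_p, N=128, L=4, snr_db=15.0, trials=30, seed=0):
    """Average NMSE over random sparse vectors and random sensing matrices."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        # random unit-modulus sensing matrix with unit-norm columns
        A = np.exp(2j * np.pi * rng.random((tau_p, N))) / np.sqrt(tau_p)
        x = np.zeros(N, dtype=complex)
        x[rng.choice(N, L, replace=False)] = np.exp(2j * np.pi * rng.random(L))
        clean = A @ x
        sigma2 = np.mean(np.abs(clean) ** 2) / 10 ** (snr_db / 10)
        noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(tau_p) + 1j * rng.standard_normal(tau_p))
        x_hat = omp(A, clean + noise, L)
        total += np.linalg.norm(x_hat - x) ** 2 / np.linalg.norm(x) ** 2
    return total / trials

pilot_counts = [4, 8, 16, 32, 64, 96]
curve = {tau: nmse_at(tau) for tau in pilot_counts}
for tau, val in curve.items():
    print(f"tau_p = {tau:3d}  NMSE = {10 * np.log10(val):6.1f} dB")
```

With too few pilots the greedy solver fits the measurements with the wrong atoms and the error stays large; once the pilot count is comfortably above the transition, the error settles near the noise floor.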

Common Mistake: On-Grid Dictionaries Are Inaccurate

Mistake:

"Use the $N$ -point DFT as the angular dictionary — it covers all angles."

Correction:

The DFT dictionary samples $N$ angles on a uniform grid. Real scatterer angles are continuous and almost never fall exactly on the grid. An off-grid angle $\theta$ spills its energy into multiple neighbouring dictionary atoms, breaking sparsity and degrading the LASSO recovery. Solutions: (i) an overcomplete dictionary with $\gamma N$ angles ($\gamma = 2, 4$); (ii) atomic-norm minimization (ANM), which works in the continuous angular domain directly; (iii) Newton-type refinement after a coarse on-grid solve. Without one of these, expect ~3 dB of unnecessary MSE.
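The leakage effect can be seen in a small numerical sketch. It assumes a uniform linear aperture with steering vectors parameterized by a normalized spatial frequency $f$ and compares an on-grid angle, a half-bin off-grid angle, and a $\gamma = 2$ overcomplete dictionary. The helper names and the choice $N = 64$ are hypothetical.

```python
import numpy as np

def steering(N, f):
    """Unit-norm steering vector with normalized spatial frequency f (cycles per element)."""
    return np.exp(2j * np.pi * f * np.arange(N)) / np.sqrt(N)

def dictionary(N, gamma=1):
    """Angular dictionary with gamma*N atoms on a uniform frequency grid."""
    grid = np.arange(gamma * N) / (gamma * N)
    return np.stack([steering(N, f) for f in grid], axis=1)

def top1_energy(A, a):
    """Largest fraction of the energy of a captured by a single atom of A."""
    return float(np.max(np.abs(A.conj().T @ a) ** 2))

def n_significant(A, a, thresh=0.01):
    """Number of atoms that each capture more than thresh of the energy of a."""
    return int(np.sum(np.abs(A.conj().T @ a) ** 2 > thresh))

N = 64
on_grid = steering(N, 10 / N)                    # exactly on DFT bin 10
off_grid = steering(N, 10.5 / N)                 # half a bin off the DFT grid

A_dft = dictionary(N, gamma=1)
A_over = dictionary(N, gamma=2)                  # twice as many atoms

print("DFT, on-grid :", top1_energy(A_dft, on_grid), n_significant(A_dft, on_grid))
print("DFT, off-grid:", top1_energy(A_dft, off_grid), n_significant(A_dft, off_grid))
print("2x overcomplete, off-grid:", top1_energy(A_over, off_grid))
```

In the worst case of a half-bin offset, the strongest DFT atom captures only about $(2/\pi)^2 \approx 40\%$ of the energy, while the $\gamma = 2$ grid happens to contain this particular angle exactly. A generic angle would still be off-grid for any finite $\gamma$, only less so.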

Why CS Favors Certain RIS Hardware

Compressed sensing shines when the RIS can realize random unit-modulus configurations, ideally i.i.d. uniform on the complex unit circle. This is easy for continuous-phase (varactor + DAC) hardware. For 1-bit RIS, the random pattern is Bernoulli $\pm 1$, which is still good for CS (it has the RIP with high probability) but comes with slightly worse constants. For 2-3 bit RIS, the effective codebook is small but the random sampling still produces a well-conditioned sensing matrix. In general, higher phase resolution helps CS more than it helps DFT, which is a second-order argument in favor of continuous-phase hardware for CSI-limited scenarios.
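As a small illustration of the hardware options discussed here, the sketch below draws random pilot configurations for continuous-phase and $b$-bit RIS and reports the mutual coherence of the resulting sensing matrix. The function names `random_ris_pilots` and `mutual_coherence`, the sizes, and the use of mutual coherence as a rough proxy for sensing quality are assumptions for illustration; the RIP itself is not checked.

```python
import numpy as np

def random_ris_pilots(N, tau_p, bits=None, rng=None):
    """Draw tau_p random unit-modulus RIS configurations (one per column).

    bits=None -> continuous phase, uniform on [0, 2*pi)
    bits=b    -> phase uniform on a 2**b-point grid (b=1 gives Bernoulli +/-1)
    """
    rng = np.random.default_rng() if rng is None else rng
    if bits is None:
        phase = 2 * np.pi * rng.random((N, tau_p))
    else:
        levels = 2 ** bits
        phase = 2 * np.pi * rng.integers(0, levels, size=(N, tau_p)) / levels
    return np.exp(1j * phase)

def mutual_coherence(Phi):
    """Largest normalized inner product between distinct columns of Phi^T."""
    S = Phi.T / np.linalg.norm(Phi.T, axis=0)    # tau_p x N, unit-norm columns
    gram = np.abs(S.conj().T @ S)
    np.fill_diagonal(gram, 0.0)
    return float(gram.max())

rng = np.random.default_rng(3)
N, tau_p = 64, 32                                # toy sizes (assumption)
results = {}
for label, bits in [("continuous", None), ("1-bit", 1), ("2-bit", 2), ("3-bit", 3)]:
    Phi = random_ris_pilots(N, tau_p, bits=bits, rng=rng)
    results[label] = (Phi, mutual_coherence(Phi))
    print(f"{label:10s} coherence = {results[label][1]:.3f}")
```

A single random draw at these sizes will not separate the hardware classes reliably; averaging over many draws and larger apertures would be needed before reading anything into the differences.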

🎓 CommIT Contribution (2022)

Hierarchical Codebook Search for Array-Fed RIS

G. Caire, I. Atzeni — IEEE Trans. Wireless Commun. (preprint)

In the array-fed RIS architecture (Chapter 11), the BS-RIS channel $\mathbf{H}_1$ is a short-range near-field link, while the RIS-UE channel $\mathbf{h}_2$ is long-range and potentially sparse. Caire and collaborators develop a hierarchical codebook search that exploits this asymmetry: the BS-RIS channel is estimated once via a high-resolution pilot design (because it is slowly varying), while the UE-specific RIS-UE channels are estimated at the coherence rate using compressed-sensing on a reduced subspace carved out of $\mathbf{H}_1$ 's eigenspace. The scheme achieves near-DFT accuracy with $\sim \mathcal{O}(K L \log N)$ pilots for $K$ users — an order-of-magnitude saving for large $K, N$ . This is the channel-estimation foundation of the multi-user array-fed RIS architecture.

