Ferkans — Interactive Telecom Tutor

The Massive-Access Bottleneck

Modern IoT and massive machine-type communication (mMTC) scenarios involve a huge population of potential devices (think $K_{\text{total}} \sim 10^4$), of which only $K_a \ll K_{\text{total}}$ are active at any given time. Classical access protocols rely on handshakes, requests, and grants. With $10^4$ devices this overhead is prohibitive: the coordination itself consumes more airtime than the data. The key observation is that "who is transmitting?" is a sparse-recovery question. If we give each device a unique pilot sequence and observe a single superimposed uplink signal, the unknown activity vector is $K_a$-sparse in $K_{\text{total}}$ dimensions, and compressed-sensing machinery identifies the active set from $M \sim K_a \log(K_{\text{total}}/K_a)$ channel uses.

Definition: Activity Vector and Massive-Access Model

Let $K_{\text{total}}$ denote the population of potential users and, at a given coherence block, let $K_a$ of them transmit. Assign to user $k$ the unit-norm pilot signature $\boldsymbol{\phi}_k \in \mathbb{C}^M$ and collect them as columns of $\boldsymbol{\Phi} \in \mathbb{C}^{M \times K_{\text{total}}}$. Let $x_k \in \mathbb{C}$ encode user $k$'s transmitted symbol (zero if inactive). The base station observes $\mathbf{y} = \boldsymbol{\Phi}\,\mathbf{x} + \mathbf{w},\qquad \mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\, \mathbf{I}_M).$ The activity vector $\mathbf{x}$ is $K_a$-sparse, and activity detection is the problem of recovering its support $\mathcal{S} = \{k : x_k \neq 0\}$.

When the base station is additionally equipped with $N_r$ receive antennas, the model becomes a multiple-measurement vector (MMV) problem: $\mathbf{Y} = \boldsymbol{\Phi}\mathbf{X} + \mathbf{W}$ with $\mathbf{X} \in \mathbb{C}^{K_{\text{total}} \times N_r}$ row-sparse, because the same users are active across all receive antennas.
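The model above can be simulated in a few lines. The following is a minimal sketch, not a reference implementation: the function name `simulate_massive_access`, the default sizes, the unit-variance Gaussian entries on active rows, and the convention that SNR equals $1/\sigma^2$ are all assumptions made for illustration.

```python
import numpy as np

def simulate_massive_access(K_total=200, K_a=10, M=40, N_r=1, snr_db=10.0, seed=0):
    """Draw one realisation of Y = Phi X + W with a row-sparse X (illustrative)."""
    rng = np.random.default_rng(seed)

    # i.i.d. complex Gaussian pilots, normalised to unit norm per column
    Phi = rng.standard_normal((M, K_total)) + 1j * rng.standard_normal((M, K_total))
    Phi /= np.linalg.norm(Phi, axis=0, keepdims=True)

    # Random active set S of size K_a
    support = np.sort(rng.choice(K_total, size=K_a, replace=False))

    # Row-sparse X: unit-variance complex Gaussian entries on active rows only
    X = np.zeros((K_total, N_r), dtype=complex)
    X[support] = (rng.standard_normal((K_a, N_r))
                  + 1j * rng.standard_normal((K_a, N_r))) / np.sqrt(2)

    # Assumed convention: unit signal power, so sigma^2 = 1 / SNR
    sigma2 = 10 ** (-snr_db / 10)
    W = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, N_r))
                               + 1j * rng.standard_normal((M, N_r)))

    Y = Phi @ X + W
    return Phi, X, Y, support, sigma2
```

With `N_r=1` this reduces to the single-measurement model $\mathbf{y} = \boldsymbol{\Phi}\mathbf{x} + \mathbf{w}$; with `N_r > 1` the nonzero rows of `X` are exactly the rows indexed by `support`, which is the row-sparsity the MMV formulation exploits.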

Theorem: Sample Complexity of Activity Detection

Suppose pilots $\boldsymbol{\phi}_k$ are i.i.d. $\mathcal{CN}(\mathbf{0}, \mathbf{I}_M/M)$ and active symbols have power bounded below by $P_{\min}$. There is a constant $c > 0$ such that if $M \geq c\, K_a \log\!\left(\frac{K_{\text{total}}}{K_a}\right) \cdot \frac{\sigma^2}{P_{\min}},$ then the LASSO or non-negative LS (NNLS) activity detector correctly identifies $\mathcal{S}$ with probability at least $1 - K_{\text{total}}^{-1}$.

Pilot length scales logarithmically in $K_{\text{total}}$ and linearly in $K_a$, multiplied by a standard inverse-SNR factor. For $K_{\text{total}} = 10^4$ and $K_a = 100$, ignoring the constant $c$ and the SNR factor, this is $M \sim 100 \log 100 \approx 460$ channel uses, versus $K_{\text{total}} = 10^4$ for a deterministic orthogonal scheme: a roughly $20\times$ saving.
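The arithmetic behind this comparison is easy to reproduce. The sketch below evaluates only the scaling term $K_a \log(K_{\text{total}}/K_a)$ with a natural logarithm; the helper name `cs_pilot_scaling` is hypothetical, and the constant $c$ and the SNR factor are deliberately left out.

```python
import math

def cs_pilot_scaling(K_total, K_a):
    """Scaling term K_a * ln(K_total / K_a); constant c and SNR factor omitted."""
    return K_a * math.log(K_total / K_a)

K_total, K_a = 10_000, 100
m_cs = cs_pilot_scaling(K_total, K_a)   # about 460 channel uses
saving = K_total / m_cs                 # orthogonal pilots need K_total uses

print(f"CS scaling: {m_cs:.0f} channel uses, saving about {saving:.1f}x")
```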

Proof

Reduce to support recovery

Consistent support recovery for LASSO requires the irrepresentable condition plus a minimum-signal-strength scaling $|x_k| \gtrsim \sqrt{\sigma^2 \log K_{\text{total}}/M}$ (Wainwright, 2009).

Gaussian pilots satisfy IRC

Wainwright showed that random Gaussian pilots satisfy the irrepresentable condition with high probability whenever $M \geq c\, K_a \log(K_{\text{total}}/K_a)$.

Combine minimum-signal and IRC bounds

Inserting $|x_k|^2 \geq P_{\min}$ into the signal-strength requirement gives $M \gtrsim (\sigma^2/P_{\min}) \log K_{\text{total}}$. Combining this with the IRC scaling yields the stated bound. $\blacksquare$


🎓 CommIT Contribution (2021)

Unsourced Random Access (Fengler-Haghighatshoar-Jung-Caire)

A. Fengler, S. Haghighatshoar, P. Jung, G. Caire — IEEE Transactions on Information Theory, vol. 67, no. 5, pp. 2925-2951

In unsourced random access the base station only wants to recover the list of transmitted messages, not who sent them. Fengler, Haghighatshoar, Jung, and Caire showed that this problem maps onto a massive-MIMO activity-detection problem where the "users" are codewords. They proved that a covariance-based detector combined with a large-scale-fading estimator achieves the information-theoretic scaling laws of Polyanskiy's finite-blocklength bounds, without needing per-user identification. This work is one of the cornerstones of the CommIT group's research line on massive access and directly motivates 3GPP's Release-17 RedCap and ambient-IoT study items.


🎓 CommIT Contribution (2022)

Coded Compressed Sensing for Unsourced MAC

V. K. Amalladinne, A. K. Pradhan, C. Rush, J.-F. Chamberland, K. R. Narayanan, G. Caire — IEEE Transactions on Information Theory, vol. 68, no. 4, pp. 2384-2409

Coded compressed sensing splits a long $B$-bit message into $L$ chunks, each mapped to a CS codeword, and stitches the chunks together via an outer tree code. This reduces the per-slot dictionary size from $2^B$ to $2^{B/L}$, making CS computationally feasible even for modest payloads. The analysis, built around a joint AMP + belief-propagation decoder, is a CommIT-driven direction that fuses CS recovery with coding theory. Chapter 15 treats this as the canonical scalable architecture for unsourced access.
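To get a feel for the dictionary-size reduction, the sketch below splits a bit string into equal chunks and compares $2^B$ with $2^{B/L}$. The values $B = 90$ and $L = 6$ and both helper names are hypothetical, and the parity bits that the outer tree code adds to each chunk are omitted, so a real per-slot dictionary would be somewhat larger.

```python
def chunk_message(bits, L):
    """Split a bit string into L equal-length chunks (outer-code parity omitted)."""
    if len(bits) % L != 0:
        raise ValueError("message length must be divisible by L")
    size = len(bits) // L
    return [bits[i * size:(i + 1) * size] for i in range(L)]

def dictionary_sizes(B, L):
    """Return (unchunked, per-slot) dictionary sizes: 2^B versus 2^(B/L)."""
    if B % L != 0:
        raise ValueError("B must be divisible by L")
    return 2 ** B, 2 ** (B // L)

B, L = 90, 6                        # hypothetical payload and chunk count
full, per_slot = dictionary_sizes(B, L)
chunks = chunk_message("01" * (B // 2), L)

print(f"unchunked dictionary: 2^{B} = {full:.3e} columns")
print(f"per-slot dictionary:  2^{B // L} = {per_slot} columns")
```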

coded-csunsourcedampcommitView Paper →

Covariance-Based Activity Detection (Non-Bayesian)

Complexity: $O(K_{\text{total}} M^2)$ per coordinate-descent sweep; converges in tens of sweeps.

Input: Pilot book $\boldsymbol{\Phi} \in \mathbb{C}^{M \times K_{\text{total}}}$, multi-antenna observation $\mathbf{Y} \in \mathbb{C}^{M \times N_r}$

Output: Estimated activity/LSFC vector $\hat{\boldsymbol{\gamma}} \in \mathbb{R}_+^{K_{\text{total}}}$

1. Form sample covariance $\widehat{\boldsymbol{\Sigma}}_y \leftarrow \tfrac{1}{N_r} \mathbf{Y}\mathbf{Y}^H$

2. Model $\boldsymbol{\Sigma}_y = \sum_k \gamma_k \boldsymbol{\phi}_k \boldsymbol{\phi}_k^H + \sigma^2\mathbf{I}_M$

3. Solve ML: $\hat{\boldsymbol{\gamma}} \leftarrow \arg\max_{\boldsymbol{\gamma} \geq 0}\ -\log\det \boldsymbol{\Sigma}_y(\boldsymbol{\gamma}) - \mathrm{tr}(\boldsymbol{\Sigma}_y(\boldsymbol{\gamma})^{-1} \widehat{\boldsymbol{\Sigma}}_y)$

4. Use coordinate descent: at each step update one $\gamma_k$ via the closed-form quadratic root

5. Threshold $\hat{\gamma}_k$ to declare activity

The covariance-based detector does not reconstruct $\mathbf{X}$; it estimates only the large-scale fading power $\gamma_k$ of each user. As $N_r \to \infty$ the estimator is consistent even when $K_a > M$, the regime where $\ell_1$-based MMV methods fail.
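The five steps above can be sketched in NumPy as follows. This is one possible reading of the algorithm under stated assumptions, not the authors' code: the function name `covariance_cd`, the random sweep order, the fixed sweep count, and the assumption that $\sigma^2$ is known are all illustrative. The per-coordinate step $d = (b - a)/a^2$, clipped so that $\gamma_k$ stays non-negative, comes from setting the derivative of $\log\det\boldsymbol{\Sigma}_y + \mathrm{tr}(\boldsymbol{\Sigma}_y^{-1}\widehat{\boldsymbol{\Sigma}}_y)$ along $\boldsymbol{\phi}_k\boldsymbol{\phi}_k^H$ to zero, and the inverse is refreshed with a Sherman-Morrison rank-one update.

```python
import numpy as np

def covariance_cd(Phi, Y, sigma2, n_sweeps=20, seed=0):
    """Coordinate-descent ML estimate of gamma >= 0 (illustrative sketch)."""
    M, K = Phi.shape
    N_r = Y.shape[1]
    Sigma_hat = (Y @ Y.conj().T) / N_r            # step 1: sample covariance
    gamma = np.zeros(K)
    Sigma_inv = np.eye(M, dtype=complex) / sigma2  # step 2 with gamma = 0
    rng = np.random.default_rng(seed)

    for _ in range(n_sweeps):
        for k in rng.permutation(K):               # step 4: one coordinate at a time
            phi = Phi[:, k]
            u = Sigma_inv @ phi                    # Sigma^{-1} phi_k
            a = np.real(phi.conj() @ u)            # phi^H Sigma^{-1} phi
            b = np.real(u.conj() @ Sigma_hat @ u)  # phi^H Sigma^{-1} Sigma_hat Sigma^{-1} phi
            d = max((b - a) / a**2, -gamma[k])     # closed-form step, keeps gamma_k >= 0
            gamma[k] += d
            # Sherman-Morrison update of Sigma^{-1} after adding d * phi phi^H
            Sigma_inv -= d * np.outer(u, u.conj()) / (1.0 + d * a)
    return gamma
```

Each coordinate update costs $O(M^2)$, so one sweep over all users costs $O(K_{\text{total}} M^2)$. Step 5 is then a comparison such as `gamma_hat > threshold`, where the threshold is a design parameter that trades missed detections against false alarms.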

Activity-Detection ROC

Sweep $K_a$, pilot length $M$, and $K_{\text{total}}$ and watch the ROC move. When $M$ falls below the threshold in the theorem above, the curve collapses onto the diagonal and activity detection fails.
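An empirical ROC of this kind can be computed from any per-user detection score by sweeping the decision threshold. The sketch below is generic and makes no claim about how the interactive plot is produced: `empirical_roc` is a hypothetical helper, `scores` could be estimated powers $\hat{\gamma}_k$ or matched-filter energies, and both active and inactive users are assumed to be present.

```python
import numpy as np

def empirical_roc(scores, is_active):
    """Return (P_fa, P_d) arrays obtained by lowering a threshold over the scores."""
    scores = np.asarray(scores, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)

    # Sort users from highest to lowest score; each prefix is one threshold setting
    order = np.argsort(-scores)
    hits = np.cumsum(is_active[order])            # active users declared active
    false_alarms = np.cumsum(~is_active[order])   # inactive users declared active

    p_d = np.concatenate(([0.0], hits / is_active.sum()))
    p_fa = np.concatenate(([0.0], false_alarms / (~is_active).sum()))
    return p_fa, p_d

# Toy example: two active users with clearly higher scores than two inactive ones
p_fa, p_d = empirical_roc([0.9, 0.1, 0.8, 0.2], [True, False, True, False])
```

A detector that separates the two groups perfectly reaches $P_d = 1$ while $P_{fa}$ is still 0, as in the toy example; scores that carry no information about activity trace out the diagonal.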

Parameters (defaults): $K_{\text{total}} = 200$, active users $K_a = 10$, pilot length $M = 40$, SNR $= 10$ dB.

Many Users, Few Active: Activity Detection

Animated view of $K_{\text{total}}$ potential users collapsing onto a short pilot observation and the sparse-recovery "lens" revealing only the $K_a$ active ones.

The receiver observes a length-$M$ superimposed signal; compressed-sensing recovery returns the $K_a$-sparse activity vector.

Example: Sizing Pilot Length for mMTC

A base station serves $K_{\text{total}} = 5000$ devices, of which at most $K_a = 50$ are active per coherence block. Active users transmit at SNR 5 dB. Estimate the minimum pilot length $M$ needed for reliable activity detection.

Solution

Logarithmic factor

$\log(K_{\text{total}}/K_a) = \log(100) \approx 4.6$.

SNR factor

Linear SNR is $10^{0.5} \approx 3.16$, so $\sigma^2/P_{\min} \approx 1/3.16 \approx 0.32$.

Apply the bound

With constant $c \approx 4$ (typical for Gaussian pilots): $M \geq 4 \times 50 \times 4.6 \times 0.32 \approx 294$. A pilot length of $M \approx 300$ suffices, about $6\%$ of the brute-force orthogonal requirement $M \geq K_{\text{total}} = 5000$.

Operational meaning

Within a 1-ms coherence block that carries $\approx 1400$ symbols, $300$ can be used for activity detection and the rest for payload. By contrast, the orthogonal scheme would need $5000$ pilot symbols and cannot fit in the block.
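The solution steps can be checked numerically. The sketch below assumes the same bound $M \geq c\,K_a \log(K_{\text{total}}/K_a)\,\sigma^2/P_{\min}$ with $c = 4$ and a natural logarithm; the helper name `min_pilot_length` is hypothetical. Without rounding the intermediate factors the result is about 291, slightly below the 294 obtained from the rounded values 4.6 and 0.32, so the conclusion $M \approx 300$ is unchanged.

```python
import math

def min_pilot_length(K_total, K_a, snr_db, c=4.0):
    """Evaluate c * K_a * ln(K_total / K_a) / SNR_linear (illustrative constant c)."""
    snr_linear = 10 ** (snr_db / 10)       # 5 dB -> about 3.16
    return c * K_a * math.log(K_total / K_a) / snr_linear

M_min = min_pilot_length(K_total=5000, K_a=50, snr_db=5.0)
block_symbols = 1400                        # symbols per coherence block, from the example

print(f"M >= {M_min:.0f}; fits in block: {M_min <= block_symbols}")
print(f"orthogonal pilots fit in block: {5000 <= block_symbols}")
```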

🔧Engineering Note

Grant-Free Access in 3GPP

3GPP Release-17 RedCap (Reduced Capability) and Release-18 Ambient IoT study items introduce grant-free uplink transmission precisely to reduce the coordination overhead that activity detection solves information-theoretically. The dominant academic candidate for the physical-layer detector is the Fengler-Caire covariance method above, now cited in multiple 3GPP RAN1 contributions.

📋 Ref: 3GPP TR 38.869 (Ambient IoT)

Common Mistake: Collisions Are Not Failures

Mistake:

Treating two users with similar pilot signatures as an error of the CS detector.

Correction:

In unsourced random access, near-collisions are inherent — two codewords (not users) may land close in sensing space. The outer tree code stitches chunks and disambiguates, and the final performance metric is per-message error, not per-codeword. The CS layer is allowed to output a list with modest "extra" entries.

Unsourced random access

A massive-access model in which the base station seeks only the set of messages transmitted, not the identities of the transmitters. Performance is measured by the per-message error probability under a per-user energy constraint. Introduced by Polyanskiy (2017), developed into a practical receiver framework by the CommIT group.

Multiple-measurement vector (MMV)

An extension of CS in which several measurement vectors share the same sparse support. Appears naturally in massive-MIMO activity detection (one measurement per receive antenna) and in joint channel estimation across subcarriers.

Key Takeaway

Massive-access activity detection is the prototypical communications application of sparse recovery. Pilot length scales like $K_a \log(K_{\text{total}}/K_a)$, and the covariance-based detector of Fengler, Haghighatshoar, Jung, and Caire is consistent even when $K_a > M$, provided the BS has enough receive antennas.

Why This Matters: Connection to NOMA and Grant-Free Uplink

Non-orthogonal multiple access (NOMA) and grant-free uplink are the standardization descendants of the massive-access theory developed here. In both, users transmit without explicit resource allocation; the receiver untangles them by exploiting sparsity (few simultaneous transmitters) and the CS detectors of this section.

Quick Check

For $K_{\text{total}} = 10^4$ potential users and $K_a = 100$ active at any time, which pilot length most closely matches the CS bound?

$M \approx 10$

$M \approx 100$

$M \approx 500$

$M \approx 10000$