Activity Detection in Massive Access

The Massive-Access Bottleneck

Modern IoT and massive-machine-type communication (mMTC) scenarios involve a huge population of potential devices β€” think Ktotal∼104K_{\text{total}} \sim 10^4 β€” of which only a tiny fraction Kaβ‰ͺKtotalK_a \ll K_{\text{total}} is active at any given time. Classical access protocols rely on handshakes, requests, and grants. With 10410^4 devices, this overhead is prohibitive: the coordination itself consumes more airtime than the data. The point is that "who is transmitting?" is a sparse-recovery question. If we give each device a unique pilot sequence and observe a single superimposed uplink, the unknown activity vector is KaK_a-sparse in KtotalK_{\text{total}} dimensions, and the compressed-sensing machinery identifies the active set from M∼Kalog⁑(Ktotal/Ka)M \sim K_a \log(K_{\text{total}}/K_a) channel uses.

Definition:

Activity Vector and Massive-Access Model

Let KtotalK_{\text{total}} denote the population of potential users and, at a given coherence block, let KaK_a of them transmit. Assign to user kk the unit-norm pilot signature Ο•k∈CM\boldsymbol{\phi}_k \in \mathbb{C}^M and collect them as columns of Φ∈CMΓ—Ktotal\boldsymbol{\Phi} \in \mathbb{C}^{M \times K_{\text{total}}}. Let xk∈Cx_k \in \mathbb{C} encode user kk's transmitted symbol (zero if inactive). The base station observes y=Φ x+w,w∼CN(0,Οƒ2 IM).\mathbf{y} = \boldsymbol{\Phi}\,\mathbf{x} + \mathbf{w},\qquad \mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\, \mathbf{I}_M). The activity vector x\mathbf{x} is KaK_a-sparse, and activity detection is the problem of recovering its support S={k:xkβ‰ 0}\mathcal{S} = \{k : x_k \neq 0\}.

When users are additionally equipped with NrN_r receive antennas, the model becomes a multiple-measurement vector (MMV) problem: Y=Ξ¦X+W\mathbf{Y} = \boldsymbol{\Phi}\mathbf{X} + \mathbf{W} with X\mathbf{X} row-sparse β€” the same users are active across all receive antennas.

Theorem: Sample Complexity of Activity Detection

Suppose pilots Ο•k\boldsymbol{\phi}_k are i.i.d.\ CN(0,IM/M)\mathcal{CN}(\mathbf{0}, \mathbf{I}_M/M) and active symbols have power bounded below by Pmin⁑P_{\min}. There is a constant c>0c > 0 such that if Mβ‰₯c Kalog⁑ ⁣(KtotalKa)β‹…Οƒ2Pmin⁑,M \geq c\, K_a \log\!\left(\frac{K_{\text{total}}}{K_a}\right) \cdot \frac{\sigma^2}{P_{\min}}, then the LASSO or non-negative LS (NNLS) activity detector correctly identifies S\mathcal{S} with probability β‰₯1βˆ’Ktotalβˆ’1\geq 1 - K_{\text{total}}^{-1}.

Pilot length scales logarithmically in KtotalK_{\text{total}} and linearly in KaK_a, plus a standard inverse-SNR factor. For Ktotal=104K_{\text{total}} = 10^4 and Ka=100K_a = 100, this is M∼100log⁑100β‰ˆ460M \sim 100 \log 100 \approx 460 channel uses, versus Ktotal=104K_{\text{total}} = 10^4 for a deterministic orthogonal scheme β€” a roughly 20Γ—20\times saving.

,
πŸŽ“CommIT Contribution(2021)

Unsourced Random Access (Fengler-Haghighatshoar-Jung-Caire)

A. Fengler, S. Haghighatshoar, P. Jung, G. Caire β€” IEEE Transactions on Information Theory, vol. 67, no. 5, pp. 2925-2951

In unsourced random access the base station only wants to recover the list of transmitted messages, not who sent them. Fengler, Haghighatshoar, Jung, and Caire showed that this problem maps onto a massive-MIMO activity-detection problem where the "users" are codewords. They proved that a covariance-based detector combined with a large-scale-fading estimator achieves the information-theoretic scaling laws of Polyanskiy's finite-blocklength bounds, without needing per-user identification. This work is one of the cornerstones of the CommIT group's research line on massive access and directly motivates 3GPP's Release-17 RedCap and ambient-IoT study items.

unsourced-random-accessmassive-mimomMTCcommitView Paper β†’
πŸŽ“CommIT Contribution(2022)

Coded Compressed Sensing for Unsourced MAC

V. K. Amalladinne, A. K. Pradhan, C. Rush, J.-F. Chamberland, K. R. Narayanan, G. Caire β€” IEEE Transactions on Information Theory, vol. 68, no. 4, pp. 2384-2409

Coded compressed sensing splits long messages into chunks, each mapped to a CS codeword, and stitches the chunks together via an outer tree code. This reduces the per-slot dictionary size from 2B2^B to 2B/L2^{B/L}, making CS computationally feasible even for modest payloads. The analysis β€” a joint AMP + belief-propagation decoder β€” is a CommIT-driven direction that fuses CS recovery with coding theory. Chapter 15 treats this as the canonical scalable architecture for unsourced access.

coded-csunsourcedampcommitView Paper β†’

Covariance-Based Activity Detection (Non-Bayesian)

Complexity: O(KexttotalM2)O(K_{ ext{total}} M^2) per coordinate-descent sweep; converges in tens of sweeps.
Input: Pilot book Φ∈CMΓ—Ktotal\boldsymbol{\Phi} \in \mathbb{C}^{M \times K_{\text{total}}}, multi-antenna observation Y∈CMΓ—Nr\mathbf{Y} \in \mathbb{C}^{M \times N_r}
Output: Estimated activity/LSFC vector γ^∈R+Ktotal\hat{\boldsymbol{\gamma}} \in \mathbb{R}_+^{K_{\text{total}}}
1. Form sample covariance Ξ£^y←1NrYYH\widehat{\boldsymbol{\Sigma}}_y \leftarrow \tfrac{1}{N_r} \mathbf{Y}\mathbf{Y}^H
2. Model Ξ£y=βˆ‘kΞ³kΟ•kΟ•kH+Οƒ2IM\boldsymbol{\Sigma}_y = \sum_k \gamma_k \boldsymbol{\phi}_k \boldsymbol{\phi}_k^H + \sigma^2\mathbf{I}_M
3. Solve ML: Ξ³^←arg⁑max⁑γβ‰₯0Β βˆ’log⁑det⁑Σy(Ξ³)βˆ’tr(Ξ£y(Ξ³)βˆ’1Ξ£^y)\hat{\boldsymbol{\gamma}} \leftarrow \arg\max_{\boldsymbol{\gamma} \geq 0}\ -\log\det \boldsymbol{\Sigma}_y(\boldsymbol{\gamma}) - \mathrm{tr}(\boldsymbol{\Sigma}_y(\boldsymbol{\gamma})^{-1} \widehat{\boldsymbol{\Sigma}}_y)
4. Use coordinate descent: at each step update one Ξ³k\gamma_k via the closed-form quadratic root
5. Threshold Ξ³^k\hat{\gamma}_k to declare activity

The covariance-based detector does not reconstruct X\mathbf{X}; it estimates only the large-scale fading power Ξ³k\gamma_k of each user. As Nrβ†’βˆžN_r \to \infty the estimator is consistent even when Ka>MK_a > M β€” the regime where β„“1\ell_1-based MMV methods fail.

Activity-Detection ROC

Sweep KaK_a, pilot length MM, and KtotalK_{\text{total}} and watch the ROC move. When MM falls below the threshold in the theorem above, the curve collapses onto the diagonal β€” activity detection fails.

Parameters
200
10
40
10

Many Users, Few Active: Activity Detection

Animated view of KtotalK_{\text{total}} potential users collapsing onto a short pilot observation and the sparse-recovery "lens" revealing only the KaK_a active ones.
The receiver observes a length-MM superimposed signal; compressed-sensing recovery returns the KaK_a-sparse activity vector.

Example: Sizing Pilot Length for mMTC

A base station serves Ktotal=5000K_{\text{total}} = 5000 devices, of which at most Ka=50K_a = 50 are active per coherence block. Active users transmit at SNR 5 dB. Estimate the minimum pilot length MM needed for reliable activity detection.

πŸ”§Engineering Note

Grant-Free Access in 3GPP

3GPP Release-17 RedCap (Reduced Capability) and Release-18 Ambient IoT study items introduce grant-free uplink transmission precisely to reduce the coordination overhead that activity detection solves information-theoretically. The dominant academic candidate for the physical-layer detector is the Fengler-Caire covariance method above, now cited in multiple 3GPP RAN1 contributions.

πŸ“‹ Ref: 3GPP TR 38.869 (Ambient IoT)

Common Mistake: Collisions Are Not Failures

Mistake:

Treating two users with similar pilot signatures as an error of the CS detector.

Correction:

In unsourced random access, near-collisions are inherent β€” two codewords (not users) may land close in sensing space. The outer tree code stitches chunks and disambiguates, and the final performance metric is per-message error, not per-codeword. The CS layer is allowed to output a list with modest "extra" entries.

Unsourced random access

A massive-access model in which the base station seeks only the set of messages transmitted, not the identities of the transmitters. Performance is measured by the per-message error probability under a per-user energy constraint. Introduced by Polyanskiy (2017), developed into a practical receiver framework by the CommIT group.

Related: Unsourced Random Access (Fengler-Haghighatshoar-Jung-Caire), Coded Compressed Sensing for Unsourced MAC

Multiple-measurement vector (MMV)

An extension of CS in which several measurement vectors share the same sparse support. Appears naturally in massive-MIMO activity detection (one measurement per receive antenna) and in joint channel estimation across subcarriers.

Related: Activity Vector and Massive-Access Model

Key Takeaway

Massive-access activity detection is the prototypical communications application of sparse recovery. Pilot length scales like Kalog⁑(Ktotal/Ka)K_a \log(K_{\text{total}}/K_a), and the covariance-based detector of Fengler, Haghighatshoar, Jung, and Caire is consistent even when Ka>MK_a > M provided the BS has enough receive antennas.

Why This Matters: Connection to NOMA and Grant-Free Uplink

Non-orthogonal multiple access (NOMA) and grant-free uplink are the standardization descendants of the massive-access theory developed here. In both, users transmit without explicit resource allocation; the receiver untangles them by exploiting sparsity (few simultaneous transmitters) and the CS detectors of this section.

Quick Check

For Ktotal=104K_{\text{total}} = 10^4 potential users and Ka=100K_a = 100 active at any time, which pilot length most closely matches the CS bound?

Mβ‰ˆ10M \approx 10

Mβ‰ˆ100M \approx 100

Mβ‰ˆ500M \approx 500

Mβ‰ˆ10000M \approx 10000