Ferkans — Interactive Telecom Tutor

Why a New Metric?

Delivery time depends on SNR: at higher SNR you deliver faster. To compare cache-aided and baseline architectures fairly, we need to normalize out the SNR dependence. The result is the normalized delivery time (NDT), $\Delta$ , which measures delivery time in units of the reference baseline time. NDT = 1 means the cache- aided architecture delivers as fast as an infinite-fronthaul baseline; NDT > 1 means slower; NDT < 1 is impossible (baseline is the best possible).

This metric, introduced by Sengupta-Tandon-Simeone (2017) and refined by the CommIT group in many settings, is the workhorse of C-RAN caching analysis. It isolates the cache-fronthaul tradeoff from the underlying SNR and channel details.

Definition:
Normalized Delivery Time

For a cache-aided C-RAN with delivery time $T(\text{SNR})$ and reference delivery time $T_\text{ref}(\text{SNR}) = K/\log_2 \text{SNR}$ (baseline single-user MU-MIMO), the normalized delivery time is $\Delta(M, C_F) \;\triangleq\; \lim_{\text{SNR} \to \infty} \frac{T(\text{SNR})}{T_\text{ref}(\text{SNR})}.$ NDT = 1 means "delivers as fast as MU-MIMO with infinite fronthaul." NDT > 1 means "slower — the fronthaul/cache bottleneck dominates." NDT < 1 is impossible.

The NDT is a function of two key parameters: the per-EN cache $M$ and the per-EN fronthaul $C_F$ .

NDT is a dimensionless ratio, analogous to the DoF in Chapter 5 — it isolates the scaling-invariant tradeoff. At high SNR, the absolute delivery time is $\Delta \cdot T_\text{ref}$ , capturing both the SNR-dependent link rate and the architectural overhead.

Theorem: Fundamental NDT Bounds

For a cloud-RAN with $N_\text{EN}$ ENs, cache $M$ per EN, fronthaul $C_F$ per EN, $K$ users, and library $N$ : $\max\!\left(1,\; \frac{K(1-\mu)}{N_\text{EN} C_F} + \frac{K(1-\mu)}{N_\text{EN} \cdot \min(L_\text{EN}, K)}\right) \leq \Delta(M, C_F) \leq ?$ Roughly: $\mathrm{NDT} = \max(1, K(1-\mu)/(N_{\text{EN}} C) + K(1-\mu)/(N_{\text{EN}}))$ , separating the fronthaul-limited term from the downlink-limited term.

Delivery time is governed by two bottlenecks: (i) the fronthaul must carry $(1-\mu) K$ files' worth of data to the ENs; (ii) the downlink must carry the same amount to users. NDT is dominated by the slower.

When $C \to \infty$ : NDT = $K/N_\text{EN}/L_\text{EN}$ (downlink limited). When $M \to N$ : NDT = 1 (baseline reached). When $C \to 0$ : NDT $\to \infty$ unless $M = N$ .

Proof

Fronthaul contribution

Each EN needs $(1 - \mu)$ files' worth of content delivered from cloud (the rest is cached). Per-EN fronthaul time: $K (1-\mu)/(N_\text{EN} C_F)$ per channel use.

Downlink contribution

The $N_\text{EN}$ ENs cooperatively deliver $K$ files to $K$ users via MIMO BC with effective antenna count $L_\text{eff} = L_\text{EN} \cdot N_\text{EN}$ . Delivery time: $K(1 - \mu)/(N_\text{EN} \cdot \min(L_\text{EN}, K))$ .

Total NDT

NDT is the sum of fronthaul and downlink contributions, lower- bounded by 1 (baseline). The derivation in Sengupta-Tandon- Simeone '17 gives the precise bounds.

Saturation

At $\mu = 1$ : fully cached; no fronthaul needed; NDT = 1. At $\mu = 0$ , $C = \infty$ : NDT = $K/(N_\text{EN} L_\text{EN})$ (MU-MIMO baseline).

NDT vs Fronthaul Capacity

NDT $\Delta$ as a function of per-EN fronthaul capacity $C$ , for fixed cache size and varying memory ratios. At low $C$ , NDT is large (fronthaul bottleneck); it decreases as $C$ grows, hitting 1 when fronthaul is abundant. Higher cache $\mu$ shifts the curve down — cache reduces the fronthaul demand.

Parameters

Users K4

Edge nodes4

Memory ratio M/N0.3

Example: NDT Computation for a 5G Small Cell

A small-cell C-RAN has $N_\text{EN} = 2$ ENs, $K = 8$ users, $L_\text{EN} = 4$ antennas each, fronthaul $C = 2$ files/use, cache $M/N = 0.25$ . Compute the NDT and identify the bottleneck.

Solution

Fronthaul contribution

$K(1-\mu)/(N_\text{EN} C) = 8 \cdot 0.75/(2 \cdot 2) = 1.5$ .

Downlink contribution

$K(1-\mu)/(N_\text{EN} \cdot L_\text{EN}) = 8 \cdot 0.75/(2 \cdot 4) = 0.75$ .

Total NDT

$\Delta = \max(1, 1.5 + 0.75) = 2.25$ . Above the baseline of 1.

Bottleneck

Fronthaul (1.5) > downlink (0.75). Increasing fronthaul $C$ from 2 to 4 would halve the fronthaul contribution to 0.75, giving $\Delta = 1.5$ . Doubling cache $\mu$ to 0.5 gives fronthaul $= 8 \cdot 0.5/(2 \cdot 2) = 1.0$ , downlink $= 0.5$ , total $\Delta = 1.5$ . Equivalent improvement via either resource.

Operational insight

At this operating point, cache and fronthaul are substitutes: doubling either has the same effect on NDT. At other operating points the substitution rate changes; the NDT surface captures this.

Key Takeaway

NDT unifies cache size and fronthaul capacity into a single latency metric. It removes the SNR dependence and exposes the architectural tradeoff directly. NDT = 1 is the ideal (infinite fronthaul baseline); higher NDT means slower delivery. System design with the NDT framework: pick $(M, C_F)$ to meet a latency budget.

🎓CommIT Contribution(2017)

The NDT Framework for Cache-Aided Cloud-RAN

A. Sengupta, R. Tandon, O. Simeone, G. Caire — IEEE Transactions on Information Theory

The NDT framework was introduced in a series of papers by Simeone's group, with Caire and collaborators co-authoring extensions to multi-antenna and cooperative settings. The key contribution:

NDT as a unifying metric. Captures the cache-fronthaul tradeoff in a scale-invariant number that removes SNR dependencies.
Achievability schemes. Cooperative Lampiris-Caire style delivery with fronthaul-aware placement; time-share between cache-heavy and fronthaul-heavy modes.
Converse bounds. Information-theoretic lower bounds on NDT, tight at several operating points.
Practical implications. The framework informs the architectural choice of where to put cache and how much fronthaul to provision.

The CommIT follow-up work has extended NDT to mixed-traffic (Park-Caire 2020), privacy (Wan-Caire 2022), and massive MIMO (Lampiris-Bhattacharjee-Caire 2023). Chapter 8 of this book presents the baseline framework; subsequent chapters touch on extensions.

cloud-ranndtcommitView Paper →

Historical Note: From C-RAN to Cache-Aided C-RAN

2011–2022

The C-RAN concept (China Mobile, 2011) preceded coded caching by three years. The initial motivation was centralizing baseband processing to reduce BS cost and enable coordinated multi-point transmission (CoMP). Caching at the edge was not part of the original vision.

Simeone and collaborators (2015–2017) connected the two: once you have ENs with local storage (needed for CoMP buffering anyway), why not pre-cache popular content? The NDT framework crystallized this intuition. By 2020, 3GPP Rel-16 added caching hooks at the DU/ CU levels. Cache-aided C-RAN is now a mainstream research topic with direct deployment paths.

The CommIT group's contribution: unifying the NDT framework with the multi-antenna coded caching theory of Lampiris-Caire. Fog massive MIMO — CommIT's flagship 6G architecture — is C-RAN + caching + massive MIMO, analyzed via NDT + DoF metrics.

Normalized Delivery Time (NDT)

Why a New Metric?

Definition: Normalized Delivery Time

Theorem: Fundamental NDT Bounds

Fronthaul contribution

Downlink contribution

Total NDT

Saturation

NDT vs Fronthaul Capacity

Parameters

Example: NDT Computation for a 5G Small Cell

Fronthaul contribution

Downlink contribution

Total NDT

Bottleneck

Operational insight

Key Takeaway

The NDT Framework for Cache-Aided Cloud-RAN

Historical Note: From C-RAN to Cache-Aided C-RAN

Definition:
Normalized Delivery Time