Normalized Delivery Time (NDT)
Why a New Metric?
Delivery time depends on SNR: at higher SNR you deliver faster. To compare cache-aided and baseline architectures fairly, we need to normalize out the SNR dependence. The result is the normalized delivery time (NDT), δ = T / T*, the ratio of the actual delivery time T to the delivery time T* of a reference baseline, which measures delivery time in units of the reference baseline time. NDT = 1 means the cache-aided architecture delivers as fast as an infinite-fronthaul baseline; NDT > 1 means slower; NDT < 1 is impossible (the baseline is the best possible).
This metric, introduced by Sengupta-Tandon-Simeone (2017) and refined by the CommIT group in many settings, is the workhorse of C-RAN caching analysis. It isolates the cache-fronthaul tradeoff from the underlying SNR and channel details.
Definition: Normalized Delivery Time
For a cache-aided C-RAN with delivery time T and reference delivery time T* (the MU-MIMO baseline with infinite fronthaul), the normalized delivery time is NDT = T / T*. NDT = 1 means "delivers as fast as MU-MIMO with infinite fronthaul." NDT > 1 means "slower: the fronthaul/cache bottleneck dominates." NDT < 1 is impossible.
The NDT is a function of two key parameters: the per-EN cache fraction μ (the fraction of the library stored at each EN) and the per-EN fronthaul capacity r (in files per channel use).
NDT is a dimensionless ratio, analogous to the DoF in Chapter 5: it isolates the scaling-invariant tradeoff. At high SNR, the absolute delivery time is T = NDT · T*, capturing both the SNR-dependent link rate (through the baseline time T*) and the architectural overhead (through the NDT).
Theorem: Fundamental NDT Bounds
For a cloud-RAN with M ENs (n_a antennas each), cache μNF bits per EN, fronthaul capacity r files/use per EN, K users, and a library of N files of F bits each: NDT(μ, r) ≈ max{ 1, K(1−μ)/r + K/(M·n_a) }. Roughly: NDT ≈ δ_F + δ_D, separating the fronthaul-limited term δ_F = K(1−μ)/r from the downlink-limited term δ_D = K/(M·n_a). (The precise achievability and converse bounds are in Sengupta-Tandon-Simeone '17.)
Delivery time is governed by two bottlenecks: (i) the fronthaul must carry K(1−μ) files' worth of data to each EN; (ii) the downlink must carry the K requested files to the users. NDT is dominated by the larger contribution.
When r → ∞: NDT = max{1, K/(M·n_a)} (downlink limited). When μ = 1 (or r is large) and M·n_a ≥ K: NDT = 1 (baseline reached). When r → 0: NDT → ∞ unless μ = 1.
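The regime behavior can be checked numerically. The sketch below uses the approximate sum-with-floor formula NDT = max{1, K(1−μ)/r + K/(M·n_a)} as an illustration, not the exact bounds of the theorem:

```python
def ndt(mu: float, r: float, K: int, M: int, n_a: int) -> float:
    """Approximate NDT: fronthaul term plus downlink term, floored at 1.

    mu: cached fraction of the library per EN; r: fronthaul capacity
    (files per channel use); K users; M ENs with n_a antennas each.
    """
    if r == 0 and mu < 1:
        return float("inf")                 # uncached data cannot reach the ENs
    fronthaul = K * (1 - mu) / r if mu < 1 else 0.0
    downlink = K / (M * n_a)
    return max(1.0, fronthaul + downlink)

# Regime checks: a full cache survives zero fronthaul and reaches the
# baseline; without caching, zero fronthaul means unbounded delivery time.
assert ndt(1.0, 0.0, K=3, M=2, n_a=2) == 1.0
assert ndt(0.0, 0.0, K=3, M=2, n_a=2) == float("inf")
assert ndt(0.25, 2.0, K=3, M=2, n_a=2) > 1.0   # fronthaul-limited regime
```

The `max{1, ...}` floor encodes the converse statement that NDT < 1 is impossible.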
Fronthaul contribution
Each EN needs K(1−μ) files' worth of content delivered from the cloud (the rest is already cached). Per-EN fronthaul time, in normalized units: δ_F = K(1−μ)/r, where r is the fronthaul capacity in files per channel use.
Downlink contribution
The ENs cooperatively deliver the K requested files to the users via a MIMO broadcast channel with effective antenna count M·n_a. Delivery time: δ_D = K/(M·n_a).
Total NDT
The NDT is the sum of the fronthaul and downlink contributions, lower-bounded by 1 (the baseline): NDT = max{1, δ_F + δ_D}. The derivation in Sengupta-Tandon-Simeone '17 gives the precise bounds.
Saturation
At μ = 1: fully cached; no fronthaul needed; δ_F = 0 and NDT = 1 (when M·n_a ≥ K). At μ = 0, r → ∞: NDT = 1 (the MU-MIMO baseline).
NDT vs Fronthaul Capacity
NDT as a function of per-EN fronthaul capacity r, for fixed cache size and varying memory ratios μ. At low r, NDT is large (fronthaul bottleneck); it decreases as r grows, hitting 1 when fronthaul is abundant. A higher cache μ shifts the curve down: caching reduces the fronthaul demand.
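The shape of the curve can be reproduced qualitatively with a short sweep. The parameter values K = 3, M = 2, n_a = 2 are illustrative assumptions (the figure's own parameters did not survive extraction), and the formula is the chapter's approximate one:

```python
K, M, n_a = 3, 2, 2   # illustrative parameters, not taken from the figure

def ndt(mu: float, r: float) -> float:
    """Approximate NDT: fronthaul term plus downlink term, floored at the baseline 1."""
    return max(1.0, K * (1 - mu) / r + K / (M * n_a))

# Sweep fronthaul capacity r for two cache fractions: NDT falls as r grows
# and saturates at 1; the larger cache shifts the whole curve down.
for mu in (0.25, 0.5):
    curve = [(r, round(ndt(mu, r), 3)) for r in (0.5, 1, 2, 4, 8)]
    print(f"mu={mu}: {curve}")
```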
Example: NDT Computation for a 5G Small Cell
A small-cell C-RAN has M = 2 ENs with n_a = 2 antennas each, K = 3 users, fronthaul r = 2 files/use, and cache fraction μ = 0.25. Compute the NDT and identify the bottleneck.
Fronthaul contribution
δ_F = K(1−μ)/r = 3 × 0.75 / 2 = 1.125.
Downlink contribution
δ_D = K/(M·n_a) = 3/4 = 0.75.
Total NDT
NDT = δ_F + δ_D = 1.125 + 0.75 = 1.875. Above the baseline of 1.
Bottleneck
Fronthaul (1.125) > downlink (0.75). Increasing the fronthaul from r = 2 to r = 4 halves the fronthaul contribution to 0.5625, giving NDT ≈ 1.44. Doubling the cache to μ = 0.5 gives fronthaul 0.75, downlink 0.75, total 1.5. Either upgrade buys a comparable improvement.
Operational insight
At this operating point, cache and fronthaul are near-substitutes: doubling either resource reduces the NDT by a similar amount. At other operating points the substitution rate changes; the NDT surface captures this.
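The example's arithmetic can be checked in a few lines. This sketch hard-codes the example parameters (K = 3 users, M = 2 ENs, n_a = 2 antennas, assumed here) and the chapter's approximate component formulas:

```python
def ndt_components(mu: float, r: float, K: int = 3, M: int = 2, n_a: int = 2):
    """Return (fronthaul, downlink, total) NDT contributions for the example."""
    fronthaul = K * (1 - mu) / r   # uncached fraction of the K requested files
    downlink = K / (M * n_a)       # cooperative MIMO downlink time
    return fronthaul, downlink, max(1.0, fronthaul + downlink)

base = ndt_components(mu=0.25, r=2)            # the stated operating point
more_fronthaul = ndt_components(mu=0.25, r=4)  # double the fronthaul
more_cache = ndt_components(mu=0.5, r=2)       # double the cache
print(base, more_fronthaul, more_cache)
```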
Key Takeaway
NDT unifies cache size and fronthaul capacity into a single latency metric. It removes the SNR dependence and exposes the architectural tradeoff directly. NDT = 1 is the ideal (the infinite-fronthaul baseline); higher NDT means slower delivery. System design with the NDT framework: pick (μ, r) to meet a latency budget.
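As a design sketch, the approximate formula can be inverted for provisioning: given a latency budget (a target NDT) and a cache fraction μ, solve for the minimum per-EN fronthaul capacity r. The function and parameter values below are illustrative assumptions, not from the source:

```python
def min_fronthaul(target_ndt: float, mu: float, K: int, M: int, n_a: int):
    """Smallest r with K*(1-mu)/r + K/(M*n_a) <= target_ndt, or None if infeasible."""
    slack = target_ndt - K / (M * n_a)   # budget left after the downlink term
    if mu >= 1:
        return 0.0 if slack >= 0 else None   # full cache: no fronthaul needed
    if slack <= 0:
        return None                          # downlink alone uses up the budget
    return K * (1 - mu) / slack

# e.g. meeting a budget of NDT <= 1.5 with a quarter of the library cached
r_needed = min_fronthaul(1.5, mu=0.25, K=3, M=2, n_a=2)   # -> 3.0 files/use
```

The same inversion can be done for μ at fixed r; which knob is cheaper to turn is a deployment question, not an information-theoretic one.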
The NDT Framework for Cache-Aided Cloud-RAN
The NDT framework was introduced in a series of papers by Simeone's group, with Caire and collaborators co-authoring extensions to multi-antenna and cooperative settings. The key contributions:
- NDT as a unifying metric. Captures the cache-fronthaul tradeoff in a scale-invariant number that removes SNR dependencies.
- Achievability schemes. Cooperative Lampiris-Caire style delivery with fronthaul-aware placement; time-share between cache-heavy and fronthaul-heavy modes.
- Converse bounds. Information-theoretic lower bounds on NDT, tight at several operating points.
- Practical implications. The framework informs the architectural choice of where to put cache and how much fronthaul to provision.
The CommIT follow-up work has extended NDT to mixed-traffic (Park-Caire 2020), privacy (Wan-Caire 2022), and massive MIMO (Lampiris-Bhattacharjee-Caire 2023). Chapter 8 of this book presents the baseline framework; subsequent chapters touch on extensions.
Historical Note: From C-RAN to Cache-Aided C-RAN
2011–2022
The C-RAN concept (China Mobile, 2011) preceded coded caching by three years. The initial motivation was centralizing baseband processing to reduce BS cost and enable coordinated multi-point transmission (CoMP). Caching at the edge was not part of the original vision.
Simeone and collaborators (2015–2017) connected the two: once you have ENs with local storage (needed for CoMP buffering anyway), why not pre-cache popular content? The NDT framework crystallized this intuition. By 2020, 3GPP Rel-16 added caching hooks at the DU/CU levels. Cache-aided C-RAN is now a mainstream research topic with direct deployment paths.
The CommIT group's contribution: unifying the NDT framework with the multi-antenna coded caching theory of Lampiris-Caire. Fog massive MIMO, CommIT's flagship 6G architecture, is C-RAN + caching + massive MIMO, analyzed via NDT and DoF metrics.