The Cache-Fronthaul Tradeoff

Two Resources, One Latency

The NDT formula $\Delta = \max(1, K(1-\mu)/(N_\text{EN} C) + K(1-\mu)/(N_\text{EN} L_\text{EN}))$ reveals a fundamental substitution: cache and fronthaul capacity can be traded off to achieve the same delivery time. An operator can choose a cache-heavy architecture (cheap cache at each EN, modest fronthaul) or a cloud-heavy one (lean ENs, fat fronthaul). Both can deliver the same NDT.
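As a minimal sketch, the formula above can be evaluated directly. The function name `ndt` and the parameter values $K = N_\text{EN} = 4$, $L_\text{EN} = 1$ are illustrative assumptions, not values taken from the text.

```python
def ndt(mu, C, K, N_EN, L_EN):
    """Evaluate the two-term NDT formula quoted in this section.

    mu is the cache fraction M/N in [0, 1]; C is the fronthaul capacity;
    K, N_EN and L_EN are the system parameters appearing in the formula.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    if C <= 0 or K <= 0 or N_EN <= 0 or L_EN <= 0:
        raise ValueError("C, K, N_EN and L_EN must be positive")
    fronthaul_term = K * (1.0 - mu) / (N_EN * C)
    edge_term = K * (1.0 - mu) / (N_EN * L_EN)
    return max(1.0, fronthaul_term + edge_term)


# Two different designs with the same NDT (hypothetical parameters).
cloud_heavy = ndt(mu=0.25, C=1.0, K=4, N_EN=4, L_EN=1)
cache_heavy = ndt(mu=0.50, C=0.5, K=4, N_EN=4, L_EN=1)
print(cloud_heavy, cache_heavy)  # both evaluate to 1.5
```

With these assumed parameters, halving the fronthaul from 1.0 to 0.5 is compensated by raising the cache fraction from 0.25 to 0.50.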

The Pareto frontier, the set of $(M, C)$ pairs achieving a given NDT target, defines the architectural design space. This section characterizes this frontier and relates it to deployment costs.

Theorem: Iso-NDT Contours

For fixed $K, N_\text{EN}, L_\text{EN}$ and a target NDT $\Delta^* \geq 1$, the set of $(\mu, C)$ pairs, with $\mu = M/N$, achieving $\Delta \leq \Delta^*$ is $\left\{(\mu, C) : \frac{K(1-\mu)}{N_\text{EN} C} + \frac{K(1-\mu)}{N_\text{EN} L_\text{EN}} \leq \Delta^*\right\}.$ For $\Delta^* > 1$, the iso-NDT contours in the $(\mu, C)$ plane are hyperbolae: a smooth tradeoff between cache and fronthaul at any given latency target.

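Increasing the cache fraction $\mu$ shrinks the $1 - \mu$ factor, which scales down both terms of the sum. Doubling the fronthaul $C$ halves only the fronthaul contribution and leaves the second term, $K(1-\mu)/(N_\text{EN} L_\text{EN})$, unchanged. Either lever lowers the NDT, but fronthaul alone can meet the NDT budget only if that second term is already below the target.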
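To make the hyperbolic shape explicit, one can set the two-term sum in the NDT formula equal to a level $\Delta^* > 1$ and solve for $C$. This is a sketch of that algebra, using only the symbols already defined in this section:

```latex
\begin{align*}
\frac{K(1-\mu)}{N_\text{EN} C} + \frac{K(1-\mu)}{N_\text{EN} L_\text{EN}} &= \Delta^* \\
\frac{K(1-\mu)}{N_\text{EN} C} &= \Delta^* - \frac{K(1-\mu)}{N_\text{EN} L_\text{EN}} \\
C(\mu) &= \frac{K L_\text{EN} (1-\mu)}{N_\text{EN} L_\text{EN} \Delta^* - K(1-\mu)},
\qquad \mu > 1 - \frac{N_\text{EN} L_\text{EN} \Delta^*}{K}.
\end{align*}
```

$C(\mu)$ is a ratio of two linear functions of $\mu$, which is a hyperbola. The condition on $\mu$ marks the vertical asymptote: below that cache fraction the second term alone exceeds $\Delta^*$, and no finite fronthaul reaches the target.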

NDT vs Memory Ratio

NDT as a function of memory ratio $\mu = M/N$, for varying fronthaul capacities $C$. At high $C$, even a small cache is enough to hit $\Delta = 1$. At low $C$, a large cache is needed to approach $\Delta = 1$. This is the "vertical" view of the cache-fronthaul tradeoff: hold $C$ fixed, vary the cache.

NDT Surface $\Delta(M/N, C)$

3D surface plot of the NDT as a function of both cache ratio and fronthaul capacity. The surface is monotonically non-increasing in both variables; iso-NDT curves are the level sets. The Pareto frontier for any target $\Delta^*$ is an iso-NDT contour on this surface.

NDT Cache-Fronthaul Tradeoff Curves

NDT $\Delta$ as a function of memory ratio $M/N$, for four fronthaul capacities $C \in \{0.5, 1, 2, 5\}$ files/use. Smaller $C$ shifts the curves up; larger cache lowers the NDT toward the baseline $\Delta = 1$. The curves are iso-$C$ "slices" of the full NDT surface.
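The curves described above can be approximated by tabulating the section's NDT formula for the four fronthaul capacities. The system parameters $K = N_\text{EN} = 4$, $L_\text{EN} = 1$ are assumptions chosen for illustration. The plot's actual settings are not stated in the text.

```python
def ndt(mu, C, K, N_EN, L_EN):
    # Two-term NDT formula from this section, floored at the baseline of 1.
    return max(1.0, K * (1 - mu) / (N_EN * C) + K * (1 - mu) / (N_EN * L_EN))


K, N_EN, L_EN = 4, 4, 1             # hypothetical system parameters
capacities = [0.5, 1.0, 2.0, 5.0]   # fronthaul capacities named in the caption
mus = [i / 10 for i in range(11)]   # memory ratios 0.0, 0.1, ..., 1.0

# One list of NDT values per fronthaul capacity (one "slice" each).
table = {C: [ndt(mu, C, K, N_EN, L_EN) for mu in mus] for C in capacities}

print("mu     " + "  ".join(f"C={C:<4}" for C in capacities))
for i, mu in enumerate(mus):
    row = "  ".join(f"{table[C][i]:<6.2f}" for C in capacities)
    print(f"{mu:<6.1f} {row}")
```

Each column is non-increasing as $\mu$ grows, and each row is non-increasing as $C$ grows, matching the monotonicity described for the NDT surface.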

Example: Architectural Choice: Cloud-Heavy vs Cache-Heavy

A 5G deployment has $K = 100$, $N_\text{EN} = 10$, $L_\text{EN} = 4$. Target NDT $\leq 2$. Two architectures: (A) Cloud-heavy: $\mu = 0.05$, find minimum $C$. (B) Cache-heavy: $C = 1$, find minimum $\mu$.
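One way to work this example is to invert the section's NDT formula, taken at face value, for each unknown. The helper names `min_fronthaul` and `min_cache_fraction` are hypothetical, and the sketch assumes a target of at least 1.

```python
def min_fronthaul(mu, target, K, N_EN, L_EN):
    """Smallest C with NDT <= target at cache fraction mu.

    Returns None when the second (C-independent) term alone already
    reaches the target, so that no finite fronthaul can help.
    """
    edge_term = K * (1.0 - mu) / (N_EN * L_EN)
    slack = target - edge_term
    if slack <= 0:
        return None
    return K * (1.0 - mu) / (N_EN * slack)


def min_cache_fraction(C, target, K, N_EN, L_EN):
    """Smallest mu in [0, 1] with NDT <= target at fronthaul capacity C."""
    sum_at_zero_cache = (K / N_EN) * (1.0 / C + 1.0 / L_EN)
    return max(0.0, 1.0 - target / sum_at_zero_cache)


K, N_EN, L_EN, target = 100, 10, 4, 2.0
print("A:", min_fronthaul(0.05, target, K, N_EN, L_EN))
print("B:", min_cache_fraction(1.0, target, K, N_EN, L_EN))
```

Under the formula as written, architecture (B) needs $\mu \geq 0.84$. For architecture (A), the $C$-independent term equals $100 \cdot 0.95 / 40 = 2.375 > 2$, so the sketch returns `None`: at $\mu = 0.05$ no finite fronthaul meets the target, and a cloud-heavy design would need $\mu > 0.2$ before any $C$ suffices.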

Definition: Marginal Substitution Rate

At a given operating point $(\mu_0, C_0)$ on an iso-NDT contour, the marginal substitution rate $-\partial C/\partial \mu$ measures how much fronthaul can be saved by adding one unit of cache. Differentiating the NDT formula implicitly along the contour $\Delta = \Delta^* > 1$: $-\frac{\partial C}{\partial \mu}\Bigg|_{\Delta^*} \;=\; \frac{C_0 (1 + C_0/L_\text{EN})}{1 - \mu_0} \;=\; \frac{N_\text{EN} \Delta^* C_0^2}{K} \cdot \frac{1}{(1 - \mu_0)^2}.$ This rate is steep near the cache-limited corner (small $\mu$, large $C$), where a small increase in cache saves a large amount of fronthaul, and shallow near the fronthaul-limited corner (small $C$), where further cache buys little fronthaul reduction.
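A numerical sanity check of the slope along a contour can be sketched as follows. It solves the two-term NDT sum for $C(\mu)$ at a fixed level, then compares a central finite difference with the expression $C_0 (1 + C_0/L_\text{EN})/(1 - \mu_0)$ obtained by implicit differentiation of that sum. The function names and the parameters ($K = N_\text{EN} = 4$, $L_\text{EN} = 1$, $\Delta^* = 1.5$) are assumptions for illustration.

```python
def contour_C(mu, target, K, N_EN, L_EN):
    """Fronthaul C that puts (mu, C) on the contour where the sum equals target."""
    load = K * (1.0 - mu)
    denom = N_EN * L_EN * target - load
    if denom <= 0:
        raise ValueError("target unreachable at this mu for any finite C")
    return load * L_EN / denom


def substitution_rate(mu0, C0, L_EN):
    """-dC/dmu along the contour, from implicit differentiation of the sum."""
    return C0 * (1.0 + C0 / L_EN) / (1.0 - mu0)


K, N_EN, L_EN, target = 4, 4, 1, 1.5   # hypothetical parameters
mu0 = 0.25
C0 = contour_C(mu0, target, K, N_EN, L_EN)

# Central finite difference of C(mu) around mu0.
h = 1e-6
numeric = -(contour_C(mu0 + h, target, K, N_EN, L_EN)
            - contour_C(mu0 - h, target, K, N_EN, L_EN)) / (2 * h)
analytic = substitution_rate(mu0, C0, L_EN)
print(C0, numeric, analytic)
```

At this assumed operating point $C_0 = 1$, and both estimates agree at roughly $2.67$ units of fronthaul saved per unit of cache fraction.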

In deployment economics: buy cache if $c_M < c_C \cdot |\partial C/\partial \mu|_{\Delta^*}$, where $c_M, c_C$ are the unit costs of cache and fronthaul. This determines the optimal architecture along the iso-NDT curve.
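The rule above can be illustrated with a brute-force scan along one contour: minimize $c_M \mu + c_C C(\mu)$ and check that the slope at the minimizer matches the cost ratio $c_M / c_C$. The linear cost model, the unit costs, and the system parameters below are all hypothetical.

```python
def contour_C(mu, target, K, N_EN, L_EN):
    # Fronthaul needed at cache fraction mu to sit on the contour sum == target.
    load = K * (1.0 - mu)
    return load * L_EN / (N_EN * L_EN * target - load)


K, N_EN, L_EN, target = 4, 4, 1, 1.5   # hypothetical; denominator stays positive
c_M, c_C = 8.0, 3.0                    # hypothetical costs per unit of mu and of C


def total_cost(mu):
    return c_M * mu + c_C * contour_C(mu, target, K, N_EN, L_EN)


# Scan the contour on a grid of cache fractions and keep the cheapest point.
grid = [i / 1000 for i in range(1000)]
best_mu = min(grid, key=total_cost)
best_C = contour_C(best_mu, target, K, N_EN, L_EN)

# Marginal substitution rate -dC/dmu at the cheapest point.
rate_at_best = best_C * (1.0 + best_C / L_EN) / (1.0 - best_mu)
print(best_mu, best_C, rate_at_best, c_M / c_C)
```

With these assumed numbers the scan settles at $\mu = 0.25$, $C = 1$, where the substitution rate equals $c_M / c_C \approx 2.67$. To the left of that point the rate exceeds the cost ratio, so buying cache pays off. To the right it does not.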

Pareto Frontier in Operator Terms

The NDT iso-contour is the Pareto frontier for architectural decisions: any point on the contour is Pareto-optimal (no other point dominates it in both $\mu$ and $C$). Deployment decisions slide along this frontier based on:

  1. Cost structure. Cache hardware (storage + controllers) vs. fronthaul bandwidth lease/installation.
  2. Content catalog size. A larger library needs more aggregate cache: at fixed $\mu$, the cache size $M = \mu N$ scales with $N$.
  3. Traffic pattern. Cacheable (video) benefits from cache; mixed traffic has JLEC separation concerns (Ch 6).
  4. Access latency. Cache at RU gives $\sim 0.1$ ms access; cloud fronthaul adds $\sim 1\text{-}10$ ms.

A typical 5G deployment sits in the "balanced" region of the Pareto frontier, with moderate cache ($\mu \sim 0.1\text{-}0.3$) and moderate fronthaul ($C \sim 1\text{-}5$ files/use at high SNR).

Common Mistake: NDT Is Not a Rate

Mistake:

Confusing NDT with a throughput or capacity value.

Correction:

NDT is a dimensionless latency ratio. $\Delta = 2$ means "delivery takes twice as long as the baseline MU-MIMO." It does not mean "rate is half": the relation to rate involves the SNR and the reference baseline. For rate conversions: $R_\text{per-user} = \log_2 \text{SNR} \cdot K/(T \cdot N_\text{users}) = \log_2 \text{SNR} / \Delta$ at high SNR.

NDT = 1 corresponds to per-user rate = $\log_2 \text{SNR}/1$, the asymptotic single-user capacity.
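The high-SNR conversion $R_\text{per-user} = \log_2 \text{SNR} / \Delta$ can be sketched as a small helper. The function name is hypothetical, and the SNR is assumed to be given on a linear scale (not in dB).

```python
import math


def per_user_rate(snr_linear, ndt):
    """High-SNR per-user rate log2(SNR) / NDT, as stated in the text.

    snr_linear is the SNR on a linear scale (assumed > 1 so that the
    logarithm is positive); ndt is the dimensionless NDT, at least 1.
    """
    if snr_linear <= 1:
        raise ValueError("high-SNR approximation assumes SNR > 1 (linear)")
    if ndt < 1:
        raise ValueError("NDT is at least 1 by definition")
    return math.log2(snr_linear) / ndt


print(per_user_rate(1024.0, 1.0))  # log2(1024) / 1 = 10.0
print(per_user_rate(1024.0, 2.0))  # log2(1024) / 2 = 5.0
```

The same NDT maps to different absolute rates at different SNRs, which is why NDT on its own should not be read as a throughput figure.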