The Limits of Uncoded Caching

The Missing Ingredient

Popularity caching is optimal within the uncoded-caching regime, but we are about to show that the uncoded regime itself leaves money on the table. The key observation is that popularity caching does nothing to exploit the multiplicity of users. If $K$ users request $K$ different files, the server must unicast $K$ separate messages, one per user, regardless of how much each user has cached. Information-theoretically, this is wasteful.

This section quantifies what uncoded caching cannot do, setting up the gap that coded caching will close.

Theorem: Worst-Case Rate Under Uncoded Caching

Under any uncoded caching strategy with per-user cache size $M$, the worst-case delivery rate over all demand vectors $\mathbf{d} \in [N]^K$ satisfies
$$R_{\text{unc}}(M) \;\geq\; K\left(1 - \frac{M}{N}\right).$$
This bound is achieved with equality by popularity caching when $K \leq N$ and demands are distinct.

Consider the worst case where every user requests a different file. Any single user caches at most a fraction $M/N$ of its own requested file (by symmetry, of any file it could request). Hence each user must receive at least $1 - M/N$ files' worth of data from the server. With $K$ users, the total load is at least $K(1 - M/N)$. Caching saves each user a fraction $M/N$ of its file, nothing more.
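The bound is a one-liner to compute. A minimal sketch (the helper name `uncoded_worst_case_rate` is ours, not from the text):

```python
def uncoded_worst_case_rate(K: int, N: int, M: float) -> float:
    """Worst-case uncoded delivery rate K * (1 - M/N), in file units:
    K users, a library of N files, a per-user cache of M files."""
    assert K >= 1 and 0 <= M <= N
    return K * (1 - M / N)

# Each of 10 users must still fetch the uncached 3/4 of its file:
print(uncoded_worst_case_rate(K=10, N=20, M=5))  # 7.5
```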

Example: A Concrete Number

Netflix-sized thought experiment: $N = 10^5$ movies on offer, $K = 10^6$ simultaneous users during peak hours, each with a local cache of $M = 10^4$ movies (10% of the library). What is the peak-hour shared-link load under popularity caching and worst-case demands (say, a new blockbuster release spreads demand across 1000 different titles)?
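One way to put numbers on this thought experiment. A hedged sketch: whether identical requests can share a single uncoded multicast is our modeling assumption here, not part of the worst-case theorem, which assumes distinct demands:

```python
N = 10**5   # library size (movies)
K = 10**6   # simultaneous peak-hour users
M = 10**4   # per-user cache, 10% of the library
D = 1000    # distinct titles actually requested

miss = 1 - M / N  # uncached fraction of each file: 0.9

# Pure unicast, one message per user (the regime of the theorem):
print(K * miss)   # 900000.0 files

# Assumption: the server serves each distinct title with one multicast:
print(D * miss)   # 900.0 files
```

Either way, the load scales with the number of streams served, not with how cleverly the caches were filled.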

Multiplicative vs. Additive Gains

Caching provides two conceptually distinct gains:

  • Local caching gain ($1 - M/N$). This is the classical hit-rate benefit: each user individually avoids fetching what it already has. Available to any caching system; captured fully by popularity caching.
  • Coded multicasting gain ($1 + KM/N$). This is the combinatorial gain from exploiting coincidences across users' caches. It requires both combinatorial placement and coded delivery. Invisible to popularity caching.

The two gains multiply: under the MAN scheme, with $\mu = M/N$,
$$R \;=\; K \, \underbrace{(1 - \mu)}_{\text{local}} \cdot \underbrace{\frac{1}{1 + K\mu}}_{\text{coded}}.$$
As we scale up the network ($K \to \infty$ at fixed $M/N$), the local gain is bounded but the coded gain grows without bound. This is why coded caching matters at scale.
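The decomposition is easy to check numerically. A small sketch (the function name is ours; the closed form above is exact when $K\mu$ is an integer):

```python
def man_rate(K: int, mu: float) -> float:
    """MAN delivery rate K * (1 - mu) / (1 + K * mu), with mu = M/N."""
    local = K * (1 - mu)       # what local caching alone achieves
    coded = 1 / (1 + K * mu)   # extra multiplicative coded gain
    return local * coded

# Fixed mu = 0.1: the local term grows linearly in K,
# but the product stays below (1 - mu)/mu = 9 files.
for K in (10, 100, 1000):
    print(K, K * (1 - 0.1), man_rate(K, 0.1))
```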

Caching Loses Traction as Libraries Grow

With fixed absolute cache size $M$ and Zipf-distributed demand, the hit ratio collapses as the library $N$ grows. This is the fundamental limitation of popularity caching in an era where content libraries grow faster than edge storage. Coded caching reframes the problem: the figure of merit is the ratio $M/N$, not the absolute cache size.

[Interactive demo: hit ratio vs. library size; parameters 50 and 0.8.]
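The collapse is easy to reproduce. A sketch, assuming the two parameters of the demo above are the cache size ($M = 50$ files) and the Zipf exponent ($\alpha = 0.8$):

```python
def zipf_hit_ratio(M: int, N: int, alpha: float) -> float:
    """Hit ratio h(M) of popularity caching: the fraction of
    Zipf(alpha) demand covered by the M most popular of N files."""
    weights = [k ** -alpha for k in range(1, N + 1)]
    return sum(weights[:M]) / sum(weights)

# Fixed cache of 50 files; the library grows underneath it:
for N in (100, 1_000, 10_000, 100_000):
    print(N, round(zipf_hit_ratio(50, N, 0.8), 3))
```

The printed hit ratios shrink steadily with $N$: the cache holds the same 50 files while an ever-larger share of demand falls outside them.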

Common Mistake: Uncoded Caching Scales Linearly in K

Mistake:

Thinking that adding more users to a shared-link network with popularity caching improves per-user delivery cost (via "economies of scale").

Correction:

Exactly the opposite: under popularity caching, the shared-link load is $K(1 - h(M))$, linear in $K$. Adding users grows the total link usage linearly, so the per-user cost never improves. Coded caching turns this linear growth into a bounded one: for fixed memory ratio $M/N$, the rate $R_{\text{MAN}}$ approaches $(1 - M/N)/(M/N)$ as $K \to \infty$. The gain is unbounded in $K$, which is the killer feature of coded caching at CDN scale.
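The per-user arithmetic makes the contrast stark. A brief sketch (the coded expression is the MAN rate from earlier divided by $K$):

```python
mu = 0.1  # memory ratio M/N
for K in (10, 100, 1_000, 10_000):
    per_user_uncoded = 1 - mu                  # constant: no economy of scale
    per_user_coded = (1 - mu) / (1 + K * mu)   # vanishes as K grows
    print(K, per_user_uncoded, round(per_user_coded, 4))
```

Uncoded, each new user pays the same miss fraction; coded, the shared multicasts are amortized over everyone, so the per-user cost tends to zero.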

Historical Note: Web Caching and the Zipf Assumption

1998–2014

The field of web caching in the 1990s–2000s was built on the assumption that web requests follow a Zipf distribution. Breslau et al.'s 1999 study of web traffic was seminal: it found Zipf exponents between 0.6 and 0.8 for page requests. Video streaming a decade later confirmed similar exponents (0.6–1.2, depending on the platform). These exponents are favorable for popularity caching, but only up to a point: even $\alpha = 1.0$ leaves substantial demand in the long tail, where popularity caching cannot reach.

The industry's response was to make caches bigger and distribute them more widely. The Caire group's response, starting in 2013–2014, was to rethink the problem at the information-theoretic level. Both responses have merit; this book develops the latter.

Quick Check

Under uncoded caching with $K = 100$ users, $N = 1000$ files, $M = 100$ files, and worst-case distinct demands, what is the minimum achievable shared-link delivery rate (in file units)?

  • 1
  • 10
  • 90
  • 100