The Limits of Uncoded Caching
The Missing Ingredient
Popularity caching is optimal within the uncoded-caching regime, but we are about to show that the uncoded regime itself leaves money on the table. The key observation is that popularity caching does nothing to exploit the multiplicity of users: if users request different files, the server must unicast separate messages, one per user, regardless of how much each user has cached. Information-theoretically, this is wasteful.
This section quantifies what uncoded caching cannot do, setting up the gap that coded caching will close.
Theorem: Worst-Case Rate Under Uncoded Caching
Under any uncoded caching strategy with per-user cache size $M$, the worst-case delivery rate over all demand vectors satisfies $$R_{\text{uncoded}} \ge K\left(1 - \frac{M}{N}\right).$$ This bound is achieved with equality by popularity caching when $K \le N$ and demands are distinct.
Consider the worst case where every user requests a different file. Any single user caches at most a fraction $M/N$ of its own requested file (by symmetry, of any file it could request). Hence each user must receive at least $1 - M/N$ files' worth of data from the server. With $K$ users, the total load is at least $K(1 - M/N)$. Caching saves each user a fraction $M/N$ of its file, nothing more.
Consider worst-case demands
Choose demands so that all $K$ users request distinct files, which is possible whenever $N \ge K$. Denote these distinct files as $W_{d_1}, \dots, W_{d_K}$.
Lower-bound each user's reception
Under uncoded delivery, the message sent to user $k$ must contain enough information for user $k$ to recover all bits of $W_{d_k}$. The cache covers at most $M$ files' worth of content, so the fraction of $W_{d_k}$ that user $k$ does not already have is at least $1 - M/N$ on average across all choices of $d_k$ consistent with a symmetric placement.
Sum over users
Because uncoded delivery cannot multicast useful bits to more than one user (different users want different files), the total rate is at least the sum of the individual rates: $$R \ge \sum_{k=1}^{K}\left(1 - \frac{M}{N}\right) = K\left(1 - \frac{M}{N}\right).$$
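As a sanity check on the derivation, the bound is easy to evaluate numerically. A minimal sketch; the function name is ours:

```python
def uncoded_worst_case_rate(K: int, N: int, M: float) -> float:
    """Lower bound K * (1 - M/N) on the worst-case shared-link load
    (in file units) under any uncoded placement-and-delivery scheme.
    Distinct worst-case demands require N >= K."""
    assert N >= K, "distinct demands need at least K files"
    return K * (1 - M / N)

# Each of K = 100 users must still fetch a (1 - M/N) = 0.9 fraction
# of its requested file, so the total load is 90 file units.
print(uncoded_worst_case_rate(K=100, N=1000, M=100))  # -> 90.0
```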
Example: A Concrete Number
Netflix-sized thought experiment: a library of $N$ movies on offer, $K = 1000$ simultaneous users during peak hours, each with a local cache of $M = N/10$ movies (10% of the library). What is the peak-hour shared-link load under popularity caching and worst-case demands (say, a new blockbuster release spreads demand across 1000 different titles)?
Apply the worst-case bound
$R \ge K(1 - M/N) = 1000 \times 0.9 = 900$ movies/delivery round. Caching saves only 10% of the peak load.
Preview the coded result
The MAN bound from Chapter 2 gives $R = \frac{K(1 - M/N)}{1 + KM/N} = \frac{900}{101} \approx 8.9$ movies/delivery round. A factor of roughly $100$ improvement, if the theory can be realized in a system. The rest of this book is about both the realization and the subtleties hidden in this "if."
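Plugging the example's numbers into both bounds makes the gap concrete. A quick sketch; the MAN rate formula is the one quoted from Chapter 2, and the variable names are ours:

```python
K, mu = 1000, 0.10                    # users; cache fraction M/N

r_uncoded = K * (1 - mu)              # uncoded worst-case load: 900 files
r_coded = r_uncoded / (1 + K * mu)    # MAN bound: 900/101, about 8.9 files

print(f"uncoded: {r_uncoded:.0f}, coded: {r_coded:.2f}, "
      f"gain: {r_uncoded / r_coded:.0f}x")
# prints: uncoded: 900, coded: 8.91, gain: 101x
```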
Identify the key gain
The factor $1 + KM/N$ is exactly the coded multicasting gain. For any caching system with enough users ($K$ large), popularity caching leaves most of the potential savings untouched. This is the operational meaning of the coded caching gain.
Multiplicative vs. Additive Gains
Caching provides two conceptually distinct gains:
- Local caching gain ($1 - M/N$). This is the classical hit-rate benefit: each user individually avoids fetching what it already has. It is available to any caching system and is captured fully by popularity caching.
- Coded multicasting gain ($\frac{1}{1 + KM/N}$). This is the combinatorial gain from exploiting coincidences across users' caches. It requires both combinatorial placement and coded delivery, and it is invisible to popularity caching.
The two gains multiply: under the MAN scheme, $$R_{\text{MAN}} = K \cdot \left(1 - \frac{M}{N}\right) \cdot \frac{1}{1 + KM/N}.$$ As we scale up the network ($K \to \infty$ at fixed $M/N$), the local gain is bounded but the coded gain grows unboundedly. This is why coded caching matters at scale.
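A short loop, with our own variable names, illustrates the factorization: the local gain is pinned at $1 - M/N$ while the coded gain scales with $K$, so the rate saturates at $(1 - \mu)/\mu$ where $\mu = M/N$:

```python
mu = 0.10                                # fixed cache fraction M/N

for K in (10, 100, 1_000, 10_000):
    local_gain = 1 - mu                  # bounded: always 0.9 here
    coded_gain = 1 + K * mu              # grows linearly in K
    rate = K * local_gain / coded_gain   # MAN rate, approaches (1-mu)/mu = 9
    print(f"K={K:>6}  coded gain={coded_gain:>7.0f}  rate={rate:.2f}")
```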
Caching Loses Traction as Libraries Grow
With fixed absolute cache size and Zipf-distributed demand, the hit ratio collapses as the library grows. This is the fundamental limitation of popularity caching in an era where content libraries grow faster than edge storage. Coded caching reframes the problem: the figure of merit is the ratio $M/N$, not the absolute cache size.
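The collapse is easy to see with a short computation. The function and its default exponent are our own illustrative choices, not quantities defined in the text:

```python
def zipf_hit_ratio(N: int, C: int, alpha: float = 0.8) -> float:
    """Fraction of Zipf(alpha) demand served by caching the C most
    popular of N files (idealized popularity caching)."""
    weights = [1 / r**alpha for r in range(1, N + 1)]
    return sum(weights[:C]) / sum(weights)

# Fixed absolute cache of 100 files while the library grows:
for N in (1_000, 10_000, 100_000):
    print(f"N={N:>7}  hit ratio={zipf_hit_ratio(N, C=100):.3f}")
```

The same 100-file cache serves a shrinking fraction of demand as $N$ grows, which is exactly the "collapse" described above.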
Common Mistake: Uncoded Caching Scales Linearly in K
Mistake:
Thinking that adding more users to a shared-link network with popularity caching improves per-user delivery cost (via "economies of scale").
Correction:
Exactly the opposite: under popularity caching, the shared-link load is $K(1 - M/N)$, linear in $K$. Adding users linearly degrades the per-round link usage. Coded caching turns this linear growth into a bounded one: for fixed memory ratio $\mu = M/N$, the rate approaches $(1 - \mu)/\mu$ as $K \to \infty$. The gain is unbounded in $K$, which is the killer feature of coded caching at CDN scale.
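The contrast can be sketched numerically with the two formulas above (variable names are ours):

```python
mu = 0.10                     # fixed memory ratio M/N
limit = (1 - mu) / mu         # coded-rate ceiling: 9 file units

for K in (100, 1_000, 100_000):
    uncoded = K * (1 - mu)                  # linear in K: degrades
    coded = K * (1 - mu) / (1 + K * mu)     # stays below `limit`
    print(f"K={K:>7}  uncoded={uncoded:>9.1f}  coded={coded:.3f}")
```

Scaling $K$ by 1000 multiplies the uncoded load by 1000 but moves the coded rate only marginally closer to its ceiling.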
Historical Note: Web Caching and the Zipf Assumption
The field of web caching in the 1990s–2000s was built on the assumption that web requests follow a Zipf distribution. Breslau et al.'s 1999 study of web traffic was seminal: it found Zipf exponents between 0.6 and 0.8 for page requests. Video streaming a decade later confirmed similar exponents (0.6–1.2, depending on the platform). These numbers are good for popularity caching, but only up to a point: even the larger measured exponents leave substantial demand in the long tail, where popularity caching cannot reach.
The industry's response was to make caches bigger and distribute them more widely. The Caire group's response, starting in 2013β2014, was to rethink the problem at the information-theoretic level. Both responses have merit; this book develops the latter.
Quick Check
Under uncoded caching with $K = 100$ users, $N = 1000$ files, $M = 100$ files, and worst-case distinct demands, what is the minimum achievable shared-link delivery rate (in file units)?
1
10
90
100
$R = K(1 - M/N) = 100 \times 0.9 = 90$. Local caching saves 10% of each user's load; distinct demands forbid multicast savings.