The Coded Caching Problem
Why Coded Caching?
The explosive growth of video streaming (Netflix, YouTube, and their successors) has made content delivery the dominant source of network traffic. The standard solution is uncoded caching: place popular files at edge caches close to users, so that frequently requested content can be served locally without burdening the backhaul. This is a purely local gain: each cached file benefits only the user who happens to request it.
In 2014, Maddah-Ali and Niesen discovered something remarkable: by designing the cache placement jointly across all users, the server can create coded multicast opportunities during delivery. The result is a global caching gain that scales with the number of users $K$, not just with the cache size $M$. A single coded transmission simultaneously satisfies the demands of multiple users, each of whom can decode its desired content using its cached side information.
The point is that caching is not just about storing popular files; it is about creating side information that enables efficient coded delivery. This is an information-theoretic insight with deep connections to index coding, broadcast channels, and network information theory.
Definition: The Maddah-Ali-Niesen Caching Model
The Maddah-Ali-Niesen Caching Model
A server stores $N$ files $W_1, \ldots, W_N$, each of size $F$ bits. There are $K$ users, each with a cache of size $MF$ bits ($0 \le M \le N$). The system operates in two phases:
Phase 1 (Placement, off-peak hours): The server fills the users' caches without knowledge of future demands. User $k$'s cache content is $Z_k = \phi_k(W_1, \ldots, W_N)$, where $\phi_k$ is the caching function satisfying $H(Z_k) \le MF$.
Phase 2 (Delivery, peak hours): Each user $k$ reveals its demand $d_k \in \{1, \ldots, N\}$. The server transmits a coded message $X$ of size $RF$ bits over a shared error-free link, where $R$ is the delivery load (in file units). User $k$ must decode $W_{d_k}$ from $(X, Z_k)$.
The goal is to minimize the worst-case delivery load $R^*(M) = \min \max_{d_1, \ldots, d_K} R$ over all placement and delivery strategies.
The model assumes: (1) the shared link is error-free with unlimited bandwidth (we remove this assumption in Section 27.3), (2) placement is done without knowledge of future demands (this is critical: if demands were known during placement, the problem would be trivial), and (3) all files have equal size.
Coded caching
A caching strategy where the cache placement is designed to create multicast coding opportunities during the delivery phase, achieving a global caching gain that scales with the number of users.
Related: Coded multicasting gain, Cache placement
Cache placement
The first phase of a caching protocol, where the server fills user caches without knowledge of future demands. In coded caching, the placement is designed to maximize multicast opportunities during delivery.
Related: Coded caching
Coded Caching: The Example
Example: Coded Caching with $N = 2$, $K = 2$
A server has $N = 2$ files, $A$ and $B$. There are $K = 2$ users, each with cache size $M = 1$ (one file). Compare uncoded and coded caching.
Uncoded caching
Placement: User 1 caches $A$, User 2 caches $B$ (or any fixed assignment).
Delivery: If user 1 wants $B$ and user 2 wants $A$, the server must send both files: $X = (B, A)$, a load of $2$ files. But user 1 already has $A$ and user 2 already has $B$! We are sending information that the users already possess.
Coded caching
Placement: Split each file into two halves: $A = (A_1, A_2)$, $B = (B_1, B_2)$.
- User 1 caches $(A_1, B_1)$ (first half of each file)
- User 2 caches $(A_2, B_2)$ (second half of each file)
Cache size: two half-files $= 1$ file $= M$. Correct.
Delivery: If user 1 wants $A$ (needs $A_2$) and user 2 wants $B$ (needs $B_1$): Server sends $X = A_2 \oplus B_1$ (XOR of the two missing halves).
- User 1 recovers $A_2 = X \oplus B_1$ (since it has $B_1$).
- User 2 recovers $B_1 = X \oplus A_2$ (since it has $A_2$).
Delivery load: $R = 1/2$ file. That is a $4\times$ reduction over the uncoded scheme above!
Key insight
The coded placement creates symmetric side information: each user has something the other needs. The XOR simultaneously serves both users. This is the simplest instance of the coded multicasting gain.
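The two-user exchange above can be checked in a few lines of Python. This is a minimal illustrative sketch (not from the text), using short byte strings as file halves and bytewise XOR as the coded transmission:

```python
def xor(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))

# Files A and B, each split into two halves (contents are placeholders).
A1, A2 = b"AAAA", b"aaaa"
B1, B2 = b"BBBB", b"bbbb"

cache1 = {"A1": A1, "B1": B1}   # user 1: first half of each file
cache2 = {"A2": A2, "B2": B2}   # user 2: second half of each file

# Demands: user 1 wants A (missing A2), user 2 wants B (missing B1).
X = xor(A2, B1)                 # one coded transmission of half a file

assert xor(X, cache1["B1"]) == A2   # user 1 cancels B1, recovers A2
assert xor(X, cache2["A2"]) == B1   # user 2 cancels A2, recovers B1
```

The single half-file message $X$ serves both users at once, which is exactly the coded multicasting gain in its smallest form.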
Definition: The MAN Scheme
The MAN Scheme
For integer-valued $t = KM/N$, the Maddah-Ali-Niesen (MAN) scheme operates as follows:
Placement: Each file $W_n$ is split into $\binom{K}{t}$ equal-sized sub-files, one for each subset $\mathcal{T} \subseteq \{1, \ldots, K\}$ with $|\mathcal{T}| = t$:
$$W_n = \left(W_{n,\mathcal{T}} : \mathcal{T} \subseteq \{1, \ldots, K\},\ |\mathcal{T}| = t\right).$$
User $k$ caches all sub-files $W_{n,\mathcal{T}}$ where $k \in \mathcal{T}$:
$$Z_k = \left(W_{n,\mathcal{T}} : n \in \{1, \ldots, N\},\ k \in \mathcal{T}\right).$$
Cache size verification: user $k$ stores $N\binom{K-1}{t-1}$ sub-files of size $F/\binom{K}{t}$, totaling $NF\binom{K-1}{t-1}/\binom{K}{t} = NF \cdot t/K = MF$ bits. Correct.
Delivery: For each subset $\mathcal{S} \subseteq \{1, \ldots, K\}$ with $|\mathcal{S}| = t + 1$, the server transmits:
$$X_{\mathcal{S}} = \bigoplus_{k \in \mathcal{S}} W_{d_k,\, \mathcal{S} \setminus \{k\}}.$$
This is an XOR of $t + 1$ sub-files. Each user $k \in \mathcal{S}$ has all sub-files in the XOR except $W_{d_k,\, \mathcal{S} \setminus \{k\}}$ (which it needs), so it can decode.
The delivery load is:
$$R_{\text{MAN}}(M) = \frac{\binom{K}{t+1}}{\binom{K}{t}} = \frac{K - t}{t + 1}.$$
Theorem: Delivery Load of the MAN Scheme
The MAN coded caching scheme achieves a delivery load of:
$$R_{\text{MAN}}(M) = K\left(1 - \frac{M}{N}\right) \cdot \frac{1}{1 + KM/N}$$
for $M = tN/K$ with $t \in \{0, 1, \ldots, K\}$, and by memory sharing (convex combination) for non-integer $t$. This can be decomposed as:
$$R_{\text{MAN}}(M) = \underbrace{K\left(1 - \frac{M}{N}\right)}_{\text{uncoded load}} \cdot \underbrace{\frac{1}{1 + KM/N}}_{\text{coded multicasting gain}}.$$
The first factor, $K(1 - M/N)$, is the load that uncoded caching would achieve (each user's missing fraction times the number of users). The second factor is the coded multicasting gain: $1 + KM/N = t + 1$ users are simultaneously served by each multicast message.
The coded multicasting gain $1 + KM/N$ grows linearly with the number of users $K$ (at fixed $M/N$). This is the fundamental surprise of coded caching: adding more users does not increase the load proportionally; instead, each new user contributes to more multicast opportunities. With $K = 100$ users and $M/N = 0.1$ (each user caches 10% of the library), the coded gain is $1 + 100 \cdot 0.1 = 11$, meaning each transmission serves 11 users simultaneously.
Number of transmissions
There are $\binom{K}{t+1}$ subsets of size $t + 1$. Each transmission is one XOR of $t + 1$ sub-files, each of size $F/\binom{K}{t}$ bits. Total load: $R = \binom{K}{t+1}/\binom{K}{t} = (K - t)/(t + 1)$ files.
Decodability
User $k \in \mathcal{S}$ needs sub-file $W_{d_k,\, \mathcal{S} \setminus \{k\}}$. For every other $j \in \mathcal{S}$, user $k$ has cached $W_{d_j,\, \mathcal{S} \setminus \{j\}}$ because $k \in \mathcal{S} \setminus \{j\}$ and $|\mathcal{S} \setminus \{j\}| = t$. So user $k$ can cancel all other terms in the XOR and recover its desired sub-file.
Memory sharing for non-integer $t$
For non-integer $t = KM/N$, write $M = \alpha M_1 + (1 - \alpha) M_2$ with $M_1 = \lfloor t \rfloor N/K$, $M_2 = \lceil t \rceil N/K$, and $\alpha \in (0, 1)$. Split each file into two parts with fractions $\alpha$ and $1 - \alpha$, and apply the MAN scheme with parameters $M_1$ and $M_2$ respectively. The resulting load is $\alpha R(M_1) + (1 - \alpha) R(M_2)$, which is the lower convex envelope of the integer-$t$ points.
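The full placement-and-delivery procedure can be exercised end to end for a small integer $t$. The sketch below is illustrative (function names `man_placement`, `man_delivery`, and `xor` are my own, not from the text); it uses random byte strings as files and checks both decodability and the transmission count:

```python
import itertools
import math
import secrets

def man_placement(K, t, files):
    """Split each file into C(K, t) sub-files W_{n,T}; user k caches those with k in T."""
    subsets_t = list(itertools.combinations(range(K), t))
    L = len(files[0]) // len(subsets_t)          # sub-file size in bytes
    sub = {(n, T): files[n][i * L:(i + 1) * L]
           for n in range(len(files)) for i, T in enumerate(subsets_t)}
    cache = [{key: val for key, val in sub.items() if k in key[1]}
             for k in range(K)]
    return sub, cache

def xor(blocks):
    """Bytewise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, v in enumerate(b):
            out[i] ^= v
    return bytes(out)

def man_delivery(K, t, sub, demands):
    """One XOR per (t+1)-subset S: X_S = XOR over k in S of W_{d_k, S\\{k}}."""
    return {S: xor([sub[(demands[k], tuple(j for j in S if j != k))] for k in S])
            for S in itertools.combinations(range(K), t + 1)}

# Example: N = K = 4, t = 1 (M = 1); each file split into C(4,1) = 4 sub-files.
K, t, N = 4, 1, 4
files = [secrets.token_bytes(4 * math.comb(K, t)) for _ in range(N)]
sub, cache = man_placement(K, t, files)
demands = [0, 1, 2, 3]                           # worst case: all demands distinct
X = man_delivery(K, t, sub, demands)

# Decodability: each user cancels its cached terms from every X_S it belongs to.
for k in range(K):
    for S, msg in X.items():
        if k in S:
            cached = [cache[k][(demands[j], tuple(i for i in S if i != j))]
                      for j in S if j != k]
            assert xor([msg] + cached) == sub[(demands[k], tuple(j for j in S if j != k))]

assert len(X) == math.comb(K, t + 1)             # C(K, t+1) transmissions
```

With $K = 4$ and $t = 1$ there are $\binom{4}{2} = 6$ transmissions of one sub-file each, matching the load $\binom{K}{t+1}/\binom{K}{t} = 6/4 = (K-t)/(t+1) = 3/2$ files.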
Memory-Load Tradeoff: Coded vs Uncoded Caching
Compare the delivery load of coded caching (MAN scheme) with uncoded caching as a function of cache size $M$. Observe how the coded multicasting gain increases with the number of users $K$.
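The two curves in the comparison can be computed directly. A sketch (helper names `r_uncoded` and `r_man` are illustrative), including the memory-sharing interpolation for non-integer $t$:

```python
import math

def r_uncoded(K, N, M):
    """Uncoded load: each user's missing fraction, summed over K users."""
    return K * (1 - M / N)

def r_man(K, N, M):
    """MAN load (K - t)/(t + 1) at integer t = K*M/N, with memory
    sharing (linear interpolation) between adjacent integer points."""
    t = K * M / N
    if t >= K:
        return 0.0
    point = lambda ti: (K - ti) / (ti + 1)
    t0 = math.floor(t)
    if t == t0:
        return point(t0)
    alpha = t - t0
    return (1 - alpha) * point(t0) + alpha * point(t0 + 1)

# K = 100 users, N = 100 files, M = 10 (M/N = 0.1): coded gain 1 + KM/N = 11.
K, N, M = 100, 100, 10
print(r_uncoded(K, N, M))   # -> 90.0 files
print(r_man(K, N, M))       # 90/11, roughly 8.18 files
```

Sweeping `M` from 0 to `N` reproduces the memory-load tradeoff: the uncoded curve falls linearly, while the MAN curve drops much faster as the multicasting gain $1 + KM/N$ kicks in.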
Historical Note: The Discovery of Coded Caching
2012-2014. Mohammad Ali Maddah-Ali and Urs Niesen, both at Bell Labs, published their coded caching result in 2014, though the first version appeared on arXiv in 2012. The paper was initially met with skepticism: the idea that caching could provide a gain proportional to $K$ (not just $M$) seemed too good to be true. But the proof was elegantly simple, a counting argument showing that the XOR structure works. The result opened a new subfield of information theory, generating hundreds of papers on variations: decentralized placement, heterogeneous caches, online caching, multi-server systems, and the wireless extensions we study in Section 27.3. Caire's group at TU Berlin made fundamental contributions to the wireless extensions, connecting coded caching to MIMO broadcasting and D2D communication.
Common Mistake: The Subpacketization Bottleneck
Mistake:
Assuming the MAN scheme is practical for large $K$. With $K = 40$ users and $t = 20$, each file must be split into $\binom{40}{20} \approx 1.4 \times 10^{11}$ sub-files. For a 1 GB file, each sub-file would be less than 1 bit.
Correction:
The subpacketization $\binom{K}{t}$ grows exponentially in $K$. This is the main practical limitation of coded caching. For a file of $F$ bits, we need $F \ge \binom{K}{t}$ for the scheme to operate. Research on reducing subpacketization includes: placement delivery arrays (Yan et al., 2017), combinatorial designs, and grouping users into clusters of manageable size. With clustering, the gain is $1 + gM/N$, where $g$ is the cluster size.
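The growth of $\binom{K}{t}$ can be tabulated directly. A quick illustrative computation at $t = K/2$ (the worst case for subpacketization):

```python
import math

GB_BITS = 8 * 10**9                  # one 1 GB file, in bits
for K in (10, 20, 30, 40):
    t = K // 2                       # M/N = 1/2, the largest binomial
    subfiles = math.comb(K, t)       # subpacketization C(K, t)
    # bits available per sub-file; for K = 40 this drops below 1 bit
    print(K, subfiles, GB_BITS / subfiles)
```

Going from $K = 20$ to $K = 40$ inflates the subpacketization from about $1.8 \times 10^5$ to about $1.4 \times 10^{11}$, which is why the scheme breaks down for a 1 GB file well before $K = 40$.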
Quick Check
In the MAN scheme with $N = 5$ files, $K = 5$ users, and $M = 1$ file per cache ($t = KM/N = 1$), how many multicast transmissions are needed in the delivery phase?
$\binom{5}{2} = 10$ transmissions, each serving 2 users
5 transmissions, one per user
1 transmission serving all 5 users
With $t = 1$, each transmission is an XOR addressed to $t + 1 = 2$ users. There are $\binom{5}{2} = 10$ such subsets. The delivery load is $R = 10/5 = 2$ files.
Coded multicasting gain
The factor by which coded caching reduces the delivery load compared to uncoded caching. Represents the number of users simultaneously served by each coded multicast transmission.
Related: Coded caching
Subpacketization
The number of sub-files each file is split into during the placement phase. In the MAN scheme, this is $\binom{K}{t}$, which grows exponentially and is the main practical limitation.
Related: Coded caching
Key Takeaway
Coded caching converts memory into bandwidth. The MAN scheme achieves delivery load $K(1 - M/N) \cdot \frac{1}{1 + KM/N}$, which is the uncoded load divided by the coded multicasting gain $1 + KM/N$. This gain is global (scales with $K$) rather than local (dependent only on $M$). The price is subpacketization: each file must be split into $\binom{K}{t}$ sub-files, which grows exponentially and is the main practical barrier.