Fundamental Limits of Shared-Link Coded Caching
Is the MAN Scheme Optimal?
The MAN scheme achieves a delivery load of . But is this the best possible? Can a cleverer placement or a more sophisticated coding scheme do better?
For the case of uncoded placement (each user stores uncoded subsets of the files), the answer is: the MAN scheme is exactly optimal for the worst-case demand. This was proved by Wan, Tuninetti, and Piantanida (2020) via a clever induction argument. For general (possibly coded) placement, the MAN scheme is optimal within a constant factor of 2 (Yu and Maddah-Ali, 2018). The gap between coded and uncoded placement is still an open problem in general.
In this section, we present the converse for uncoded placement, which reveals the combinatorial structure underlying the optimality of the MAN scheme.
Memory-Load Tradeoff: Coded vs Uncoded
Definition: Uncoded Cache Placement
Uncoded Cache Placement
A cache placement is called uncoded if each user stores a subset of the file bits without any coding:
where is the -th bit of file . In other words, each cached bit is a bit of some file, not a coded combination of bits from different files.
The MAN placement is uncoded: each user stores sub-files where , which are just subsets of file bits.
Uncoded placement is a restriction, but a natural one: practical caching systems typically store file fragments, not coded combinations. Coded placement (e.g., storing ) can sometimes reduce the delivery load, but the gains are marginal (at most a factor of 2) and the complexity is substantially higher.
Theorem: Converse for Uncoded Cache Placement
For the coded caching problem with files, users, cache size , and uncoded cache placement, the worst-case delivery load satisfies:
Combined with the achievability of the MAN scheme, this establishes:
for and uncoded placement.
The converse says: any uncoded placement creates a side-information structure that is at best as useful as the MAN symmetric placement. The symmetric placement (where every -subset of users caches the same fraction) is the most "balanced" way to distribute side information, and any deviation from this balance can only hurt the worst-case delivery load.
Induction on $(K, t)$
The proof proceeds by induction on the number of users . The base case is trivial: . For the inductive step, consider users and fix a worst-case demand pattern where all users request distinct files.
Lower bound via acyclic index coding
Given the uncoded placement and a demand pattern, the delivery problem reduces to an index coding problem. The server has messages (the missing sub-files for each user), and each user has side information (its cached sub-files). The index coding converse gives:
for any ordering of users. Averaging over orderings and using the symmetry of the MAN placement yields the desired bound.
Tightness argument
The bound is achieved by the MAN delivery scheme, which precisely encodes the missing sub-files using XOR operations that respect the side-information structure. The combinatorial identity confirms the exact match.
Example: Converse Verification for , ,
Verify the converse bound for and (). Show that any uncoded placement requires delivery load .
MAN scheme achievability
.
Converse argument
With , each user caches bits out of total bits. In the worst case, all users request different files: .
User 1 needs bits of file 1, minus what it has cached from file 1. With uncoded placement, user 1 caches at most bits distributed across the three files.
By a counting argument: the total number of "useful" cached bits (bits that help satisfy demands) is at most (each bit can be useful for at most one user). So the server must deliver at least bits. But with coding, the load can be reduced to (by the index coding gain). The MAN bound is tight.
Theorem: Converse with Coded Placement (Yu-Maddah-Ali)
For the coded caching problem with arbitrary (possibly coded) placement, the worst-case delivery load satisfies:
for . This means the MAN scheme with uncoded placement is optimal within a factor of 2, even when compared to schemes that use arbitrarily complex coded placement.
This is a remarkable result: coded placement cannot improve by more than a factor of 2 over uncoded placement. The intuition is that the fundamental limitation comes from the structure of the demand (which files are requested), not the form of the cache content. Coded placement can create more efficient side information, but the improvement is bounded because the multicast opportunities are ultimately constrained by the demand combinatorics.
Cut-set argument
For any subset of users and any demand where users in request distinct files: (the delivery plus cache must cover all demanded files). This gives (trivial).
Refined bound via submodularity
The refined bound uses the submodularity of entropy and a careful accounting of how many bits of each file are "useful" across different demand patterns. By averaging over all worst-case demand patterns (where all users request distinct files), the factor-of-2 gap emerges from a combinatorial counting argument.
Coded Caching: Achievability and Converse Bounds
Compare the MAN achievable load, the uncoded placement converse, the Yu-Maddah-Ali general converse, and the uncoded caching baseline.
Parameters
Definition: Demand-Private Coded Caching
Demand-Private Coded Caching
In demand-private coded caching, the delivery message must not reveal any user's demand to the other users. Formally, for any two demand vectors and that differ only in user 's demand:
where denotes the caches of all users except .
Wan and Caire showed that demand privacy can be achieved with only a small increase in delivery load: the private scheme achieves , and the gap vanishes as .
Demand-Private Coded Caching
This work introduces the demand privacy constraint to coded caching. The main result is that the coded multicasting gain can be preserved even when each user's demand must be hidden from the others. The key technique is "virtual users" padding that masks the demand structure without significantly increasing the load.
Common Mistake: The Regime Is Different
Mistake:
Applying the MAN converse formula when (fewer files than users). In this regime, some users must request the same file, and the delivery load can be lower.
Correction:
When , the worst case is no longer all-distinct demands (which is impossible). The optimal scheme must account for demand collisions: if two users request the same file and one has cached different parts, the coded multicast can exploit this. The MAN scheme remains a good starting point but is not exactly optimal for . The optimal memory-load tradeoff for is fully characterized only for specific regimes.
Quick Check
The MAN scheme with uncoded placement is optimal (under the uncoded placement constraint) for all . What is the key property of the MAN placement that makes it optimal?
It caches the most popular files
It creates the most symmetric side-information structure across users
It uses coded placement to compress the cached content
The MAN placement ensures that every -subset of users caches the same fraction of every file. This maximal symmetry creates the maximum number of multicast opportunities for any demand pattern.
Practical Coded Caching: From Theory to Systems
Despite the elegant theory, practical deployment of coded caching faces challenges:
- Subpacketization: grows combinatorially. For : sub-files per file. Manageable for 1 GB files, but is intractable.
- Asynchronous demands: Users don't all reveal demands simultaneously. Online coded caching (where the server must respond before seeing all demands) loses some of the multicast gain.
- Heterogeneous caches: Not all users have the same cache size. Optimal schemes for heterogeneous are more complex but exist.
- File popularity: In practice, some files are much more popular than others (Zipf distribution). Popularity-aware coded caching combines coded placement for popular files with uncoded caching for less popular files.
Current research focuses on reducing subpacketization via combinatorial designs and grouping strategies, which trade some multicast gain for practical feasibility.
- β’
Subpacketization must satisfy bits per file
- β’
Practical video files: 0.5-10 GB (enough for typically)
- β’
CDN cache sizes: 1-100 TB per edge server
Key Takeaway
The MAN scheme is optimal under uncoded placement. For , the delivery load is the exact minimum under uncoded cache placement. Even with arbitrary coded placement, the load cannot be reduced by more than a factor of 2. File correlation reduces the effective library size, and demand privacy can be achieved with negligible overhead. The main practical barrier is subpacketization, not optimality.