The Coded Caching Problem

Why Coded Caching?

The explosive growth of video streaming β€” Netflix, YouTube, and their successors β€” has made content delivery the dominant source of network traffic. The standard solution is uncoded caching: place popular files at edge caches close to users, so that frequently requested content can be served locally without burdening the backhaul. This is purely a local gain: each cached file benefits only the user who happens to request it.

In 2014, Maddah-Ali and Niesen discovered something remarkable: by designing the cache placement jointly across all users, the server can create coded multicast opportunities during delivery. The result is a global caching gain that scales with the number of users $K$, not just with the cache size $M$. A single coded transmission simultaneously satisfies the demands of multiple users, each of whom can decode its desired content using its cached side information.

The point is that caching is not just about storing popular files β€” it is about creating side information that enables efficient coded delivery. This is an information-theoretic insight with deep connections to index coding, broadcast channels, and network information theory.

Definition:

The Maddah-Ali-Niesen Caching Model

A server stores $N$ files $W_1, W_2, \ldots, W_N$, each of size $F$ bits. There are $K$ users, each with a cache of size $MF$ bits ($M \in [0, N]$). The system operates in two phases:

Phase 1 β€” Placement (off-peak hours): The server fills the users' caches without knowledge of future demands. User $k$'s cache content is $Z_k = \phi_k(W_1, \ldots, W_N)$, where $\phi_k$ is the caching function satisfying $|Z_k| \le MF$.

Phase 2 β€” Delivery (peak hours): Each user $k$ reveals its demand $d_k \in [N]$. The server transmits a coded message $X = \psi(W_1, \ldots, W_N, d_1, \ldots, d_K)$ of size $RF$ bits over a shared error-free link, where $R$ is the delivery load (in file units). User $k$ must decode $W_{d_k}$ from $(X, Z_k)$.

The goal is to minimize the worst-case delivery load over all placement and delivery strategies:
$$R^*(M) = \min_{\phi, \psi} \max_{(d_1, \ldots, d_K)} R.$$

The model assumes: (1) the shared link is error-free with unlimited bandwidth (we remove this assumption in Section 27.3), (2) placement is done without knowledge of future demands (this is critical β€” if demands were known during placement, the problem would be trivial), and (3) all files have equal size.

Coded caching

A caching strategy where the cache placement is designed to create multicast coding opportunities during the delivery phase, achieving a global caching gain that scales with the number of users.

Related: Coded multicasting gain, Cache placement

Cache placement

The first phase of a caching protocol, where the server fills user caches without knowledge of future demands. In coded caching, the placement is designed to maximize multicast opportunities during delivery.

Related: Coded caching

Coded Caching: The $N=K=2$ Example

Step-by-step animation of coded caching with 2 files and 2 users. The placement phase splits files and distributes halves, then the delivery phase uses a single XOR multicast to serve both users simultaneously β€” a 4Γ— reduction in delivery load compared to no caching (and 2Γ— compared to uncoded caching).

Example: Coded Caching with $N=K=2$, $M=1$

A server has $N = 2$ files $A, B$. There are $K = 2$ users, each with cache size $M = 1$ (one file). Compare uncoded and coded caching.
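The coded solution to this example can be sketched in a few lines. The following is a toy illustration (file halves modeled as small integers so XOR is the `^` operator; the contents are arbitrary):

```python
# Split each file into two halves: A = (A1, A2), B = (B1, B2).
A1, A2 = 0b1010, 0b0111   # arbitrary example contents
B1, B2 = 0b1100, 0b0001

# Placement (demand-oblivious): user 1 caches (A1, B1), user 2 caches (A2, B2).
Z1 = {"A1": A1, "B1": B1}
Z2 = {"A2": A2, "B2": B2}

# Delivery for demand (d1, d2) = (A, B):
# user 1 is missing A2, user 2 is missing B1 -> one XOR serves both.
X = A2 ^ B1

# Each user decodes using its cached side information.
assert X ^ Z1["B1"] == A2   # user 1 recovers A2, hence all of A
assert X ^ Z2["A2"] == B1   # user 2 recovers B1, hence all of B
```

The single transmission $X$ has the size of half a file, so $R = 1/2$, versus $R = 1$ for uncoded caching and $R = 2$ with no caching at all.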

Definition:

The MAN Scheme

For integer-valued $t = KM/N \in \{0, 1, \ldots, K\}$, the Maddah-Ali-Niesen (MAN) scheme operates as follows:

Placement: Each file $W_n$ is split into $\binom{K}{t}$ equal-sized sub-files:
$$W_n = \{W_{n,\mathcal{T}} : \mathcal{T} \subset [K],\; |\mathcal{T}| = t\}.$$
User $k$ caches all sub-files where $k \in \mathcal{T}$:
$$Z_k = \{W_{n,\mathcal{T}} : k \in \mathcal{T},\; n \in [N]\}.$$
Cache size verification: user $k$ stores $N \binom{K-1}{t-1}$ sub-files of size $F/\binom{K}{t}$, totaling $N \cdot \frac{\binom{K-1}{t-1}}{\binom{K}{t}} \cdot F = \frac{t}{K} \cdot NF = MF$, as required.

Delivery: For each subset $\mathcal{S} \subset [K]$ with $|\mathcal{S}| = t+1$, the server transmits:
$$\bigoplus_{k \in \mathcal{S}} W_{d_k, \mathcal{S} \setminus \{k\}}.$$

This is an XOR of $t+1$ sub-files. Each user $k \in \mathcal{S}$ has cached every sub-file in the XOR except $W_{d_k, \mathcal{S} \setminus \{k\}}$ (which it needs), so it can decode.

The delivery load is:
$$R_{\text{MAN}}(M) = \binom{K}{t+1} \cdot \frac{1}{\binom{K}{t}} = \frac{K-t}{t+1} = \frac{K(1 - M/N)}{1 + KM/N}.$$
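As a sanity check on the counting argument, the sketch below enumerates the MAN multicast messages symbolically and verifies the load formula (the function names `man_load` and `man_messages` are ours, not standard library code):

```python
from itertools import combinations
from math import comb

def man_load(K: int, t: int) -> float:
    """Delivery load in file units: binom(K, t+1) messages of size 1/binom(K, t) each."""
    return comb(K, t + 1) / comb(K, t)

def man_messages(K: int, t: int, demands: dict) -> list:
    """List the MAN multicast messages as sets of sub-file labels.

    The message for a subset S (|S| = t+1) XORs the sub-files
    W_{d_k, S \ {k}} for each k in S; here each label is a
    (file, subset) pair rather than actual bits.
    """
    users = range(1, K + 1)
    return [
        [(demands[k], tuple(u for u in S if u != k)) for k in S]
        for S in combinations(users, t + 1)
    ]

# K = 5 users, t = 1 (e.g. N = 10, M = 2): 10 messages, each an XOR of 2 sub-files.
msgs = man_messages(5, 1, {k: k for k in range(1, 6)})
assert len(msgs) == comb(5, 2) == 10
assert all(len(m) == 2 for m in msgs)
assert man_load(5, 1) == (5 - 1) / (1 + 1) == 2.0
```

The assertions confirm that the number of messages matches $\binom{K}{t+1}$ and that the resulting load equals the closed form $(K-t)/(t+1)$.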

Theorem: Delivery Load of the MAN Scheme

The MAN coded caching scheme achieves a delivery load of:

$$R_{\text{MAN}}(M) = \frac{K(1 - M/N)}{1 + KM/N}$$

for $M = tN/K$ with $t \in \{0, 1, \ldots, K\}$, and by memory sharing (convex combination) for non-integer $t$. This can be decomposed as:

$$R_{\text{MAN}}(M) = \underbrace{K(1 - M/N)}_{\text{uncoded caching load}} \cdot \underbrace{\frac{1}{1 + KM/N}}_{\text{coded multicasting gain}}$$

The first factor is the load that uncoded caching would achieve (each user's missing fraction times the number of users). The second factor is the coded multicasting gain: $t + 1 = 1 + KM/N$ users are simultaneously served by each multicast message.

The coded multicasting gain $1 + KM/N$ grows linearly with the number of users $K$ (at fixed $M/N$). This is the fundamental surprise of coded caching: adding more users does not increase the load proportionally; instead, each new user contributes to more multicast opportunities. With $K = 100$ users and $M/N = 0.1$ (each user caches 10% of the library), the coded gain is $1 + 10 = 11$, meaning each transmission serves 11 users simultaneously.
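The decomposition can be checked numerically for the $K = 100$, $M/N = 0.1$ example (a quick sketch; `coded_load` is our name for the formula above):

```python
def coded_load(K: int, M: float, N: float) -> float:
    """R_MAN(M) = K(1 - M/N) / (1 + K*M/N), exact at integer t = K*M/N."""
    return K * (1 - M / N) / (1 + K * M / N)

K, M, N = 100, 10, 100           # M/N = 0.1, so t = K*M/N = 10 is an integer
uncoded = K * (1 - M / N)        # uncoded caching load: 90 files
gain = 1 + K * M / N             # multicast gain: 11 users per transmission
assert abs(coded_load(K, M, N) - uncoded / gain) < 1e-12
print(round(uncoded), round(gain), round(coded_load(K, M, N), 2))   # 90 11 8.18
```

Coded caching cuts the load from 90 files to about 8.2 files, an 11-fold reduction that would grow further with $K$.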

Memory-Load Tradeoff: Coded vs Uncoded Caching

Compare the delivery load $R(M)$ of coded caching (MAN scheme) with uncoded caching as a function of cache size $M$. Observe how the coded multicasting gain increases with the number of users $K$.


Historical Note: The Discovery of Coded Caching

2012-2014

Mohammad Ali Maddah-Ali and Urs Niesen, both at Bell Labs, published their coded caching result in 2014, though the first version appeared on arXiv in 2012. The paper was initially met with skepticism: the idea that caching could provide a gain proportional to KK (not just MM) seemed too good to be true. But the proof was elegantly simple β€” a counting argument showing that the XOR structure works. The result opened a new subfield of information theory, generating hundreds of papers on variations: decentralized placement, heterogeneous caches, online caching, multi-server systems, and the wireless extensions we study in Section 27.3. Caire's group at TU Berlin made fundamental contributions to the wireless extensions, connecting coded caching to MIMO broadcasting and D2D communication.

Common Mistake: The Subpacketization Bottleneck

Mistake:

Assuming the MAN scheme is practical for large $K$. With $K = 100$ users and $t = 10$, each file must be split into $\binom{100}{10} \approx 1.7 \times 10^{13}$ sub-files. For a 1 GB file, each sub-file would be far less than 1 bit.

Correction:

The subpacketization $\binom{K}{t}$ grows exponentially in $K$. This is the main practical limitation of coded caching. For a file of $F$ bits, we need $F \ge \binom{K}{t}$ for the scheme to operate. Research on reducing subpacketization includes: placement delivery arrays (Yan et al., 2017), combinatorial designs, and grouping users into clusters of manageable size. With clustering, the gain is $1 + K'M/N$, where $K' \ll K$ is the cluster size.
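A quick back-of-the-envelope calculation makes the bottleneck concrete (a sketch; the cluster parameters below are illustrative, not from a specific deployment):

```python
from math import comb

K, t = 100, 10
subpacketization = comb(K, t)          # sub-files per file in the MAN scheme
file_bits = 8 * 10**9                  # a 1 GB file
print(subpacketization)                # about 1.7e13 sub-files
print(file_bits / subpacketization)    # about 5e-4 bits per sub-file: infeasible

# Clustering workaround: run MAN within groups of K' users, so the
# subpacketization drops to comb(K', t') at the cost of the smaller
# gain 1 + K'*M/N.
K_cluster, t_cluster = 10, 1           # illustrative cluster size
print(comb(K_cluster, t_cluster))      # 10 sub-files per file: trivially feasible
```

The trade-off is stark: clustering reduces subpacketization from $\sim 10^{13}$ to 10, but the multicast gain shrinks from $t+1 = 11$ to $t'+1 = 2$.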

Quick Check

In the MAN scheme with $N = 10$ files, $K = 5$ users, and $M = 2$ files per cache ($t = KM/N = 1$), how many multicast transmissions are needed in the delivery phase?

$\binom{5}{2} = 10$ transmissions, each serving 2 users

5 transmissions, one per user

1 transmission serving all 5 users

Coded multicasting gain

The factor $1 + KM/N$ by which coded caching reduces the delivery load compared to uncoded caching. Represents the number of users simultaneously served by each coded multicast transmission.

Related: Coded caching

Subpacketization

The number of sub-files each file is split into during the placement phase. In the MAN scheme, this is $\binom{K}{t}$, which grows exponentially and is the main practical limitation.

Related: Coded caching

Key Takeaway

Coded caching converts memory into bandwidth. The MAN scheme achieves delivery load $R = K(1-M/N)/(1+KM/N)$, which is the uncoded load divided by the coded multicasting gain $1+KM/N$. This gain is global (scales with $K$) rather than local (dependent only on $M$). The price is subpacketization: each file must be split into $\binom{K}{t}$ sub-files, which grows exponentially and is the main practical barrier.