Toward Coded Caching: Preview of the Memory–Load Tradeoff

What Coded Caching Gets You — A Preview

We close Chapter 1 with a preview of the promise of coded caching. Suppose we allow ourselves two things that uncoded caching forbids:

  1. Coded placement: each user's cache can hold combinations of file fragments, not entire files.
  2. Coded delivery: the server's transmission can be a function of multiple files, not individual unicasts.

The Maddah-Ali–Niesen theorem (Chapter 2) says that with a particular combinatorial placement and a particular XOR-based delivery, the worst-case delivery rate satisfies $R_{\text{MAN}}(M) = K \cdot \frac{1 - M/N}{1 + KM/N}$. Compare with the uncoded bound $K(1 - M/N)$: the MAN scheme divides the uncoded rate by $1 + t$, where $t = KM/N$.
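To make the comparison concrete, here is a quick numerical sanity check of the two formulas (a sketch; the parameter values are illustrative, not from the text):

```python
# Worst-case rates for illustrative parameters: K users, N files, M cache files.
K, N, M = 20, 50, 5

t = K * M / N                          # coded caching gain, t = 2
r_uncoded = K * (1 - M / N)            # uncoded bound: 18 file transmissions
r_man = K * (1 - M / N) / (1 + t)      # MAN rate: 6 file transmissions

print(r_uncoded, r_man, r_uncoded / r_man)  # the ratio is exactly 1 + t = 3
```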

The parameter $t$ — the coded caching gain — is the golden thread of this book. Every chapter will touch it.

Definition:

Coded Caching Gain

The coded caching gain (or global caching gain) is $t \triangleq \frac{KM}{N} = K \cdot \mu$. In the MAN scheme, each coded multicast message simultaneously satisfies $t + 1$ users. For integer $t$ this is exact; for non-integer $t$ the scheme is defined by memory sharing (convex combination) between the two nearest integer points.
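The memory-sharing rule in the definition can be written out directly. A minimal sketch (the helper name `man_rate` is ours), using the fact that at integer $t$ the MAN rate $K(1 - M/N)/(1 + KM/N)$ simplifies to $(K - t)/(1 + t)$:

```python
import math

def man_rate(K: int, N: int, M: float) -> float:
    """Worst-case MAN rate: (K - t)/(1 + t) at integer t = K*M/N,
    extended to non-integer t by memory sharing (linear interpolation
    between the two nearest integer points)."""
    t = K * M / N
    rate = lambda ti: (K - ti) / (ti + 1)   # = K(1 - ti/K)/(1 + ti)
    lo, hi = math.floor(t), math.ceil(t)
    if lo == hi:                             # t is already an integer
        return rate(lo)
    w = hi - t                               # weight on the lower point
    return w * rate(lo) + (1 - w) * rate(hi)

print(man_rate(20, 50, 5))                   # 6.0  (t = 2, integer)
print(round(man_rate(4, 4, 1.5), 4))         # 1.0833  (t = 1.5, interpolated)
```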

$t$ captures the aggregate cache in the network, measured in library units: with $K$ users each storing $M$ files, the total cached content is $KM$ files. Dividing by the library size $N$ gives a dimensionless quantity — the number of "library copies" distributed across the network.

Key Takeaway

The coded caching gain is the whole story. Uncoded caching gives you $(1 - M/N)$ — a per-user saving. Coded caching additionally gives you $(1 + t)^{-1}$ — a multiplicative saving that grows with the number of users. For large networks, the coded gain dominates.

The Memory–Load Tradeoff (Preview)

Preview of the memory–load tradeoff curve that will dominate Chapters 2–3. The blue curve is the MAN achievable rate; the red dashed line is the uncoded bound. The ratio between the two curves, $1 + KM/N$, is the coded caching gain. Notice that the gain grows as $K$ grows: at fixed $M/N$, it increases linearly with the user population.
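The plot is not reproduced here, but the two curves are easy to tabulate. A sketch with illustrative parameters $K = N = 10$:

```python
# Tabulate both curves of the memory-load tradeoff: the uncoded bound
# (red dashed line) vs. the MAN achievable rate (blue curve).
K, N = 10, 10                                # illustrative parameters
points = []
for M in range(0, N + 1, 2):
    r_unc = K * (1 - M / N)                  # uncoded bound
    r_man = r_unc / (1 + K * M / N)          # MAN rate: uncoded / (1 + t)
    points.append((M, r_unc, round(r_man, 3)))

for M, r_unc, r_man in points:
    print(f"M={M:2d}  uncoded={r_unc:4.1f}  MAN={r_man:6.3f}")
```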


Why Coding Helps — A Capsule Argument

Here is the capsule intuition for why coded caching can beat uncoded caching. Consider $K = 2$ users and $N = 2$ files, each of one bit. Each user has a cache of $M = 1$ bit.

Uncoded: User 1 caches $W_1$, user 2 caches $W_2$. If demands are $(W_2, W_1)$ (worst case), the server must send one full bit to each user — total rate $R = 2$. If demands are $(W_1, W_2)$ (lucky), the rate is $R = 0$. The symmetric uncoded placement, in which each user caches half of each file, evens these cases out and achieves the worst-case rate $R_{\text{unc}} = 2 \cdot (1 - 1/2) = 1$. Note the waste in the whole-file placement: the copy of $W_1$ in user 1's cache helps only when user 1 happens to request $W_1$; on a miss, user 1 still needs all of $W_2$.

Coded: Split each file into halves: $W_1 = (a_1, a_2)$, $W_2 = (b_1, b_2)$. User 1 caches $(a_1, b_1)$; user 2 caches $(a_2, b_2)$. For the demand pair $(W_1, W_2)$, user 1 is missing $a_2$ and user 2 is missing $b_1$, so the server sends the single XOR message $a_2 \oplus b_1$: user 1 cancels its cached $b_1$ to recover $a_2$, while user 2 cancels its cached $a_2$ to recover $b_1$. Every demand pair is served by one such half-bit message, giving worst-case rate $R = 1/2$ — half the uncoded rate, exactly the promised factor $1 + t = 2$. The full combinatorial structure for general $K$ is the subject of Chapter 2, but the spirit is here: by distributing complementary partial information to each user, a single XOR multicast message can serve both users simultaneously.
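The two-user argument can be checked mechanically. A sketch in which each file is scaled to two bits so that each half is a whole bit (variable names are ours):

```python
# Files: W1 = (a1, a2), W2 = (b1, b2), one bit per half.
a1, a2 = 1, 0
b1, b2 = 1, 1

# Placement: user 1 caches the first half of each file, user 2 the second.
cache1 = {"a1": a1, "b1": b1}
cache2 = {"a2": a2, "b2": b2}

# Demands (W1, W2): user 1 is missing a2, user 2 is missing b1.
x = a2 ^ b1                      # the single XOR multicast message

decoded_a2 = x ^ cache1["b1"]    # user 1 cancels its cached b1
decoded_b1 = x ^ cache2["a2"]    # user 2 cancels its cached a2

print(decoded_a2 == a2, decoded_b1 == b1)  # True True
```

One transmission of half a file serves both users, matching $R_{\text{MAN}}(1) = 1/2$.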

Uncoded vs. Coded Caching at a Glance

| Property | Uncoded (popularity) caching | Coded caching (MAN) |
| --- | --- | --- |
| Placement granularity | Whole files | Subfiles (combinatorial) |
| Delivery format | Unicast per user | XOR multicast ($t+1$ users per message) |
| Worst-case rate | $K(1-M/N)$ | $K(1-M/N)/(1+KM/N)$ |
| Gain over no caching | Local: $1-M/N$ | Local $\times$ coded: $(1-M/N)/(1+KM/N)$ |
| Scaling in $K$ | Linear in $K$ | Bounded as $K \to \infty$ |
| Subpacketization | 1 (whole files) | $\binom{K}{t}$ (exponential in $K$) |
| Demand-dependent? | Depends on popularity distribution | Demand-agnostic (worst-case) |
| Optimal within its class? | Yes (within uncoded) | Optimal under uncoded placement (YMA '18) |

Why This Matters: Coded Caching as an Information-Theoretic Primitive

Coded caching is more than a CDN trick — it is a fundamental information-theoretic primitive in which side information (the caches) reduces capacity requirements (the shared link). The general principle — that correlated side information across receivers lets a single transmission simultaneously satisfy many of them — connects to classical results in network information theory: the index coding problem (a close cousin, treated in Chapter 4), Gaussian broadcast with message sets (Chapter 5), and coded computing (Chapter 15). The shared-link model is the cleanest setting in which to develop the machinery.

⚠️Engineering Note

Practical Barriers We Will Address

Before you get too enthusiastic about a $10^5\times$ gain from the Netflix thought experiment, here are the practical barriers that will animate later chapters:

  1. Subpacketization ($\binom{K}{t}$). The MAN scheme requires each file to be split into $\binom{K}{t}$ pieces. For $K = 1000$ and $t = 100$, $\binom{K}{t} \approx 10^{139}$. No real system can operate with that many subfiles. Chapter 14 introduces polynomial-subpacketization schemes (PDAs, graph coloring).
  2. Non-uniform demand / popularity. The MAN guarantee is a worst-case statement that treats all files alike, while real request traffic is highly skewed. Chapter 13 handles Zipf popularity and heterogeneous caches.
  3. Wireless channel effects. Chapter 5 replaces the error-free shared link with a multi-antenna Gaussian BC, where the gain becomes a degrees-of-freedom result, $\mathrm{DoF} = t + L$ — a much richer theory.
  4. Privacy. With vanilla MAN, each user must learn the demands of all other users to decode. Chapter 12 shows how to preserve the gain while hiding demands.
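To get a feel for barrier 1, the subpacketization count can be evaluated exactly with Python's arbitrary-precision integers (a sketch; parameters as in the text):

```python
import math

K, t = 1000, 100
S = math.comb(K, t)                  # MAN subpacketization, binom(K, t)
print(f"binom({K}, {t}) has {len(str(S))} decimal digits")

# By contrast, a modest system is entirely practical:
print(math.comb(20, 2))              # K = 20, t = 2 needs only 190 subfiles
```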

The theory in this book always acknowledges these gaps. Caire's approach — shared throughout — is to state the idealized result sharply, then honestly report where practice departs from theory, and what work closes the gap.

Practical Constraints
  • MAN subpacketization is exponential in $K$

  • Demand distributions are non-uniform and time-varying

  • Wireless channels are noisy and frequency-selective

  • Privacy requires extra randomness (shared secret keys)

Quick Check

A coded caching system has $K = 20$ users, $N = 50$ files, and a per-user cache of $M = 5$ files. What is the multiplicative gain factor of the MAN scheme over uncoded caching?

$1 + 5/50 = 1.1$

$1 + 20 \cdot 5/50 = 3$

$1 + 20/50 = 1.4$

$1 + 20 \cdot 5 = 101$