Placement Phase: Combinatorial Subfile Assignment

Why This Particular Placement?

The MAN placement is a combinatorial design: each subfile goes to exactly tt caches, chosen by the identity of the subset S\mathcal{S}. One could imagine many other placements (random, content-aware, hierarchical) but the MAN choice is special for a precise reason: it is the unique (up to symmetries) uncoded placement that equalizes, across users, the number of subfiles held by every pair. This symmetry is what makes the delivery phase so clean.

A designer could complain that the placement is rigid β€” it must be exactly right for the delivery to work. Chapters 14 and 18 show alternative placements (PDAs, multi-access, decentralized) that relax this rigidity at the cost of different tradeoffs.

MAN Placement Algorithm

Complexity: Storage per user: ∣Zk∣=Nβ‹…(Kβˆ’1tβˆ’1)|\mathcal{Z}_k| = N \cdot \binom{K-1}{t-1} subfiles, each of size F/(Kt)F/\binom{K}{t} bits. Total: MFMF bits. βœ“ Off-peak bandwidth: KMFKM F bits (every subfile transmitted to the tt users that cache it).
Input: Library {W1,…,WN}\{W_1, \ldots, W_N\} with each Wn∈F2FW_n \in \mathbb{F}_2^F; parameters K,tK, t with t=KM/N∈{0,1,…,K}t = KM/N \in \{0, 1, \ldots, K\}.
Output: Caches Z1,…,ZK\mathcal{Z}_1, \ldots, \mathcal{Z}_K.
1. Compute Fsub←F/(Kt)F_{\text{sub}} \leftarrow F / \binom{K}{t}. // subfile size in bits
2. for each file WnW_n, n=1,…,Nn = 1, \ldots, N do
3. \quad Partition WnW_n into (Kt)\binom{K}{t} equal-sized blocks,
4. \quad labeled Wn,SW_{n,\mathcal{S}} for each SβŠ†[K]\mathcal{S} \subseteq [K] with ∣S∣=t|\mathcal{S}| = t.
5. end for
6. for k=1k = 1 to KK do
7. Zk←{ Wn,S:n∈[N],β€…β€ŠSβˆ‹k,β€…β€Šβˆ£S∣=t }\quad \mathcal{Z}_k \leftarrow \{\, W_{n,\mathcal{S}} : n \in [N],\; \mathcal{S} \ni k,\; |\mathcal{S}| = t \,\}
8. end for
9. return (Z1,…,ZK)(\mathcal{Z}_1, \ldots, \mathcal{Z}_K).

Theorem: Cache Size Consistency

Under MAN placement, each user kk caches exactly Mβ‹…FM \cdot F bits, i.e., MM files' worth of content, as required.

The Symmetry That Matters

Two critical symmetries of the MAN placement:

Subfile-level symmetry. Every subfile Wn,SW_{n,\mathcal{S}} is cached by exactly tt users. This is visible in the placement heatmap as row-wise balance.

User-level symmetry. Every user caches exactly (Kβˆ’1tβˆ’1)\binom{K-1}{t-1} subfiles per file. This is the column-wise balance.

Together, these two symmetries imply that for every pair (k,S)(k, \mathcal{S}) with kβˆ‰Sk \notin \mathcal{S} and ∣S∣=t|\mathcal{S}| = t, user kk does not have subfile Wn,SW_{n,\mathcal{S}}. When user kk requests file WdkW_{d_k}, the missing subfiles it needs are exactly those indexed by tt-subsets that do not contain kk. There are (Kβˆ’1t)\binom{K-1}{t} such subsets. This count will reappear in the delivery analysis.

Example: Example Cache Contents: K=4K = 4, t=2t = 2

For K=4K = 4 users and t=2t = 2 (implying M/N=1/2M/N = 1/2), enumerate user 1's cache contents for each of NN files.

Key Takeaway

User kk's cache content is uniform over its support, because the subfiles are mutually independent (different files are independent, same-file different-subfile are too under our assumption H(Wn)=F\mathrm{H}(W_n) = F). Hence H(Zk)=MF\mathrm{H}(\mathcal{Z}_{k}) = MF bits exactly β€” the cache budget is fully used with no slack.

πŸ”§Engineering Note

The Off-Peak Cost of Placement

A subtle point in the placement analysis: the MAN scheme requires the server to transmit every subfile Wn,SW_{n,\mathcal{S}} to each of the tt users in S\mathcal{S}. Total off-peak traffic: PlacementΒ bitsβ€…β€Š=β€…β€ŠNβ‹…(Kt)β‹…tβ‹…F(Kt)β€…β€Š=β€…β€ŠNtFβ€…β€Š=β€…β€ŠKMF.\text{Placement bits} \;=\; N \cdot \binom{K}{t} \cdot t \cdot \frac{F}{\binom{K}{t}} \;=\; N t F \;=\; KMF. That is, one full library's worth for each user β€” the cache is entirely populated, as it must be.

In real CDN deployments the placement is done over a long period (days) and the delivery savings amortize over many delivery rounds. The economic argument for coded caching depends on the ratio of these two time scales.

Practical Constraints
  • β€’

    Placement cost: KMFKMF bits per library refresh

  • β€’

    Delivery cost per round: K(1βˆ’M/N)/(1+KM/N)β‹…FK(1-M/N)/(1+KM/N) \cdot F bits

  • β€’

    Break-even: ratio of peak rounds to library refresh frequency must be large

Common Mistake: All Caches Are Identical in Structure, Different in Content

Mistake:

Saying 'all users cache the same subfiles' because the scheme is symmetric.

Correction:

The structure is identical β€” every user caches subfiles whose index sets contain that user, and every user caches the same number (N(Kβˆ’1tβˆ’1))(N \binom{K-1}{t-1}) of subfiles. But the specific subfiles differ: user 1 caches Wn,{1,2}W_{n,\{1,2\}}; user 2 caches Wn,{1,2}W_{n,\{1,2\}} as well (since 2∈{1,2}2 \in \{1,2\}); user 3 does not cache Wn,{1,2}W_{n,\{1,2\}}. The combinatorial interleaving of caches is what lets the delivery XOR messages work.