The Demand Privacy Model

Why Privacy Matters in Coded Caching

Traditional coded caching has a subtle privacy issue: in the MAN delivery, each user must know the full demand vector d=(d1,…,dK)\mathbf{d} = (d_1, \ldots, d_K) to decode its own file. User 1 XOR-cancels using Wd2,SW_{d_2, \mathcal{S}} from its cache β€” but to do so, it must know the file index d2d_2. User 1 therefore learns what users 2, 3, ..., K are watching.

For a CDN operator, this may seem innocuous (users knowing each other's anonymous demands). But in real deployments, demands can reveal personal information: viewing habits, medical searches, political preferences. Information-theoretic privacy requires that the delivery mechanism leaks zero information about others' demands to any individual user.

Remarkably, the Wan-Caire CommIT result shows that demand privacy can be achieved at zero rate cost in the shared-link setting. The MAN rate R=K(1βˆ’M/N)/(1+KM/N)R = K(1-M/N)/(1+KM/N) remains achievable, with privacy added via shared randomness. This is the central message of Chapter 12.

Definition:

Information-Theoretic Demand Privacy

A coded caching scheme achieves demand privacy if, for every user kk, the received delivery message XdX_{\mathbf{d}} together with user kk's cache Zk\mathcal{Z}_k and own demand dkd_k is statistically independent of the other users' demands dβˆ’k=(d1,…,dkβˆ’1,dk+1,…,dK)\mathbf{d}_{-k} = (d_1, \ldots, d_{k-1}, d_{k+1}, \ldots, d_K): I(dβˆ’k;β€…β€ŠXd,Zk∣dk)β€…β€Š=β€…β€Š0,βˆ€k∈[K].I(\mathbf{d}_{-k};\; X_{\mathbf{d}}, \mathcal{Z}_k \mid d_k) \;=\; 0, \quad \forall k \in [K]. Equivalently: user kk's posterior over dβˆ’k\mathbf{d}_{-k} given its observations equals the prior.

This is a strict information-theoretic privacy definition. No amount of computation can reveal dβˆ’k\mathbf{d}_{-k} from XdX_{\mathbf{d}} and Zk\mathcal{Z}_k. It is stronger than cryptographic privacy (which only limits polynomial-time adversaries).

Definition:

Adversary Models

The privacy guarantee depends on who the adversary is:

  1. Single-user adversary. One user kk tries to learn dβˆ’k\mathbf{d}_{-k} from its cache + received message. (Most common setting; Wan-Caire 2021.)
  2. Coalition of colluding users. A group ZβŠ†[K]\mathcal{Z} \subseteq [K], ∣Z∣=z|\mathcal{Z}| = z, jointly tries to learn demands of non-colluding users. (D2D setting; Wan-Sun-Ji-Tuninetti-Caire.)
  3. Server adversary. The server itself tries to learn users' demands. In the shared-link model, the server has the demand vector by construction, so this is not applicable. In D2D or cloud-RAN models, it can matter.
  4. Eavesdropper adversary. An outside observer who sees the broadcast messages but not any cache. Simpler to analyze; typically achieved by trivial encryption.

Chapter 12 focuses on (1) and (2) β€” the interesting cases.

Why the MAN Scheme Leaks

In the standard MAN delivery of Chapter 2, user 1's decoding of Wd1W_{d_1} from XSβ€²=⨁k∈Sβ€²Wdk,Sβ€²βˆ–{k}X_{\mathcal{S}'} = \bigoplus_{k \in \mathcal{S}'} W_{d_k, \mathcal{S}' \setminus \{k\}} requires knowing the indices {dk:k∈Sβ€²}\{d_k : k \in \mathcal{S}'\}. From these indices, user 1 learns exactly what other users are watching. Concretely:

  • User 1 sees the message Wd2,{1,3}βŠ•Wd3,{1,2}βŠ•Wd1,{2,3}W_{d_2, \{1,3\}} \oplus W_{d_3, \{1,2\}} \oplus W_{d_1, \{2,3\}} for Sβ€²={1,2,3}\mathcal{S}' = \{1, 2, 3\}.
  • To recover Wd1,{2,3}W_{d_1, \{2,3\}}, user 1 needs to XOR out Wd2,{1,3}W_{d_2, \{1,3\}} and Wd3,{1,2}W_{d_3, \{1,2\}}.
  • User 1 knows which file each summand is (it read the "label" d2,d3d_2, d_3 to find the right subfiles in its cache).

Hence the MAN scheme leaks dβˆ’k\mathbf{d}_{-k} to each user β€” not zero leakage.

The Wan-Caire scheme avoids this by masking the summand labels with shared randomness. We develop this in Β§12.2.

Example: Quantifying Leakage in MAN

For K=10K = 10, N=100N = 100, under standard MAN delivery with user 1 observing XdX_{\mathbf{d}} and its cache, compute the leakage I(dβˆ’1;Xd,Z1)I(\mathbf{d}_{-1}; X_{\mathbf{d}}, \mathcal{Z}_1).

Definition:

Shared Randomness

The Wan-Caire private scheme relies on shared randomness β€” a secret key K\mathcal{K} known to all users but unknown to an external adversary. Specifically, users receive a random permutation (or similar structure) via a pre-shared key. The server uses this randomness to mask the file-index labels in the delivery messages, so that users cannot identify which file is which without the shared key.

The key K\mathcal{K} is pre-distributed offline (during placement) at zero delivery-phase cost. Each user holds its own share of the key.

Practically, shared randomness is established via secure key exchange protocols (Diffie-Hellman, PKI). Costs are operational (key management) rather than communication-rate.

Private coded caching setup

Private coded caching setup
Server holds library W\mathcal{W}. Each user has a cache Zk\mathcal{Z}_k and a share of shared randomness Kk\mathcal{K}_k. The server computes delivery Xd,KX_{\mathbf{d}, \mathcal{K}} β€” masked by shared randomness β€” broadcast to all users. Each user decodes using its cache and key share; cannot learn others' demands.

Key Takeaway

Demand privacy is free in the information-theoretic sense. The Wan-Caire 2021 result establishes that the MAN rate R=K(1βˆ’M/N)/(1+KM/N)R = K(1-M/N)/ (1+KM/N) is achievable with zero leakage about other users' demands. The cost is only in the shared-randomness setup, which is asymptotically negligible for large file sizes. This is a remarkable positive result: privacy and rate are not in tension.

⚠️Engineering Note

Information-Theoretic vs Cryptographic Privacy

A key distinction:

  1. Information-theoretic privacy. Adversary has unbounded computation. Security is absolute β€” no amount of computation can break it. Requires shared randomness as long as the private content.
  2. Cryptographic privacy. Adversary bounded to polynomial time. Security assumes hardness of some math problem (factoring, discrete log, etc.). Public-key encryption, RSA, etc. require no shared randomness but are breakable with enough compute.

Information-theoretic privacy is stronger but more expensive (shared randomness per message). For coded caching, the Wan- Caire scheme uses the fact that shared randomness is essentially free in the placement phase β€” users establish keys offline; keys cost no delivery bandwidth.

For applications where post-quantum security matters (where cryptographic assumptions might eventually fail), information- theoretic privacy is preferred despite the shared-randomness overhead.

Practical Constraints
  • β€’

    Shared randomness ~ K log(N) bits per delivery round

  • β€’

    Pre-distributed via key exchange (Diffie-Hellman typical)

  • β€’

    Wan-Caire achieves zero leakage with no delivery overhead

  • β€’

    Cryptographic schemes (AES, RSA) easier but break under quantum adversary