Cache-Aided PIR

From Side Information to Caches

Section §15.1 considered side info as a set of complete files. In practice, the user's cache often contains partial file content — a generic cache is some function $Z = \phi(W_1, \ldots, W_K)$ of the library, not necessarily full files.

Cache-aided PIR generalizes side-info PIR to arbitrary cache contents. The cache is populated in a placement phase before the user knows $\theta$; during delivery, the PIR protocol exploits whatever the cache contains.

The general capacity is open. We focus on two important regimes: (i) uncoded caching (the cache stores raw file segments), and (ii) MAN-style (Maddah-Ali–Niesen) coded caching, where multiple users share the placement design (relevant for §15.3, the CommIT contribution).

Definition:

Cache-Aided PIR Protocol

A cache-aided PIR protocol consists of:

  1. Placement phase (offline, before the user knows $\theta$): the databases or the system generate a cache content $Z$ of size $|Z| = \gamma \cdot K \cdot L$ bits, where $\gamma \in [0, 1]$ is the cache fraction. The cache is delivered to the user.

  2. Delivery phase (online, after the user picks $\theta$): the user sends queries to the databases; the databases respond with answers. The user combines the cache and the answers to decode $W_\theta$.

Privacy: $I(\theta; Q^{(\theta, n)}) = 0$ for each database $n$.

Cache-aided PIR rate (delivery rate): $R(\gamma) = \frac{L}{D}$, where $D$ is the delivery-phase download. Note: this excludes the placement-phase transmission, which is amortized across queries.

Theorem: Cache-Aided PIR with Uncoded Prefetching (Wei–Banawan–Ulukus 2019)

For PIR with $K$ files, $N$ databases, and a user cache $Z$ of size $\gamma K L$ bits populated via uniformly random uncoded prefetching (each bit independently cached with probability $\gamma$), the delivery-phase PIR capacity is
$$C_{\text{cache-PIR}}(N, K, \gamma) \;=\; \sum_{i=0}^{K-1} \binom{K-1}{i} \gamma^i (1-\gamma)^{K-1-i} \, C_{\text{PIR}}(N, K - i),$$
where $C_{\text{PIR}}(N, k) = \left(1 + 1/N + \cdots + 1/N^{k-1}\right)^{-1}$.

Operational interpretation: the capacity is a $\gamma$-binomial mixture of the effective-$K$ rates $C_{\text{PIR}}(N, K - i)$, weighted by the probability that, at a given bit position, exactly $i$ of the $K - 1$ other files are cached.

Example: Cache-Aided PIR at $N = 4$, $K = 10$

Compute the cache-aided PIR rate for $\gamma \in \{0, 0.25, 0.5, 0.75, 1\}$.
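The requested values follow directly from the theorem's formula. A minimal sketch in Python (the function names are ours, not from the paper):

```python
from math import comb

def c_pir(N: int, k: int) -> float:
    """Classical Sun-Jafar PIR capacity for k files and N databases."""
    return 1.0 / sum((1.0 / N) ** j for j in range(k))

def c_cache_pir(N: int, K: int, gamma: float) -> float:
    """Delivery-phase rate: binomial mixture of effective-K capacities."""
    return sum(
        comb(K - 1, i) * gamma**i * (1.0 - gamma) ** (K - 1 - i) * c_pir(N, K - i)
        for i in range(K)
    )

for gamma in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"gamma = {gamma:4.2f}  ->  rate = {c_cache_pir(4, 10, gamma):.4f}")
```

At these parameters the rates come out to approximately $0.7500$, $0.7501$, $0.7529$, $0.7844$, and $1.0000$: the gain over classical PIR is negligible for small $\gamma$ and concentrates near $\gamma = 1$.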

Cache-Aided PIR: Cache Size vs. Rate

Plot the cache-aided PIR rate $C_{\text{cache-PIR}}(N, K, \gamma)$ as a function of the cache fraction $\gamma$ for fixed $N$ and $K$. The curve is monotone increasing in $\gamma$, starting from the classical Sun-Jafar value at $\gamma = 0$ and reaching $1$ at $\gamma = 1$. The convexity reflects the binomial structure: the curve stays nearly flat for small $\gamma$ and rises sharply as $\gamma \to 1$.


Cache-Aided vs. Classical PIR — Operational Cases

| Setting | Cache Fraction $\gamma$ | PIR Rate | Use Case |
| --- | --- | --- | --- |
| Cold start | $0$ | $0.750$ (classical) | First retrieval, no cache |
| Browser cache | $\sim 0.1$ | $\sim 0.750$ | Negligible improvement |
| CDN edge cache | $\sim 0.5$ | $\sim 0.753$ | Slight improvement |
| Pinned/dedicated cache | $\sim 0.9$ | $\sim 0.87$ | Noticeable improvement |
| Cache hit | $1$ (file in cache) | $1$ (no PIR query needed) | Local response only |

(Rates evaluated from the theorem's formula at $N = 4$, $K = 10$; the improvement is concentrated near $\gamma = 1$.)

Definition:

Cache-Aided PIR with Unknown Caches

In this variant, the databases do not know the user's cache content $Z$ (or the placement strategy). The user must issue PIR-style queries that work for any realization of $Z$, with privacy of $\theta$ against the databases.

The achievable rate matches the public-cache case for uniformly random caches: $C_{\text{cache-PIR-unknown}}(N, K, \gamma) = C_{\text{cache-PIR}}(N, K, \gamma)$. This is because the user can encode the cache realization implicitly into the query structure.

For non-uniform / structured caches (e.g., Zipf-distributed popularity), the unknown-cache variant may incur a rate cost. The exact gap is open.

Common Mistake: Cache Fraction Counted in Different Units

Mistake:

Confusing the cache fraction $\gamma$ (fraction of bits in the cache) with the side-info parameter $M$ (number of complete files in the cache).

Correction:

These are different parameterizations of the same idea. In §15.1 (side info), the user has $M$ complete files: $|Z| = M \cdot L$, so $\gamma_{\text{equiv}} = M / K$. In §15.2 (cache), each bit is independently cached with probability $\gamma$: $|Z| \approx \gamma \cdot K \cdot L$ on average, but no file is stored in full. The capacity formulas differ! Side info gives $C_{\text{PIR}}(N, K - M)$ (an effective-$K$ reduction); cache-aided gives a binomial mixture. They coincide only at the extremes ($\gamma = 0$ or $\gamma = 1$; $M = 0$ or $M = K - 1$).
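The gap is easy to see numerically. A sketch comparing the two formulas at $M = K - 1 = 9$ complete files versus random bits with the same byte budget, $\gamma = M/K = 0.9$ (helper names are ours):

```python
from math import comb

def c_pir(N: int, k: int) -> float:
    """Sun-Jafar PIR capacity for k files and N databases."""
    return 1.0 / sum((1.0 / N) ** j for j in range(k))

def c_side_info(N: int, K: int, M: int) -> float:
    """Side-info PIR (Sec. 15.1): M complete files -> effective library K - M."""
    return c_pir(N, K - M)

def c_cache(N: int, K: int, gamma: float) -> float:
    """Cache-aided PIR (this section): binomial mixture over cached-bit counts."""
    return sum(
        comb(K - 1, i) * gamma**i * (1.0 - gamma) ** (K - 1 - i) * c_pir(N, K - i)
        for i in range(K)
    )

N, K, M = 4, 10, 9
print(c_side_info(N, K, M))   # M = K - 1 full files: rate 1.0
print(c_cache(N, K, M / K))   # same bit budget as random bits: ~0.868
```

With complete files the ninth-file query is essentially free (rate $1$), while the same number of randomly scattered bits still leaves every file partially unknown, so the mixture rate stays near $0.87$.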

Bridge to Coded Caching (Book CC)

Cache-aided PIR borrows the placement-then-delivery structure from coded caching (Maddah-Ali & Niesen 2014; see Book CC). The key difference: in coded caching, the user's demand is public; in cache-aided PIR, the demand is private.

Coded caching uses a clever placement to enable coded delivery (broadcasting an XOR of multiple users' missing pieces). Cache-aided PIR borrows this idea but adds the privacy constraint, which complicates the delivery design.

Section §15.3 explores the natural extension: cached coded delivery with multiple users and demand privacy. This is the CommIT group's contribution (Wan-Tuninetti-Caire 2021).

⚠️ Engineering Note

Deploying Cache-Aided PIR

Practical guidelines for cache-aided PIR:

  • Cache placement strategy: uniformly random caching achieves the rate $C_{\text{cache-PIR}}(N, K, \gamma)$. Popularity-driven (Zipf) caching may improve the average-case rate but loses worst-case privacy guarantees.

  • Cache size budget: typical browser caches have $\gamma \sim 0.01$ to $0.1$; edge caches $\gamma \sim 0.5$; pinned caches $\gamma \sim 0.9$ and above. The delivery rate improves correspondingly, though appreciably only at large $\gamma$.

  • Privacy of cache content: the user's cache content is private from the databases — this is automatic if the placement was done long ago. Live cache updates require care to prevent leakage via update timing.

  • Cache hit handling: pre-check the cache before issuing a PIR query. Cache hit → return locally (no PIR). Cache miss → run PIR with the remaining cache as side info.

  • Composition with coded caching: see §15.3 for the multi-user case with coded delivery + privacy.
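The hit/miss handling above can be sketched as follows (all names here are hypothetical; `run_pir_query` stands in for whatever PIR backend is deployed):

```python
from dataclasses import dataclass, field

@dataclass
class Cache:
    """User-side cache: some files held in full, the rest as scattered bits."""
    full_files: set = field(default_factory=set)      # indices of fully cached files
    partial_bits: dict = field(default_factory=dict)  # file index -> cached bit positions

def retrieve(theta: int, cache: Cache, run_pir_query):
    """Serve a request: answer locally on a cache hit, else run PIR with side info."""
    if theta in cache.full_files:
        return "local"  # cache hit: no query issued, nothing leaks to the databases
    # cache miss: the partially cached bits act as side information for the PIR scheme
    return run_pir_query(theta, side_info=cache.partial_bits)

# usage with a stub PIR backend (a real deployment would call the actual protocol)
pir_stub = lambda theta, side_info: "pir"
cache = Cache(full_files={2}, partial_bits={5: {0, 7}})
print(retrieve(2, cache, pir_stub))  # -> local
print(retrieve(5, cache, pir_stub))  # -> pir
```

Note that serving hits locally and misses via PIR means the databases never observe hits at all, so the PIR privacy guarantee applies only to the queries that are actually issued.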

Practical Constraints
  • Cache fraction: $\gamma \in [0, 1]$, typically $0.01$ to $0.9$

  • Uniform-random placement: achieves capacity formula

  • Cache hit: skip PIR, respond locally

  • Multi-user + privacy: see §15.3 (CommIT)

📋 Ref: Wei-Banawan-Ulukus 2019; CDN best practices

Key Takeaway

Cache-aided PIR uses user-side caches to reduce the delivery-phase download. The capacity for uniformly random uncoded prefetching is a $\gamma$-binomial mixture of effective-$K$ rates: $C(N, K, \gamma) = \mathbb{E}[C_{\text{PIR}}(N, K - I)]$ with $I \sim \mathrm{Bin}(K-1, \gamma)$. The rate is monotone in $\gamma$ and reaches $1$ at $\gamma = 1$. Multi-user extensions (Section §15.3) connect to coded caching and the CommIT group's demand-privacy work.

Quick Check

For cache-aided PIR with $N = 4$, $K = 10$, $\gamma = 1/2$, the delivery rate is:

$0.75$ exactly (the same as the $\gamma = 0$ value)

$\approx 0.753$, slightly above the $\gamma = 0$ value

$1.0$ (the cache covers everything)

Cannot be computed without specifying the cache content.