Cache-Aided PIR

From Side Information to Caches

Section §15.1 considered side info as a set of complete files. In practice, the user's cache often contains partial file content — a generic cache is some function $Z = \phi(W_1, \ldots, W_K)$ of the library, not necessarily full files.

Cache-aided PIR generalizes side-info PIR to arbitrary cache contents. The cache is populated in a placement phase before the user knows $\theta$; during delivery, the PIR protocol exploits whatever the cache contains.

The general capacity is open. We focus on two important regimes: (i) uncoded caching (the cache stores raw file segments), and (ii) MAN-style (Maddah-Ali–Niesen) coded caching, where multiple users share the placement design (relevant for §15.3, the CommIT contribution).

Definition:

Cache-Aided PIR Protocol

A cache-aided PIR protocol consists of:

  1. Placement phase (offline, before the user knows $\theta$): the databases or the system generate a cache content $Z$ of size $|Z| = \gamma \cdot K \cdot L$ bits, where $\gamma \in [0, 1]$ is the cache fraction. The cache is delivered to the user.

  2. Delivery phase (online, after the user picks $\theta$): the user sends queries to the databases; the databases respond with answers. The user combines the cache and the answers to decode $W_\theta$.

Privacy: $I(\theta; Q^{(\theta, n)}) = 0$ for each database $n$.

Cache-aided PIR rate (delivery rate): $R(\gamma) = \frac{L}{D}$, where $D$ is the delivery-phase download. Note: this excludes the placement-phase transmission, which is amortized across queries.

Theorem: Cache-Aided PIR with Uncoded Prefetching (Wei–Banawan–Ulukus 2019)

For PIR with $K$ files, $N$ databases, and a user cache $Z$ of size $\gamma K L$ bits populated via uniformly random uncoded prefetching (each bit independently cached with probability $\gamma$), the delivery-phase PIR capacity is
$$C_{\text{cache-PIR}}(N, K, \gamma) \;=\; \sum_{i=0}^{K-1} \binom{K-1}{i} \gamma^i (1-\gamma)^{K-1-i} \, C_{\text{PIR}}(N, K - i),$$
where $C_{\text{PIR}}(N, k) = \left(1 + 1/N + \cdots + 1/N^{k-1}\right)^{-1}$.

Operational interpretation: the capacity is a $\gamma$-binomial mixture of the effective-$K$ rates $C_{\text{PIR}}(N, K - i)$, weighted by the probability that, at a given bit position, exactly $i$ of the $K - 1$ other files are cached.

Example: Cache-Aided PIR at $N = 4$, $K = 10$

Compute the cache-aided PIR rate for $\gamma \in \{0, 0.25, 0.5, 0.75, 1\}$.
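The requested values follow directly from the theorem's formula. A minimal sketch in Python (the function names are ours, not from the paper):

```python
from math import comb

def c_pir(N: int, k: int) -> float:
    """Classical Sun-Jafar PIR capacity for k files and N databases."""
    return 1.0 / sum((1.0 / N) ** j for j in range(k))

def c_cache_pir(N: int, K: int, gamma: float) -> float:
    """Delivery-phase rate: binomial mixture of effective-K capacities."""
    return sum(
        comb(K - 1, i) * gamma**i * (1.0 - gamma) ** (K - 1 - i) * c_pir(N, K - i)
        for i in range(K)
    )

for gamma in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"gamma = {gamma:4.2f}  ->  rate = {c_cache_pir(4, 10, gamma):.4f}")
```

At these parameters the rates come out to approximately $0.7500$, $0.7501$, $0.7529$, $0.7844$, and $1.0000$: the gain over classical PIR is negligible for small $\gamma$ and concentrates near $\gamma = 1$.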

Cache-Aided PIR: Cache Size vs. Rate

Plot the cache-aided PIR rate $C_{\text{cache-PIR}}(N, K, \gamma)$ as a function of the cache fraction $\gamma$ for fixed $N$ and $K$. The curve is monotone increasing in $\gamma$, starting from the classical Sun-Jafar value at $\gamma = 0$ and reaching $1$ at $\gamma = 1$. The convexity reflects the binomial structure: the curve stays nearly flat for small $\gamma$ and rises sharply as $\gamma \to 1$.


Cache-Aided vs. Classical PIR — Operational Cases

| Setting | Cache Fraction $\gamma$ | PIR Rate | Use Case |
| --- | --- | --- | --- |
| Cold start | $0$ | $0.750$ (classical) | First retrieval, no cache |
| Browser cache | $\sim 0.1$ | $\sim 0.750$ | Negligible improvement |
| CDN edge cache | $\sim 0.5$ | $\sim 0.753$ | Slight improvement |
| Pinned/dedicated cache | $\sim 0.9$ | $\sim 0.87$ | Noticeable improvement |
| Cache hit | $1$ (file in cache) | $1$ (no PIR query needed) | Local response only |

(Rates evaluated from the theorem's formula at $N = 4$, $K = 10$; the improvement is concentrated near $\gamma = 1$.)

Definition:

Cache-Aided PIR with Unknown Caches

In this variant, the databases do not know the user's cache content $Z$ (or the placement strategy). The user must issue PIR-style queries that work for any realization of $Z$, with privacy of $\theta$ against the databases.

The achievable rate matches the public-cache case for uniformly random caches: $C_{\text{cache-PIR-unknown}}(N, K, \gamma) = C_{\text{cache-PIR}}(N, K, \gamma)$. This is because the user can encode the cache realization implicitly into the query structure.

For non-uniform / structured caches (e.g., Zipf-distributed popularity), the unknown-cache variant may incur a rate cost. The exact gap is open.

Common Mistake: Cache Fraction Counted in Different Units

Mistake:

Confusing the cache fraction $\gamma$ (fraction of bits in the cache) with the side-info parameter $M$ (number of complete files in the cache).

Correction:

These are different parameterizations of the same idea. In §15.1 (side info), the user has $M$ complete files: $|Z| = M \cdot L$, so $\gamma_{\text{equiv}} = M / K$. In §15.2 (cache), each bit is independently cached with probability $\gamma$: $|Z| \approx \gamma \cdot K \cdot L$ on average, but no file is stored in full. The capacity formulas differ! Side info gives $C_{\text{PIR}}(N, K - M)$ (an effective-$K$ reduction); cache-aided gives a binomial mixture. They coincide only at the extremes ($\gamma = 0$ or $\gamma = 1$; $M = 0$ or $M = K - 1$).
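The gap is easy to see numerically. A sketch comparing the two formulas at $M = K - 1 = 9$ complete files versus random bits with the same byte budget, $\gamma = M/K = 0.9$ (helper names are ours):

```python
from math import comb

def c_pir(N: int, k: int) -> float:
    """Sun-Jafar PIR capacity for k files and N databases."""
    return 1.0 / sum((1.0 / N) ** j for j in range(k))

def c_side_info(N: int, K: int, M: int) -> float:
    """Side-info PIR (Sec. 15.1): M complete files -> effective library K - M."""
    return c_pir(N, K - M)

def c_cache(N: int, K: int, gamma: float) -> float:
    """Cache-aided PIR (this section): binomial mixture over cached-bit counts."""
    return sum(
        comb(K - 1, i) * gamma**i * (1.0 - gamma) ** (K - 1 - i) * c_pir(N, K - i)
        for i in range(K)
    )

N, K, M = 4, 10, 9
print(c_side_info(N, K, M))   # M = K - 1 full files: rate 1.0
print(c_cache(N, K, M / K))   # same bit budget as random bits: ~0.868
```

With complete files the ninth-file query is essentially free (rate $1$), while the same number of randomly scattered bits still leaves every file partially unknown, so the mixture rate stays near $0.87$.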

Bridge to Coded Caching (Book CC)

Cache-aided PIR borrows the placement-then-delivery structure from coded caching (Maddah-Ali & Niesen 2014; see Book CC). The key difference: in coded caching, the user's demand is public; in cache-aided PIR, the demand is private.

Coded caching uses a clever placement to enable coded delivery (broadcasting an XOR of multiple users' missing pieces). Cache-aided PIR borrows this idea but adds the privacy constraint, which complicates the delivery design.

Section §15.3 explores the natural extension: cached coded delivery with multiple users and demand privacy. This is the CommIT group's contribution (Wan-Tuninetti-Caire 2021).

⚠️ Engineering Note

Deploying Cache-Aided PIR

Practical guidelines for cache-aided PIR:

  • Cache placement strategy: uniformly random caching achieves the rate $C_{\text{cache-PIR}}(N, K, \gamma)$. Popularity-driven (Zipf) caching may improve the average-case rate but loses worst-case privacy guarantees.

  • Cache size budget: typical browser caches have $\gamma \sim 0.01$ to $0.1$; edge caches $\gamma \sim 0.5$; pinned caches $\gamma \sim 0.9$ and above. The delivery rate improves correspondingly, though appreciably only at large $\gamma$.

  • Privacy of cache content: the user's cache content is private from the databases — this is automatic if the placement was done long ago. Live cache updates require care to prevent leakage via update timing.

  • Cache hit handling: pre-check the cache before issuing a PIR query. Cache hit → return locally (no PIR). Cache miss → run PIR with the remaining cache as side info.

  • Composition with coded caching: see §15.3 for the multi-user case with coded delivery + privacy.
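The hit/miss handling above can be sketched as follows (all names here are hypothetical; `run_pir_query` stands in for whatever PIR backend is deployed):

```python
from dataclasses import dataclass, field

@dataclass
class Cache:
    """User-side cache: some files held in full, the rest as scattered bits."""
    full_files: set = field(default_factory=set)      # indices of fully cached files
    partial_bits: dict = field(default_factory=dict)  # file index -> cached bit positions

def retrieve(theta: int, cache: Cache, run_pir_query):
    """Serve a request: answer locally on a cache hit, else run PIR with side info."""
    if theta in cache.full_files:
        return "local"  # cache hit: no query issued, nothing leaks to the databases
    # cache miss: the partially cached bits act as side information for the PIR scheme
    return run_pir_query(theta, side_info=cache.partial_bits)

# usage with a stub PIR backend (a real deployment would call the actual protocol)
pir_stub = lambda theta, side_info: "pir"
cache = Cache(full_files={2}, partial_bits={5: {0, 7}})
print(retrieve(2, cache, pir_stub))  # -> local
print(retrieve(5, cache, pir_stub))  # -> pir
```

Note that serving hits locally and misses via PIR means the databases never observe hits at all, so the PIR privacy guarantee applies only to the queries that are actually issued.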

Practical Constraints
  • Cache fraction: $\gamma \in [0, 1]$, typically $0.01$ to $0.9$

  • Uniform-random placement: achieves capacity formula

  • Cache hit: skip PIR, respond locally

  • Multi-user + privacy: see §15.3 (CommIT)

📋 Ref: Wei-Banawan-Ulukus 2019; CDN best practices

Key Takeaway

Cache-aided PIR uses user-side caches to reduce the delivery-phase download. The capacity for uniformly random uncoded prefetching is a $\gamma$-binomial mixture of effective-$K$ rates: $C(N, K, \gamma) = \mathbb{E}[C_{\text{PIR}}(N, K - I)]$ with $I \sim \mathrm{Bin}(K-1, \gamma)$. The rate is monotone in $\gamma$ and reaches $1$ at $\gamma = 1$. Multi-user extensions (Section §15.3) connect to coded caching and the CommIT group's demand-privacy work.

Quick Check

For cache-aided PIR with $N = 4$, $K = 10$, $\gamma = 1/2$, the delivery rate is:

$0.75$ exactly (the same as the $\gamma = 0$ value)

$\approx 0.753$, slightly above the $\gamma = 0$ value

$1.0$ (the cache covers everything)

Cannot be computed without specifying the cache content.