Ferkans — Interactive Telecom Tutor

Composing PIR Extensions

Sections §14.1–§14.3 each modified one classical assumption: storage (§14.1: MDS-coded), privacy threat (§14.2: $T$-colluding), or symmetry (§14.3: SPIR). Real systems often impose combinations of these constraints, for example MDS-coded storage with $T$-colluding privacy.

Composition is not free. The joint rate is typically strictly lower than either individual capacity. Many joint cases remain open in the information-theoretic sense: only achievable rates and outer bounds are known, not the capacity itself.

Theorem: Coded-Storage $T$ -Colluding PIR (Freij-Hollanti et al.)

For PIR with $K$ files stored across $N$ databases via an $(N, r)$-MDS code, with privacy required against any $T$ colluding databases (with $r + T - 1 < N$), an achievable PIR rate is $R_{\text{PIR-MDS-T}}(N, K, r, T) \;=\; \frac{N - r - T + 1}{N} \cdot \left(1 - \left(\frac{r + T - 1}{N}\right)^K\right)^{-1}.$ Special cases:

$T = 1$: recovers Tajeddine–El Rouayheb (§14.1)
$r = 1$: recovers Sun-Jafar $T$-colluding (§14.2)
$r = T = 1$: recovers classical Sun-Jafar

The general capacity $C_{\text{PIR-MDS-T}}$ is open (only the achievable rate and an outer bound are known).
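The special cases can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `joint_rate` and `geometric_capacity` are illustrative, and exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction

def joint_rate(N, K, r, T):
    """Achievable rate (1 - b) / (1 - b**K) with base b = (r + T - 1) / N."""
    if r + T - 1 >= N:
        raise ValueError("infeasible: need r + T - 1 < N")
    b = Fraction(r + T - 1, N)
    return (1 - b) / (1 - b**K)

def geometric_capacity(N, K, m):
    """Geometric-series form (1 + m/N + ... + (m/N)**(K-1))**(-1)."""
    return 1 / sum(Fraction(m, N)**k for k in range(K))

N, K = 8, 5
# T = 1: coded-storage form with base r/N
assert joint_rate(N, K, r=3, T=1) == geometric_capacity(N, K, 3)
# r = 1: T-colluding form with base T/N
assert joint_rate(N, K, r=1, T=3) == geometric_capacity(N, K, 3)
# r = T = 1: classical form with base 1/N
assert joint_rate(N, K, r=1, T=1) == geometric_capacity(N, K, 1)

print(float(joint_rate(N, K, r=2, T=2)))
```

The two forms agree because $(1 - b)(1 + b + \cdots + b^{K-1}) = 1 - b^K$ for any base $b$.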

Proof

Achievability

The Freij-Hollanti scheme combines MDS sharing (across files) with Shamir $T$ -sharing (across queries). The combined scheme requires the storage and privacy parameters to satisfy $r + T - 1 < N$ for feasibility.

Outer bound

The cut-set converse extends naturally, with $r + T - 1$ replacing the $1$ in the numerator of the geometric-series base $1/N$. The outer bound matches the achievable rate in the asymptotic limit $K \to \infty$ (both equal $1 - (r + T - 1)/N$) but does not match for finite $K$ in general.

Capacity gap

For finite $K$ , the gap between the best known achievable rate and the best known outer bound is small but non-zero. The capacity has only been established in special cases (e.g., $K = 2$ ). Open problem: close the gap for general $K$ .
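To see how quickly the achievable rate settles onto its asymptotic value, the sketch below tabulates it for growing $K$. It covers only the achievable side: the outer bound is not stated in closed form in this section, so the code does not attempt to compute the gap itself. The function name `joint_rate` and the parameter choice $N = 8$, $r = T = 2$ are illustrative.

```python
def joint_rate(N, K, r, T):
    """Achievable rate (1 - b) / (1 - b**K) with b = (r + T - 1) / N."""
    b = (r + T - 1) / N
    return (1 - b) / (1 - b**K)

N, r, T = 8, 2, 2
limit = 1 - (r + T - 1) / N  # asymptotic rate as K grows

for K in (2, 3, 5, 10, 20):
    R = joint_rate(N, K, r, T)
    # The excess over the limit shrinks geometrically in K.
    print(f"K={K:2d}  R={R:.6f}  excess over limit={R - limit:.2e}")
```

The excess equals $(1 - b)\,b^K / (1 - b^K)$ with $b = (r + T - 1)/N$, so it decays like $b^K$.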

Example: Joint Capacity at $N = 8, K = 5, r = 2, T = 2$

Compute the achievable rate and compare with the individual capacities (coded-storage alone, $T$ -colluding alone).

Solution

Joint rate (Freij-Hollanti)

$r + T - 1 = 3$ , so the base is $3/8$ . $R = (8 - 3)/8 \cdot (1 - (3/8)^5)^{-1} = 0.625 \cdot (1 - 0.0074)^{-1} = 0.625 \cdot 1.0075 \approx 0.630$ .

Coded-storage alone ($r = 2, T = 1$)

Base $r/N = 2/8 = 1/4$ . $C_{\text{PIR-MDS}}(8, 5, 2) = (1 + 0.25 + 0.0625 + 0.0156 + 0.0039)^{-1} \approx 0.751$ .

$T$-colluding alone ($T = 2, r = 1$)

$C_{\text{PIR}}(8, 5, 2) = (1 + 2/8 + 4/64 + 8/512 + 16/4096)^{-1} = (1 + 0.25 + 0.0625 + 0.0156 + 0.0039)^{-1} \approx 0.751$ . (Coincides with above by symmetry.)

Joint vs. individual

The joint achievable rate $\approx 0.630$ is strictly less than the lower of the two individual capacities ( $\approx 0.751$ ). Composition costs rate.

Operational

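Relative to the classical rate at the same $N$ and $K$ ($\approx 0.875$), doubling either constraint alone (storage $r$ or collusion $T$, from 1 to 2) costs $\sim 14\%$ of the rate. Combining both costs $\sim 28\%$: slightly less than the sum of the two individual losses, and not dramatic.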
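The arithmetic in this example can be reproduced with a few lines. This is a sketch under the formulas of this section; the helper name `rate` and the choice to measure loss relative to the classical rate at the same $N$ and $K$ are assumptions made for illustration.

```python
def rate(N, K, s):
    """Geometric-series rate with base s/N, where s = r + T - 1."""
    b = s / N
    return (1 - b) / (1 - b**K)

N, K = 8, 5
classical = rate(N, K, 1)   # r = T = 1
coded     = rate(N, K, 2)   # r = 2, T = 1
colluding = rate(N, K, 2)   # r = 1, T = 2 (same base as above)
joint     = rate(N, K, 3)   # r = 2, T = 2

for name, R in [("classical", classical), ("coded", coded),
                ("colluding", colluding), ("joint", joint)]:
    # Loss is expressed as a fraction of the classical rate.
    print(f"{name:10s} R={R:.4f}  loss vs classical={1 - R / classical:.1%}")
```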

Theorem: SPIR with MDS-Coded Storage (Wang–Skoglund)

For symmetric PIR with $K$ files stored across $N$ databases via an $(N, r)$-MDS code, with shared randomness among the databases, the capacity is $C_{\text{SPIR-MDS}}(N, K, r) \;=\; 1 - \frac{r}{N}.$ It is independent of $K$ (as classical SPIR is) and independent of the storage code beyond its parameter $r$.

Proof

Why $K$-independent

Same reason as classical SPIR: the database-privacy requirement decouples the rate from the number of files. The $r$ parameter modifies the achievability bound: instead of one database canceling the mask (rate $1 - 1/N$ ), $r$ databases canceling masks gives rate $1 - r/N$ .

Comparison with classical SPIR

Classical SPIR: $1 - 1/N$ . SPIR with $(N, r)$ -MDS: $1 - r/N$ . The $r$ modifier is the same as the classical-PIR-vs-MDS gap. SPIR with replicated storage ( $r = 1$ ) recovers the unmodified $1 - 1/N$ .
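A short numerical comparison makes the $K$-independence concrete. The sketch below uses only the two formulas quoted in this chapter (the SPIR-MDS capacity $1 - r/N$ and the coded-storage geometric series); the function names are illustrative. It shows the non-symmetric coded-storage capacity decreasing toward the SPIR-MDS value as $K$ grows.

```python
from fractions import Fraction

def spir_mds_capacity(N, r):
    """SPIR capacity with (N, r)-MDS storage: 1 - r/N, for any number of files."""
    if not 1 <= r < N:
        raise ValueError("need 1 <= r < N")
    return 1 - Fraction(r, N)

def pir_mds_capacity(N, K, r):
    """Coded-storage PIR capacity (1 + r/N + ... + (r/N)**(K-1))**(-1)."""
    return 1 / sum(Fraction(r, N)**k for k in range(K))

N, r = 8, 2
print("SPIR-MDS:", float(spir_mds_capacity(N, r)))
for K in (2, 5, 10, 20):
    print(f"PIR-MDS, K={K:2d}:", float(pir_mds_capacity(N, K, r)))
```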

Joint PIR Extensions — Capacity Status

PIR Variant	Capacity Formula	Status	Reference
Classical (Sun-Jafar)	$(1 + 1/N + \cdots + 1/N^{K-1})^{-1}$	Settled	Sun-Jafar 2017
Coded-storage	$(1 + r/N + \cdots + r^{K-1}/N^{K-1})^{-1}$	Settled	Tajeddine 2018
$T$ -colluding	$(1 + T/N + \cdots + T^{K-1}/N^{K-1})^{-1}$	Settled	Sun-Jafar 2018
SPIR	$1 - 1/N$	Settled	Sun-Jafar 2018c
SPIR + Coded-storage	$1 - r/N$	Settled	Wang-Skoglund 2019
Coded-storage + $T$ -colluding	Achievable: see §14.4	OPEN for general $K$	Freij-Hollanti 2018
SPIR + $T$ -colluding	Achievable rate known	OPEN	Wang-Banawan-Ulukus 2018
Cache-aided + $T$ -colluding	Achievable rate known	OPEN	Wei et al. 2019

Open Problems in Multi-Constraint PIR

The composition of constraints in PIR remains a rich frontier. Notable open problems:

Capacity of coded-storage $T$ -colluding PIR: the Freij-Hollanti scheme is achievable, but the matching converse remains open for general $K$ . Reduces to known cases for $K = 2$ or $T = 1$ or $r = 1$ .
Capacity of SPIR + $T$-colluding: an outer bound exists but the precise capacity is unknown. Heuristically, capacity should approach $1 - (r + T - 1)/N$ (that is, $1 - T/N$ for replicated storage, $r = 1$), but tight bounds are missing.
Capacity of cache-aided PIR: the WB-Ulukus bound is not tight in general. The cache-aided + $T$ -colluding case is even less well-understood. Chapter 15 returns to this.
Latency-rate trade-off: classical PIR analyzes only the one-shot rate. The latency-amortized rate (over multiple sequential queries) has different characteristics that are not yet fully understood.
Adaptive PIR: when the user can adapt queries based on previous answers, can the rate be improved? The open question is whether sequential PIR has the same capacity as non-adaptive PIR.

The CommIT group's contribution to this area (Wan-Tuninetti-Caire 2021, Chapter 15) addresses cache-aided PIR with demand privacy — pushing one of these frontiers forward.


Joint PIR Trade-offs: Storage, Privacy, Rate

Visualize the joint trade-off between MDS dimension $r$ (storage), collusion threshold $T$ (privacy), and PIR rate $R$. The plot uses the Freij-Hollanti achievable rate formula. Each feasible point in the $(r, T)$ plane corresponds to a PIR configuration; the height shows the PIR rate. The infeasible region $r + T - 1 \geq N$ is shaded.

Parameters: $N$ (databases) = 10; $K$ (files) = 5
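For readers without the interactive plot, the same surface can be tabulated directly. This is a minimal sketch using the achievable-rate formula of this section with the parameter values shown above ($N = 10$, $K = 5$); the names `joint_rate` and `rate_grid` are illustrative, and infeasible cells are printed as a dash.

```python
def joint_rate(N, K, r, T):
    """Achievable rate from the theorem, or None when r + T - 1 >= N."""
    s = r + T - 1
    if s >= N:
        return None
    b = s / N
    return (1 - b) / (1 - b**K)

def rate_grid(N, K):
    """Map every (r, T) with 1 <= r, T <= N - 1 to its rate (None if infeasible)."""
    return {(r, T): joint_rate(N, K, r, T)
            for r in range(1, N) for T in range(1, N)}

N, K = 10, 5
grid = rate_grid(N, K)

# Rows are indexed by r, columns by T.
print("r\\T" + "".join(f"{T:>7d}" for T in range(1, N)))
for r in range(1, N):
    row = "".join("      -" if grid[(r, T)] is None else f"{grid[(r, T)]:7.3f}"
                  for T in range(1, N))
    print(f"{r:>3d}" + row)
```

The table is symmetric in $r$ and $T$ because the formula depends on them only through $r + T - 1$.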

⚠️ Engineering Note

Engineering Perspective on Joint Extensions

Production guidance when combining PIR extensions:

Pick one constraint as primary: cross-cloud deployments typically pick coded-storage (cost reduction) and add $T = 2$ as a safety margin. Medical deployments pick SPIR (compliance) and add coded-storage as a cost reduction.
Avoid stacking aggressively: each additional constraint costs rate. Three simultaneous constraints (coded-storage + $T$ -colluding + SPIR) is often operationally untenable — the rate would be too low.
Asymptotic rates as design targets: for large $K$ , the rate approaches $1 - (r + T - 1)/N$ . Use this as a design target for the rate vs. the privacy/storage budget.
Verifiability is orthogonal: Byzantine tolerance (Chapter 11) must be added independently. Don't conflate it with $T$ -colluding.
Latency planning: $O(K)$ rounds for most variants. For latency-sensitive applications, batch multiple PIR queries.

The bottom line: pick the minimum set of constraints needed for the actual threat model, and pay the rate cost for them. Avoid over-engineering.
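One way to apply this guidance is to enumerate the $(r, T)$ pairs that clear a rate floor. The helper below is a hypothetical sketch, not a procedure from the original text: `feasible_configs`, `min_rate`, and `min_T` are names introduced here, and the rate used is the achievable rate of this section together with its large-$K$ design target.

```python
def feasible_configs(N, K, min_rate, min_T=1):
    """List (r, T, rate, asymptotic_rate) for feasible pairs meeting min_rate."""
    configs = []
    for r in range(1, N):
        for T in range(min_T, N):
            s = r + T - 1
            if s >= N:          # feasibility requires r + T - 1 < N
                continue
            b = s / N
            R = (1 - b) / (1 - b**K)
            if R >= min_rate:
                configs.append((r, T, R, 1 - b))
    return configs

# Example: 8 databases, 5 files, tolerate at least 2 colluders, rate floor 0.6.
for r, T, R, R_inf in feasible_configs(N=8, K=5, min_rate=0.6, min_T=2):
    print(f"r={r} T={T}  rate={R:.3f}  large-K target={R_inf:.3f}")
```

Listing the options this way keeps the choice tied to the actual threat model: raise `min_T` only as far as the collusion assumption requires, then pick among the remaining values of `r`.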

Practical Constraints

• Coded + $T$-colluding: $r + T - 1 < N$ feasibility
• SPIR adds shared randomness requirement
• Three+ constraints typically infeasible at useful rates
• Byzantine tolerance: separate layer

📋 Ref: Freij-Hollanti 2018; cloud security best practices

Common Mistake: Rate Losses Don't Add

Mistake:

Compute the rate of a multi-extension PIR scheme by additively combining the rate losses from each individual extension.

Correction:

PIR rate losses do not simply add. The constraints combine inside the geometric-series base: for coded-storage + $T$-colluding, the base becomes $(r + T - 1)/N$ (not $r/N + T/N$). For SPIR + coded-storage, the rate is $1 - r/N$ (not the product $(1 - 1/N)(1 - r/N)$). Use the explicit joint rate formula (Theorem 14.4.1, etc.) for the correct value; never interpolate by hand from the individual capacities.
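The size of these hand-interpolation errors can be checked numerically. The sketch below compares the joint formula with three shortcuts at $N = 8$, $K = 5$, $r = T = 2$; the variable names are illustrative, and the "product" shortcut is taken here to mean multiplying the SPIR rate by the coded-storage asymptotic rate, which is one plausible reading of the mistake.

```python
def geometric_rate(base, K):
    """Rate (1 - base) / (1 - base**K) for a geometric-series base in (0, 1)."""
    return (1 - base) / (1 - base**K)

N, K, r, T = 8, 5, 2, 2

classical = geometric_rate(1 / N, K)
coded     = geometric_rate(r / N, K)
colluding = geometric_rate(T / N, K)

joint        = geometric_rate((r + T - 1) / N, K)   # formula of this section
wrong_base   = geometric_rate((r + T) / N, K)       # base r/N + T/N
added_losses = classical - (classical - coded) - (classical - colluding)

spir_mds     = 1 - r / N                            # SPIR + coded-storage
spir_product = (1 - 1 / N) * (1 - r / N)            # hand-multiplied shortcut

print(f"joint formula      {joint:.4f}")
print(f"wrong base         {wrong_base:.4f}")
print(f"losses added       {added_losses:.4f}")
print(f"SPIR-MDS formula   {spir_mds:.4f}")
print(f"SPIR product       {spir_product:.4f}")
```

At these parameters every shortcut underestimates the rate given by the explicit formula, by an amount that depends on which shortcut is used.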

Key Takeaway

Composition of PIR extensions is non-trivial and many cases remain open. Coded-storage + $T$ -colluding has an achievable rate (Freij-Hollanti) but the capacity is unknown for general $K$ . SPIR + coded-storage is settled (Wang-Skoglund: $1 - r/N$ ). Production deployments should pick the minimum constraint set and verify feasibility ( $r + T - 1 < N$ for joint coded-storage / $T$ -colluding).

Why This Matters: Looking Ahead: Side-Information PIR (Chapter 15)

Chapter 15 explores PIR with side information at the user, specifically cache-aided PIR. When the user has cached partial library content, the active download from databases can be reduced. The cache content can be public (known to databases) or private (hidden). The CommIT group's contribution (Wan-Tuninetti-Caire 2021) addresses cache-aided PIR with demand privacy against colluding users, a setting at the intersection of coded caching (Book CC) and PIR. Chapter 15 is the final chapter of Part IV.

Quick Check

For PIR with $N = 6$ databases, $K = 4$ files, coded storage with $r = 2$ , and $T = 2$ collusion, the feasibility condition $r + T - 1 < N$ becomes $2 + 2 - 1 = 3 < 6$ . The Freij-Hollanti achievable rate is approximately:

$\approx 0.53$

$\approx 0.83$ (same as classical Sun-Jafar)

Zero (infeasible due to $r + T - 1 \geq N$)

$\approx 0.10$