Symmetric PIR (SPIR)

Two-Sided Privacy

Classical PIR (Chapter 13, §14.1, §14.2) provides one-sided privacy: the databases learn nothing about θ\theta, but the user is free to learn anything they can compute from the answers — in particular, the user might learn linear combinations of files WjW_j for jθj \neq \theta.

In settings like medical research or inter-organizational data exchange, the databases also need privacy: the user should learn WθW_\theta and only WθW_\theta. Anything else (other files, derived information) must remain hidden.

This is symmetric PIR (SPIR). The additional database-privacy requirement fundamentally changes the achievable rate: CSPIR(N,K)=11/NC_{\text{SPIR}}(N, K) = 1 - 1/N for all K2K \geq 2 (Sun-Jafar 2018c, with earlier bounds by Gertner et al. 1998).

,

Definition:

Symmetric PIR (SPIR)

A PIR scheme is symmetric (an SPIR scheme) if it satisfies both:

  1. User privacy (as in classical PIR): I ⁣(θ;Q(θ,n))  =  0n[N].I\!\left(\theta;\, Q^{(\theta, n)}\right) \;=\; 0 \quad \forall n \in [N].

  2. Database privacy: I ⁣({Wk}kθ;A(θ,1),,A(θ,N),Q(θ,1),,Q(θ,N))  =  0.I\!\left(\{W_k\}_{k \neq \theta};\, A^{(\theta, 1)}, \ldots, A^{(\theta, N)}, Q^{(\theta, 1)}, \ldots, Q^{(\theta, N)}\right) \;=\; 0. That is, the user's combined view (queries + answers) reveals nothing about the files WkW_k for kθk \neq \theta.

SPIR additionally requires shared randomness: a common-randomness symbol SS is shared across all NN databases (but unknown to the user). This randomness is essential for hiding the other files; without it, SPIR is impossible.

,

Theorem: SPIR Capacity (Sun–Jafar 2018c)

For symmetric PIR with KK files replicated across NN databases, N2N \geq 2, K2K \geq 2, with shared randomness across databases: CSPIR(N,K)  =  11N.C_{\text{SPIR}}(N, K) \;=\; 1 - \frac{1}{N}. The capacity is independent of KK (unlike classical PIR, which depends on KK).

Example: SPIR vs. Classical at N=5N = 5

Compare CPIR(5,K)C_{\text{PIR}}(5, K) with CSPIR(5,K)C_{\text{SPIR}}(5, K) for K=2,5,10,50K = 2, 5, 10, 50.

SPIR Capacity vs. Classical PIR

Plot CPIR(N,K)C_{\text{PIR}}(N, K) and CSPIR(N,K)=11/NC_{\text{SPIR}}(N, K) = 1 - 1/N on the same axes as a function of KK for fixed NN. The classical curve starts above the SPIR floor and asymptotes to it as KK \to \infty. The gap quantifies the cost of two-sided privacy in small-library settings.

Parameters
5
20

Theorem: SPIR Requires Common Randomness

SPIR with any positive rate requires the databases to share a non-trivial common randomness SS: H(S)    1N1LH(S) \;\geq\; \frac{1}{N - 1} \cdot L bits per file. Without shared randomness (H(S)=0H(S) = 0), the only achievable rate is R=0R = 0.

Classical PIR vs. SPIR — Operational Comparison

PropertyClassical PIRSPIR
User privacyYesYes
Database privacyNo (user learns derived info on other files)Yes (user learns only WθW_\theta)
Common randomness neededNoYes (H(S)L/(N1)H(S) \geq L/(N-1))
Capacity(1+1/N++1/NK1)1(1 + 1/N + \cdots + 1/N^{K-1})^{-1}11/N1 - 1/N
KK-dependenceYesNo
Operational useRead-only retrievalGenomic, medical, financial PIR
⚠️Engineering Note

Deploying SPIR

Practical guidelines for SPIR deployments:

  • Common randomness setup: requires a pre-distributed shared secret across databases. This is typically the largest operational hurdle. Methods: trusted setup phase, secret-sharing of a master key, tamper-resistant HSM.

  • Rate cost: CSPIR/CPIR=(N1)/N(1+1/N+)1C_{\text{SPIR}}/C_{\text{PIR}} = (N-1)/N \cdot (1 + 1/N + \cdots) \approx 1 for large KK. The SPIR cost is concentrated in small-library settings.

  • Use cases:

    • Genomic research (researcher learns one record; database hides everything else)
    • Encrypted DNS resolution (user learns one record; resolver hides everything else)
    • Medical record retrieval (compliance with HIPAA-style minimization)
  • Anti-use cases: SPIR is overkill when the user is trusted not to abuse derived information. Classical PIR with proper access controls suffices.

Practical Constraints
  • Shared randomness: H(S)L/(N1)H(S) \geq L/(N-1)

  • Trusted setup or HSM-distributed master key

  • Use cases: genomic, medical, financial

  • Anti-use: trusted-user contexts (use classical PIR)

📋 Ref: Sun-Jafar 2018; Wang-Skoglund 2019

Common Mistake: SPIR Doesn't Add Authentication

Mistake:

Treat SPIR as covering all "two-sided" properties — including the user authenticating that the database is honest.

Correction:

SPIR is about information theoretic privacy of the databases against the user — it bounds what the user learns, not what the user receives. The databases could still send malformed or incorrect answers (and SPIR provides no guarantee against this). For database authentication, use cryptographic signatures on the answers; for honest-user constraints, use access controls. SPIR + auth + access is the proper stack.

Key Takeaway

SPIR adds two-sided privacy with capacity 11/N1 - 1/N (independent of KK). Compared with classical PIR, the rate cost is small for large KK and notable for small KK. SPIR requires shared randomness across databases — typically via trusted setup or HSM-distributed keys. Use SPIR when both sides need privacy guarantees; use classical PIR when only the user needs privacy.

Historical Note: Origin of SPIR — Gertner et al.

Symmetric PIR was introduced by Gertner, Ishai, Kushilevitz, and Malkin in 1998 — motivated by a database-privacy goal that classical PIR did not address. Their original formulation considered SPIR with a constant number of databases and proved a Θ(1)\Theta(1) rate is achievable. The exact capacity 11/N1 - 1/N was settled by Sun and Jafar in 2018 — completing the SPIR analog of their classical PIR result. Wang and Skoglund (2019) extended SPIR to MDS-coded storage. The historical arc parallels classical PIR (Chor et al. 1995 → Sun-Jafar 2017): a 20-year gap between problem definition and capacity characterization.

Quick Check

For SPIR with N=4N = 4 databases and K=100K = 100 files, the capacity is:

0.999\approx 0.999 (close to 11)

0.750.75 (regardless of KK)

0.01\approx 0.01 (close to 1/K1/K)

Same as classical PIR