Comparisons and Privacy Extension

Three Schemes, Three Storage-Threshold Points

The polynomial code of Section 5.2 is optimal at its specific storage level, but the full (K,ΞΌ)(K, \mu)-tradeoff has other operating points that serve other system constraints. This section briefly compares the three standard schemes β€” uncoded replication, MDS-coded replication, and polynomial codes β€” and then extends polynomial codes to the TT-private setting, where we demand that no coalition of TT workers learns anything about A\mathbf{A} or B\mathbf{B}.

The TT-private extension is small from an algebraic standpoint (add TT random matrices to the encoding polynomial) but carries the full weight of the privacy-robustness-efficiency tradeoff: every unit of privacy costs one extra worker response. This is the golden thread, now instantiated in concrete code.

Three Schemes for Distributed Matrix Multiplication

SchemePer-worker storage ΞΌ\muRecovery threshold KKDecoder complexityBest for
Uncoded replication (rr-replicas of each (i,j)(i,j))1/(pq)1/(pq)pqpq (one copy per block)O(1)O(1) per blockSmall clusters, low redundancy
MDS-coded replication (row/column codes)1/min⁑(p,q)1/\min(p, q)p+qβˆ’1p + q - 1O((p+q)2)O((p+q)^2) Reed-SolomonBalanced storage / threshold
Polynomial code (this chapter)1/p+1/q1/p + 1/qpqpq (matches lower bound)O((pq)2)O((pq)^2) Lagrange, or O(pqlog⁑2pq)O(pq \log^2 pq) FFTMaximum straggler tolerance at this storage
Entangled polynomial code (MatDot)1/p+1/q1/p + 1/qp+qβˆ’1p + q - 1O(pq)O(pq) LagrangeMinimum recovery threshold β€” at higher storage cost

When Each Scheme Wins

  • Uncoded replication wins when the system has ample replicas per block and the tail-latency budget is generous. Production ML pipelines at FAANG-scale sometimes use naive replication because its implementation is trivial and the tail-latency overhead is absorbed by parallel workloads.

  • MDS-coded replication wins when recovery-threshold KK matters more than per-worker storage. For matrix-vector products (where q=1q = 1), MDS is the canonical choice because p+qβˆ’1=pp + q - 1 = p equals the polynomial-code bound.

  • Polynomial codes win when straggler tolerance matters and per-worker storage is tight. The EC2 benchmark shows 33–7Γ—7\times speedup over uncoded at the same storage level, making this the default choice in modern coded-computing pipelines.

  • Entangled polynomial codes (MatDot) achieve the minimum recovery threshold p+qβˆ’1p + q - 1 by relaxing the storage- optimality constraint. They live at a different point on the full tradeoff curve and are the right choice when the straggler pattern is extreme (e.g., 50% stragglers).

Definition:

TT-Private Polynomial Code

A TT-private polynomial code for matrix multiplication guarantees that any set of TT workers β€” even if they collude β€” learns no information (in the Shannon sense) about A\mathbf{A} or B\mathbf{B}.

The construction extends Β§5.2's polynomial code by appending TT random matrices to each encoding polynomial: pA(x)β€…β€Š=β€…β€Šβˆ‘i=0pβˆ’1Ai+1 xiβ€…β€Š+β€…β€Šβˆ‘β„“=pp+Tβˆ’1ZA,ℓ xβ„“,p_A(x) \;=\; \sum_{i=0}^{p-1} \mathbf{A}_{i+1}\, x^i \;+\; \sum_{\ell = p}^{p + T - 1} \mathbf{Z}_{A, \ell} \, x^\ell, pB(x)β€…β€Š=β€…β€Šβˆ‘j=0qβˆ’1Bj+1 xpj+Tβ€…β€Š+β€…β€Šβˆ‘β„“=pq+Tpq+2Tβˆ’1ZB,ℓ xβ„“,p_B(x) \;=\; \sum_{j=0}^{q-1} \mathbf{B}_{j+1}\, x^{pj + T} \;+\; \sum_{\ell = pq + T}^{pq + 2T - 1} \mathbf{Z}_{B, \ell} \, x^\ell, where ZA,β„“,ZB,β„“\mathbf{Z}_{A, \ell}, \mathbf{Z}_{B, \ell} are i.i.d. uniform random matrices over Fq\mathbb{F}_q. The product pC(x)=pA(x)TpB(x)p_C(x) = p_A(x)^T p_B(x) now has degree pq+2Tβˆ’1pq + 2T - 1; its coefficients include the desired Cij\mathbf{C}_{ij} plus 2T+2Tp2T + 2Tp "random" cross terms.

The random terms serve as "one-time pad" against any size-TT coalition, just like in Shamir secret sharing (Chapter 3). The cost is 2T2T extra worker responses needed to interpolate the product polynomial.

,

Theorem: TT-Private Recovery Threshold

The TT-private polynomial code with partition counts (p,q)(p, q) and NN workers over Fq\mathbb{F}_q (with qβ‰₯N+2Tq \geq N + 2T) achieves:

  1. TT-privacy. For any subset UβŠ†[N]\mathcal{U} \subseteq [N] with ∣Uβˆ£β‰€T|\mathcal{U}| \leq T, I(A,B;{A~k,B~k}k∈U)=0I(\mathbf{A}, \mathbf{B}; \{\tilde{\mathbf{A}}_k, \tilde{\mathbf{B}}_k\}_{k \in \mathcal{U}}) = 0.
  2. Correctness. Any K=pq+2TK = pq + 2T worker responses uniquely determine pCp_C and hence ATB\mathbf{A}^T \mathbf{B}.
  3. Storage. ΞΌ=(p+T)/p+(q+T)/qβ‹…(1/pq)=1/p+1/q+T/p+T/q\mu = (p + T)/p + (q + T)/q \cdot (1/pq) = 1/p + 1/q + T/p + T/q (with the TT-padding included).

The extra TT random terms in each encoding polynomial act as information-theoretic masks β€” a size-TT coalition can solve for all of them simultaneously but gains no information about the secrets. The overhead is 2T2T additional responses at the master (to interpolate the longer product polynomial). Compared to the public code's K=pqK = pq, the private version pays a linear cost in the privacy parameter.

Example: 11-Private Code for p=q=2p = q = 2, N=7N = 7

Construct a 11-private polynomial code for p=q=2p = q = 2 with N=7N = 7 workers. How many stragglers can be tolerated?

Privacy Cost: KK vs. TT in Private Polynomial Codes

For fixed (p,q)=(4,4)(p, q) = (4, 4) partitions, plot the recovery threshold K=pq+2TK = pq + 2T as a function of the privacy parameter TT. Each unit of TT costs 22 extra workers β€” a linear privacy-efficiency tradeoff. The plot also shows the minimum NN required to host the scheme (which must be at least KK).

Parameters
4
4
8

Key Takeaway

Adding TT-privacy costs 2T2T extra worker responses. The polynomial-code framework extends cleanly: the master's recovery threshold goes from pqpq to pq+2Tpq + 2T. This is the precise quantification of the privacy-efficiency tradeoff β€” each additional colluder the scheme tolerates costs two extra workers. The linearity is information- theoretic, not a design accident.

Common Mistake: Extended Field Size for Privacy

Mistake:

Apply a 11-private code over F16\mathbb{F}_{16} with N=15N = 15 workers and expect perfect privacy.

Correction:

The privacy guarantee holds information-theoretically over any field, but the recovery threshold K=pq+2TK = pq + 2T requires qβ‰₯N+2Tq \geq N + 2T distinct evaluation points. For N=15,T=1N = 15, T = 1, the field must have at least 1717 nonzero elements, i.e., qβ‰₯17q \geq 17. With F16\mathbb{F}_{16} there are only 1515 nonzero elements β€” insufficient. Use F17\mathbb{F}_{17} or an extension F25=F32\mathbb{F}_{2^5} = \mathbb{F}_{32}.

πŸŽ“CommIT Contribution(2023)

ByzSecAgg Builds on This Construction

T. Jahani-Nezhad, M. A. Maddah-Ali, G. Caire β€” IEEE Journal on Selected Areas in Information Theory

The TT-private polynomial-code extension of this section is the technical starting point for ByzSecAgg (Chapter 11), one of the CommIT group's signature contributions to this book. ByzSecAgg combines (i) ramp secret sharing (Chapter 3) for privacy-efficient sharing of long gradient vectors, (ii) a TT-private polynomial-code-like scheme for distance-based Byzantine detection, and (iii) vector commitments for integrity. The construction achieves communication complexity O(nlog⁑n+Bd)O(n \log n + B d) for nn users, dd-dimensional gradients, and BB Byzantine workers β€” a substantial improvement over the O(n2)O(n^2) of prior secure-aggregation schemes.

Readers who want the full deployment of the TT-private polynomial code should work through Chapter 11; this commit_contribution block is a forward pointer, not a full treatment.

byzsecaggpolynomial-codest-privateforward-referenceView Paper β†’
🚨Critical Engineering Note

Numerical Stability in Floating-Point Polynomial Codes

While the mathematical construction is over Fq\mathbb{F}_q, production deployments often apply the same polynomial-code machinery over the reals / floats for speed. The issue is that Vandermonde matrices at integer evaluation points are notoriously ill-conditioned in floating-point β€” condition numbers grow exponentially with KK. Recent work (Fahim et al. 2021, Das et al. 2023) has proposed Chebyshev-style evaluation points and other reconditioning tricks to stabilize the decoder. The takeaway: if you implement polynomial codes in practice, do not use Ξ±k=k\alpha_k = k; use Chebyshev roots Ξ±k=cos⁑((2kβˆ’1)Ο€/(2K))\alpha_k = \cos((2k - 1)\pi / (2K)), or stay in an exact integer field. Ignoring this costs accuracy and can silently corrupt outputs.

Practical Constraints
  • β€’

    Integer evaluation points give condition number ∼KK\sim K^K β€” unusable for K>20K > 20

  • β€’

    Chebyshev points yield condition number ∼2K\sim 2^K β€” much better but still bad for K>40K > 40

  • β€’

    Exact finite-field arithmetic avoids all conditioning issues

πŸ“‹ Ref: Fahim, Cadambe, Grover 2021 IEEE T-IT

Historical Note: From Polynomial Codes to a Whole Sub-Field

2017–present

Polynomial codes (Yu et al. 2017) gave the coded-computing community its first clean, optimal, deterministic construction. The follow-on literature is substantial: MatDot codes (Dutta-Fahim-Cadambe 2019) for minimum recovery threshold; Entangled polynomial codes (Yu-Maddah-Ali-Avestimehr 2020) for the tradeoff frontier between polynomial and MatDot; and the TT-private extensions (Chen-Tandon 2018, Yang-Jafar 2019) that power the secure-aggregation work of Part III.

A decade later, coded computing is a recognized sub-field of information theory with its own dedicated journal sections and workshops. The Li / Avestimehr 2020 survey is the standard reference; Chapter 11's ByzSecAgg result is a CommIT-group contribution extending the line.

, ,

Quick Check

A 33-private polynomial code for p=3,q=4p = 3, q = 4 matrix multiplication has what recovery threshold KK?

K=pq+2T=18K = pq + 2T = 18

K=pq+T=15K = pq + T = 15

K=pq=12K = pq = 12

K=2pq=24K = 2pq = 24