Comparisons and Privacy Extension
Three Schemes, Three Storage-Threshold Points
The polynomial code of Section 5.2 is optimal at its specific storage level, but the full storage-threshold tradeoff has other operating points that serve other system constraints. This section briefly compares the three standard schemes (uncoded replication, MDS-coded replication, and polynomial codes) and then extends polynomial codes to the T-private setting, where we demand that no coalition of T workers learns anything about the private input matrix A.
The T-private extension is small from an algebraic standpoint (append T random matrices to the encoding polynomial) but carries the full weight of the privacy-robustness-efficiency tradeoff: every unit of privacy costs n extra worker responses, one per column partition of B. This is the golden thread, now instantiated in concrete code.
Three Schemes for Distributed Matrix Multiplication
| Scheme | Per-worker storage | Recovery threshold | Decoder complexity | Best for |
|---|---|---|---|---|
| Uncoded replication (r replicas of each block) | 1/m of A, 1/n of B (one copy per block) | N - r + 1 (adversarial worst case) | None (O(1) per block) | Small clusters, low redundancy |
| MDS-coded replication (row/column codes) | 1/m of A, all of B | m (matrix-vector case) | Reed-Solomon | Balanced storage / threshold |
| Polynomial code (this chapter) | 1/m of A, 1/n of B | mn (matches lower bound) | Lagrange O(K^2), or fast O(K log^2 K) | Maximum straggler tolerance at this storage |
| Entangled polynomial code (MatDot) | 1/p of A, 1/p of B (p inner partitions) | 2p - 1 | Lagrange | Minimum recovery threshold, at higher storage cost |
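A minimal sketch comparing the schemes' recovery thresholds numerically. The formulas are the standard ones from the coded-computing literature; the replication threshold N - r + 1 is the adversarial worst case, and the MatDot threshold assumes p inner partitions.

```python
# Recovery thresholds of the three schemes, as functions of the system
# parameters. These helper names are illustrative, not a library API.

def threshold_replication(N: int, r: int) -> int:
    # r replicas of each block: r simultaneous stragglers can wipe out a
    # block, so guaranteed recovery needs all but r - 1 responses.
    return N - r + 1

def threshold_polynomial(m: int, n: int) -> int:
    # Polynomial code: any m*n responses determine the product.
    return m * n

def threshold_matdot(p: int) -> int:
    # MatDot / entangled regime: 2p - 1 responses, at higher storage.
    return 2 * p - 1

N, r, m, n, p = 12, 3, 2, 2, 2
print(threshold_replication(N, r))   # 10
print(threshold_polynomial(m, n))    # 4
print(threshold_matdot(p))           # 3
```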
When Each Scheme Wins
- Uncoded replication wins when the system has ample replicas per block and the tail-latency budget is generous. Production ML pipelines at FAANG scale sometimes use naive replication because its implementation is trivial and the tail-latency overhead is absorbed by parallel workloads.
- MDS-coded replication wins when the recovery threshold matters more than per-worker storage. For matrix-vector products (where n = 1), MDS is the canonical choice because its threshold m equals the polynomial-code bound.
- Polynomial codes win when straggler tolerance matters and per-worker storage is tight. The EC2 benchmark shows a substantial speedup over uncoded computation at the same storage level, making this the default choice in modern coded-computing pipelines.
- Entangled polynomial codes (MatDot) achieve the minimum recovery threshold by relaxing the storage-optimality constraint. They live at a different point on the full tradeoff curve and are the right choice when the straggler pattern is extreme (e.g., 50% stragglers).
Definition: T-Private Polynomial Code
T-Private Polynomial Code
A T-private polynomial code for matrix multiplication guarantees that any set of at most T workers, even if they collude, learns no information (in the Shannon sense) about the private input matrix A.
The construction extends Section 5.2's polynomial code by appending T random matrices to A's encoding polynomial: f_A(x) = A_1 + A_2 x + ... + A_m x^(m-1) + R_1 x^m + ... + R_T x^(m+T-1), where R_1, ..., R_T are i.i.d. uniform random matrices over GF(q). To keep the data coefficients clean, B's exponents are shifted accordingly: f_B(x) = B_1 + B_2 x^(m+T) + ... + B_n x^((m+T)(n-1)). The product h(x) = f_A(x) f_B(x) now has degree n(m+T) - 1; its coefficients include the desired C_ij = A_i B_j (at exponents (i-1) + (m+T)(j-1)) plus "random" cross terms R_t B_j.
The random terms serve as a one-time pad against any size-T coalition, just as in Shamir secret sharing (Chapter 3). The cost is nT extra worker responses needed to interpolate the longer product polynomial.
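A minimal sketch of the masked encoder, under assumed toy parameters (prime modulus q = 101, and 2 x 2 matrices standing in for the actual blocks of A); the function name f_A mirrors the notation above and is not a library API:

```python
import numpy as np

q = 101           # prime field modulus (any prime >= N + 1 works)
m, T = 2, 1       # column partitions of A, privacy parameter

rng = np.random.default_rng(0)
A_blocks = [rng.integers(0, q, (2, 2)) for _ in range(m)]  # data blocks A_1..A_m
R_blocks = [rng.integers(0, q, (2, 2)) for _ in range(T)]  # uniform masks R_1..R_T

def f_A(x: int) -> np.ndarray:
    """Masked share f_A(x) = sum_i A_i x^(i-1) + sum_t R_t x^(m+t-1)  (mod q)."""
    acc = np.zeros((2, 2), dtype=np.int64)
    for i, Ai in enumerate(A_blocks):
        acc = (acc + Ai * pow(x, i, q)) % q        # data terms at degrees 0..m-1
    for t, Rt in enumerate(R_blocks):
        acc = (acc + Rt * pow(x, m + t, q)) % q    # mask terms at degrees m..m+T-1
    return acc

share = f_A(3)        # what the worker at evaluation point x = 3 stores
print(share.shape)    # (2, 2)
```

Note that the share is a single block-sized matrix regardless of T: masking changes the polynomial's degree, not the per-worker storage.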
Theorem: T-Private Recovery Threshold
The T-private polynomial code with partition counts m, n and N workers over GF(q) (with q >= N + 1, so that N distinct nonzero evaluation points exist) achieves:
- T-privacy. For any subset S of workers with |S| <= T, I(A; shares observed by S) = 0.
- Correctness. Any K = n(m + T) worker responses uniquely determine h(x) and hence C = AB.
- Storage. 1/m of A and 1/n of B per worker (unchanged with the T mask terms included: each share is still a single block-sized evaluation).
The T extra random terms in the encoding polynomial act as information-theoretic masks: for every hypothesized value of A, a size-T coalition can solve for all of them simultaneously, so its observations are consistent with every secret and it gains no information. The overhead is nT additional responses at the master (to interpolate the longer product polynomial). Compared to the public code's K = mn, the private version pays a cost linear in the privacy parameter: K = n(m + T) = mn + nT.
Privacy argument
A size-T coalition observes T evaluations of f_A (its f_B evaluations are functions of the public B alone). For any fixed A, the map from the T masks R_1, ..., R_T to these observations is governed by a T x T generalized Vandermonde matrix (mask exponents m, ..., m+T-1 at distinct nonzero evaluation points), which is invertible. The randomness therefore absorbs exactly T degrees of freedom, making the observations uniform over the masking space, independent of A.
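The one-time-pad effect can be checked exhaustively in a scalar toy case (assumed parameters: GF(11), m = 2, T = 1, scalars standing in for the matrix blocks): sweeping the single mask R over the whole field, a worker's observation takes every value exactly once, whatever A is.

```python
q = 11
x = 3              # the worker's evaluation point (must be nonzero)
A1, A2 = 5, 9      # secret scalars standing in for the blocks of A

# The worker sees f_A(x) = A1 + A2*x + R*x^2 mod q, with R uniform.
views = sorted((A1 + A2 * x + R * pow(x, 2, q)) % q for R in range(q))
print(views == list(range(q)))  # True: the view is exactly uniform
```

Changing A1 or A2 permutes which mask produces which view, but the view's distribution stays uniform, which is precisely the zero-mutual-information claim.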
Correctness argument
h(x) = f_A(x) f_B(x) has degree n(m + T) - 1. Any K = n(m + T) distinct evaluations interpolate the polynomial uniquely, and h's coefficients in positions (i-1) + (m+T)(j-1) are exactly the desired output blocks C_ij = A_i B_j.
Straggler tolerance
The scheme tolerates up to N - n(m + T) stragglers, i.e., nT fewer than the non-private version tolerates. Setting T = 0 recovers the standard polynomial code's N - mn.
Example: 1-Private Code for m = 2, n = 2
Construct a 1-private polynomial code for m = n = 2 with N workers. How many stragglers can be tolerated?
Encoding polynomials
f_A(x) = A_1 + A_2 x + R x^2 (one random term, R x^2, since T = 1). f_B(x) = B_1 + B_2 x^3 (B_2's exponent shifted from x^2 to x^3 to avoid collision with the mask term).
Product polynomial
h(x) = f_A(x) f_B(x) has degree 5. Its coefficients at x^0, x^1, x^3, x^4 are A_1 B_1, A_2 B_1, A_1 B_2, A_2 B_2 (the four desired blocks), plus cross terms R B_1 at x^2 and R B_2 at x^5 (random, not needed).
Recovery
The master needs deg h + 1 = 6 evaluations to interpolate h(x), so K = n(m + T) = 6 responses suffice. With N workers, the straggler budget is N - 6, in addition to 1-privacy.
Operational
Two extra worker responses (K = 6 instead of 4, compared to non-private) in exchange for perfect privacy of A against any single worker. This is the canonical tradeoff used in Chapter 11's ByzSecAgg, where ramp secret sharing lets us amortize this cost across many gradient coordinates.
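The example can be run end to end with scalar stand-ins for the matrix blocks (assumed prime q = 101; any 6 distinct nonzero evaluation points work):

```python
q = 101
A1, A2, B1, B2, R = 7, 12, 3, 8, 55     # scalar stand-ins; R is the mask

fA = lambda x: (A1 + A2 * x + R * pow(x, 2, q)) % q   # mask at degree m = 2
fB = lambda x: (B1 + B2 * pow(x, 3, q)) % q           # B2 shifted to x^3

xs = [1, 2, 3, 4, 5, 6]                   # K = n*(m+T) = 6 fastest responses
ys = [(fA(x) * fB(x)) % q for x in xs]    # worker responses h(x) = fA(x)*fB(x)

def polymul(a, b):
    """Multiply two polynomials (coefficient lists, low degree first) mod q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def lagrange_coeffs(xs, ys):
    """Coefficients of the unique degree-(K-1) interpolant through (xs, ys)."""
    K = len(xs)
    total = [0] * K
    for i in range(K):
        basis, denom = [1], 1
        for j in range(K):
            if i != j:
                basis = polymul(basis, [(-xs[j]) % q, 1])   # times (x - x_j)
                denom = denom * (xs[i] - xs[j]) % q
        scale = ys[i] * pow(denom, q - 2, q) % q            # Fermat inverse
        total = [(t + scale * c) % q for t, c in zip(total, basis)]
    return total

h = lagrange_coeffs(xs, ys)
# Desired blocks sit at exponents 0, 1, 3, 4; exponents 2 and 5 hold masks.
C = [h[0], h[1], h[3], h[4]]
print(C == [A1 * B1 % q, A2 * B1 % q, A1 * B2 % q, A2 * B2 % q])  # True
```

The coefficients at x^2 and x^5 carry the mask cross terms R B_1 and R B_2 and are simply discarded; they are the "random, not needed" terms of the product polynomial.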
Privacy Cost: K vs. T in Private Polynomial Codes
For fixed partitions m, n, plot the recovery threshold K = n(m + T) as a function of the privacy parameter T. Each unit of T costs n extra workers: a linear privacy-efficiency tradeoff. The plot also shows the minimum number of workers N required to host the scheme (which must be at least K = n(m + T)).
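The plotted quantity reduces to one line of arithmetic; a sketch for the running m = n = 2 case (swap the print loop for matplotlib to reproduce the figure):

```python
m, n = 2, 2                       # fixed partitions from the running example
thresholds = []
for T in range(5):
    K = n * (m + T)               # recovery threshold from the theorem
    thresholds.append(K)
    print(T, K)                   # K grows by n = 2 per unit of privacy
```

This prints the pairs (0, 4), (1, 6), (2, 8), (3, 10), (4, 12): a straight line of slope n.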
Key Takeaway
Adding T-privacy costs nT extra worker responses. The polynomial-code framework extends cleanly: the master's recovery threshold goes from mn to n(m + T) = mn + nT. This is the precise quantification of the privacy-efficiency tradeoff: in the running m = n = 2 example, each additional colluder the scheme tolerates costs two extra workers. The linearity is information-theoretic, not a design accident.
Common Mistake: Field Size for Privacy
Mistake:
Apply a T-private code over a small field GF(q) with N >= q workers and expect perfect privacy.
Correction:
The privacy guarantee holds information-theoretically over any field, but the scheme requires N distinct nonzero evaluation points. The field must therefore have at least N nonzero elements, i.e., q >= N + 1. With q <= N there are only q - 1 < N nonzero elements: insufficient. Use a larger prime q, or an extension field GF(2^k).
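This check is worth encoding directly. A minimal sketch (the helper name is hypothetical), enforcing q >= N + 1 before handing out nonzero evaluation points:

```python
def pick_evaluation_points(q: int, N: int) -> list:
    """Return N distinct nonzero points in GF(q); raise if the field is too small."""
    if q < N + 1:
        raise ValueError(
            f"GF({q}) has only {q - 1} nonzero elements but {N} are needed; "
            "use a larger prime or an extension field GF(2^k)"
        )
    return list(range(1, N + 1))

print(pick_evaluation_points(11, 7))   # [1, 2, 3, 4, 5, 6, 7]
# pick_evaluation_points(7, 8) raises: only 6 nonzero elements are available
```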
ByzSecAgg Builds on This Construction
The T-private polynomial-code extension of this section is the technical starting point for ByzSecAgg (Chapter 11), one of the CommIT group's signature contributions to this book. ByzSecAgg combines (i) ramp secret sharing (Chapter 3) for privacy-efficient sharing of long gradient vectors, (ii) a T-private polynomial-code-like scheme for distance-based Byzantine detection, and (iii) vector commitments for integrity. The construction's communication complexity, as a function of the number of users, the gradient dimension, and the number of Byzantine workers, is a substantial improvement over prior secure-aggregation schemes.
Readers who want the full deployment of the T-private polynomial code should work through Chapter 11; this block is a forward pointer, not a full treatment.
Numerical Stability in Floating-Point Polynomial Codes
While the mathematical construction is over GF(q), production deployments often apply the same polynomial-code machinery over the reals/floats for speed. The issue is that Vandermonde matrices at integer evaluation points are notoriously ill-conditioned in floating point: condition numbers grow exponentially with the number of evaluation points. Recent work (Fahim et al. 2021, Das et al. 2023) has proposed Chebyshev-style evaluation points and other reconditioning tricks to stabilize the decoder. The takeaway: if you implement polynomial codes in floating point, do not use x_i = 1, 2, ..., N; use Chebyshev points x_i = cos((2i - 1) * pi / (2N)), or stay in an exact finite field. Ignoring this costs accuracy and can silently corrupt outputs.
- Integer evaluation points give condition numbers that grow exponentially in N: unusable beyond a few dozen workers
- Chebyshev points yield much slower condition-number growth: far better, though still problematic for very large N
- Exact finite-field arithmetic avoids all conditioning issues
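The conditioning gap is easy to reproduce with NumPy (exact magnitudes are machine- and N-dependent; the qualitative gap is the point):

```python
import numpy as np

N = 20
integer_pts = np.arange(1, N + 1, dtype=float)                       # 1, 2, ..., N
cheb_pts = np.cos((2 * np.arange(1, N + 1) - 1) * np.pi / (2 * N))   # Chebyshev roots

# Condition numbers of the N x N Vandermonde matrices the decoder must invert
cond_int = np.linalg.cond(np.vander(integer_pts))
cond_cheb = np.linalg.cond(np.vander(cheb_pts))

print(f"integer points:   cond ~ {cond_int:.2e}")
print(f"chebyshev points: cond ~ {cond_cheb:.2e}")
# The integer-point Vandermonde is many orders of magnitude worse conditioned.
```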
Historical Note: From Polynomial Codes to a Whole Sub-Field
2017-present: Polynomial codes (Yu et al. 2017) gave the coded-computing community its first clean, optimal, deterministic construction. The follow-on literature is substantial: MatDot codes (Dutta-Fahim-Cadambe 2019) for minimum recovery threshold; entangled polynomial codes (Yu-Maddah-Ali-Avestimehr 2020) for the tradeoff frontier between polynomial and MatDot; and the T-private extensions (Chang-Tandon 2018, Yang-Jafar 2019) that power the secure-aggregation work of Part III.
A decade later, coded computing is a recognized sub-field of information theory with its own dedicated journal sections and workshops. The Li / Avestimehr 2020 survey is the standard reference; Chapter 11's ByzSecAgg result is a CommIT-group contribution extending the line.
Quick Check
A T-private polynomial code for m x n partitioned matrix multiplication has what recovery threshold K?
K = n(m + T) = mn + nT. Each privacy unit costs n extra worker responses.