Polynomial Codes and the Recovery Threshold
The Polynomial-Code Idea in One Sentence
Encode the blocks of $A$ and $B$ into two polynomials $p_A(x)$ and $p_B(x)$ such that the product $h(x) = p_A(x)\,p_B(x)$, when evaluated at distinct points, yields matrix products that interpolate back to the full output $C = AB$. Any $mn$ of the $N$ evaluations suffice.
The point is that this reuses the machinery of Shamir secret sharing (Chapter 3): a polynomial of degree $d$ is uniquely determined by $d + 1$ evaluations. The only new idea is choosing the exponents of $x$ so that the product's coefficients are exactly the block products $A_j B_k$. Once the exponent choice is right, the rest follows from Lagrange interpolation.
Definition: Polynomial Code (Yu–Maddah-Ali–Avestimehr)
The polynomial code for computing $C = AB$, with $A \in \mathbb{F}_q^{r \times s}$ split into $m$ row-blocks $A_0, \dots, A_{m-1}$ and $B \in \mathbb{F}_q^{s \times t}$ split into $n$ column-blocks $B_0, \dots, B_{n-1}$, using $N$ workers over $\mathbb{F}_q$ (requiring $N \ge mn$), consists of:
- Evaluation points. Fix $N$ distinct nonzero points $x_1, \dots, x_N \in \mathbb{F}_q$ (one per worker).
- Encoding polynomials. $p_A(x) = \sum_{j=0}^{m-1} A_j\, x^{j}$ and $p_B(x) = \sum_{k=0}^{n-1} B_k\, x^{km}$. The exponent pattern is chosen so that the product enumerates all $mn$ pairs $(A_j, B_k)$ with distinct exponents $j + km$.
- Storage at worker $i$. $\tilde{A}_i = p_A(x_i)$ and $\tilde{B}_i = p_B(x_i)$.
- Worker computation. $\tilde{A}_i \tilde{B}_i = h(x_i)$, where $h(x) = p_A(x)\, p_B(x)$ is a polynomial of degree $mn - 1$ whose coefficients are exactly the desired output blocks $A_j B_k$.
- Decoder. The master receives $h(x_i)$ from any $mn$ workers that have responded, and interpolates $h$ via Lagrange on the points $\{x_i\}$. The coefficients of $h$ are read off as the output blocks $C_{jk} = A_j B_k$.
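The distinct-exponent claim in the definition can be checked directly. A minimal sketch over a prime field, where the prime $p = 97$, the partition counts, and the scalar stand-ins for the matrix blocks are all illustrative assumptions:

```python
# Sketch: polynomial-code encoding over GF(p). The blocks A_j, B_k are
# stand-in scalars; p = 97 and m = 3, n = 2 are arbitrary choices.
p = 97
m, n = 3, 2                        # partition counts

def encode_A(A_blocks, x):
    # p_A(x) = sum_j A_j x^j
    return sum(Aj * pow(x, j, p) for j, Aj in enumerate(A_blocks)) % p

def encode_B(B_blocks, x):
    # p_B(x) = sum_k B_k x^(k*m)  -- exponents spaced by m
    return sum(Bk * pow(x, k * m, p) for k, Bk in enumerate(B_blocks)) % p

# The exponents j + k*m enumerate 0 .. mn-1 with no collisions,
# so each product A_j B_k lands on its own coefficient of h:
exponents = sorted(j + k * m for j in range(m) for k in range(n))
assert exponents == list(range(m * n))
```

The spacing of $p_B$'s exponents by $m$ is exactly what prevents two different pairs $(j, k)$ from colliding on the same power of $x$.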
Polynomial Code
A coded-matrix-multiplication scheme in which the input matrices are encoded as polynomials $p_A(x), p_B(x)$, and each worker receives their values at a distinct point $x_i$. The product polynomial $h = p_A p_B$ has the $mn$ output blocks as its coefficients, recoverable by Lagrange interpolation from any $mn$ evaluations.
Polynomial-Code Encoding
Complexity: linear in the input size per worker. For large matrices the cost is dominated by the worker's local matrix multiply, not the encoding.
Polynomial-Code Decoding via Lagrange
Complexity: $O(K^2)$ scalar operations, or $O(K \log^2 K)$ with FFTs. The decoder is identical to Reed–Solomon erasure decoding. For large $K$, FFT-based Lagrange interpolation reduces the cost from $O(K^2)$ to $O(K \log^2 K)$. In production, the Lagrange coefficients can be pre-computed for the expected straggler patterns.
Theorem: Polynomial Code Achieves Recovery Threshold $mn$
The polynomial code with $N$ workers and partition counts $m, n$ over $\mathbb{F}_q$ (with $N \ge mn$) satisfies:
- Correctness. For every input $(A, B)$ and every subset $S \subseteq \{1, \dots, N\}$ with $|S| \ge mn$, the decoder on the responses $\{\tilde{A}_i \tilde{B}_i : i \in S\}$ outputs the exact product $C = AB$.
- Storage. Each worker stores $\big(\tfrac{rs}{m} + \tfrac{st}{n}\big)\log_2 q$ bits, for per-worker storage load $\big(\tfrac{1}{m}, \tfrac{1}{n}\big)$.
- Recovery threshold. $K = mn$, matching the information-theoretic lower bound of Chapter 4, §4.2.
The scheme is optimal within the class of $\big(\tfrac{1}{m}, \tfrac{1}{n}\big)$-storage schemes: no distributed matrix-multiplication scheme with the same per-worker storage can achieve a smaller recovery threshold.
Each worker's output is one evaluation of the degree-$(mn - 1)$ polynomial $h = p_A p_B$; $mn$ evaluations interpolate the whole polynomial, and the coefficients are exactly the desired output blocks. The matching lower bound (§5.3) comes from a cut-set argument showing any scheme storing fractions $\big(\tfrac{1}{m}, \tfrac{1}{n}\big)$ per worker must make at least $mn$ "independent contributions" available to the master.
Correctness via Lagrange
$h(x) = p_A(x)\, p_B(x) = \sum_{j=0}^{m-1} \sum_{k=0}^{n-1} A_j B_k\, x^{j + km}$. The polynomial has degree $mn - 1$, so any $mn$ evaluations uniquely determine it: Lagrange interpolation recovers the coefficients, which are exactly the output blocks $C_{jk} = A_j B_k$.
Storage
$\tilde{A}_i$ has the same size as one block of $A$: $\tfrac{r}{m} \times s$, i.e., $\tfrac{rs}{m}$ field elements. Similarly $\tilde{B}_i$ occupies $\tfrac{st}{n}$. Sum: $\tfrac{rs}{m} + \tfrac{st}{n}$ field elements, matching the storage budget $\big(\tfrac{1}{m}, \tfrac{1}{n}\big)$.
Recovery threshold
The master recovers $C$ from any $mn$ responses, so $K = mn$. The converse (§5.3) shows no scheme with the same storage can do better.
Key Takeaway
Polynomial codes achieve $K = mn$, optimal for coded matrix multiplication. Straggler tolerance $N - mn$ can be set to any value between $0$ (uncoded, $N = mn$) and as large as desired (large redundancy) just by choosing $N$. The same storage supports any $N \ge mn$, a flexibility that uncoded schemes simply do not have.
Example: Polynomial Code for $m = n = 2$, $N = 5$
Work out the polynomial code for $m = n = 2$ with $N = 5$ workers and evaluation points $x_i = i$ over $\mathbb{F}_7$. Verify the recovery threshold by showing that any 4 workers suffice.
Encoding polynomials
$p_A(x) = A_0 + A_1 x$, $p_B(x) = B_0 + B_1 x^2$. $h(x) = A_0 B_0 + A_1 B_0\, x + A_0 B_1\, x^2 + A_1 B_1\, x^3$, degree $3$.
Worker storage and computation
Worker $i$ stores $\tilde{A}_i = A_0 + A_1 x_i$ and $\tilde{B}_i = B_0 + B_1 x_i^2$. It computes $\tilde{A}_i \tilde{B}_i = h(x_i)$.
Decoding from workers 1, 2, 3, 4
Interpolate $h$ from the 4 responses $h(1), h(2), h(3), h(4)$. Each coefficient of $h$ is a linear combination of the 4 responses, with weights determined by the evaluation points. The decoder succeeds because any $4 \times 4$ Vandermonde submatrix built from distinct points is invertible.
Straggler tolerance
With $N = 5$ and $K = 4$, any single straggler is absorbed. If worker 3 is slow, the master uses workers 1, 2, 4, 5 and still recovers $C = AB$ exactly.
Polynomial Code: Encoding and Decoding
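A self-contained sketch of the full encode/compute/decode pipeline for the worked example ($\mathbb{F}_7$, $m = n = 2$, $N = 5$, worker 3 straggling). The matrix blocks are replaced by scalar stand-ins for readability, an assumption; the algebra is identical entrywise for real blocks:

```python
# Polynomial code over GF(7): m = n = 2, N = 5 workers at x_i = i.
p, m, n, N = 7, 2, 2, 5
K = m * n                                   # recovery threshold = 4

def modinv(a):
    return pow(a, p - 2, p)                 # Fermat inverse, p prime

def interpolate_coeffs(xs, ys):
    # Gauss-Jordan solve of the Vandermonde system V c = y (mod p):
    # row i is [x_i^0, ..., x_i^{K-1} | y_i]; any K distinct points work.
    k = len(xs)
    M = [[pow(x, d, p) for d in range(k)] + [y % p] for x, y in zip(xs, ys)]
    for col in range(k):
        piv = next(r for r in range(col, k) if M[r][col] % p)
        M[col], M[piv] = M[piv], M[col]
        inv = modinv(M[col][col])
        M[col] = [e * inv % p for e in M[col]]
        for r in range(k):
            if r != col and M[r][col] % p:
                f = M[r][col]
                M[r] = [(e - f * g) % p for e, g in zip(M[r], M[col])]
    return [row[k] for row in M]            # coefficients h_0 .. h_{K-1}

A = [3, 5]                                  # A_0, A_1 (toy scalar "blocks")
B = [2, 6]                                  # B_0, B_1
xs = list(range(1, N + 1))                  # evaluation points 1..5
# worker i holds p_A(x_i) and p_B(x_i) and returns their product h(x_i)
responses = [((A[0] + A[1] * x) * (B[0] + B[1] * x * x)) % p for x in xs]

# worker 3 straggles: decode from workers 1, 2, 4, 5
survivors = [0, 1, 3, 4]
coeffs = interpolate_coeffs([xs[i] for i in survivors],
                            [responses[i] for i in survivors])
# coefficient of x^{j + k*m} is the block product A_j B_k
recovered = [[coeffs[j + k * m] for k in range(n)] for j in range(m)]
assert recovered == [[A[j] * B[k] % p for k in range(n)] for j in range(m)]
```

Dropping any other single worker instead of worker 3 decodes equally well, which is exactly the recovery-threshold claim $K = mn = 4$.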
Recovery Threshold Comparison: Polynomial vs. MDS vs. Uncoded
Plot the recovery threshold as a function of the partition count $n$ (with $m$ fixed) for three schemes: the polynomial code ($K = mn$), MDS-coded replication, and uncoded (threshold $mn$ with exactly $N = mn$ workers, the same as the polynomial code but without straggler flexibility). The polynomial code dominates in flexibility, while MDS can be cheaper in threshold for some parameters but requires more per-worker storage.
Parameters
- Range of column-partition counts $n$ to plot
- Number of workers $N$ (sets the straggler budget $N - mn$)
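A minimal sketch of the polynomial-code side of this exercise; $m$ and $N$ are illustrative assumptions, and the MDS curve is omitted here because its threshold formula depends on the exact construction used in the chapter:

```python
# Sketch: recovery threshold K = m*n of the polynomial code as n varies,
# and the straggler budget N - K it leaves. m = 2 and N = 24 are assumptions.
m, N = 2, 24
rows = []
for n in range(1, 9):                       # range of column-partition counts
    K = m * n                               # polynomial code: K = mn
    rows.append((n, K, N - K))              # uncoded gets the same K but N = K
    print(f"n={n}  K=mn={K:3d}  straggler budget N-K={N - K:3d}")
```

The table makes the flexibility point concrete: the threshold grows linearly in $n$, and whatever workers remain above $mn$ are pure straggler budget.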
Common Mistake: Field Size Matters
Mistake:
Use a small field (e.g., $\mathbb{F}_2$ or $\mathbb{F}_3$) with $N = 5$ workers.
Correction:
The polynomial code requires $N$ distinct nonzero evaluation points, which are elements of $\mathbb{F}_q^{*}$. For $N = 5$, $q \ge 7$ is needed (or an extension field such as $\mathbb{F}_{2^3}$). Practical deployments use large prime fields or binary extension fields to have ample evaluation points and room for privacy padding.
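The correction can be checked mechanically. A small sketch, with the field sizes and worker count chosen to match the mistake above:

```python
# Sketch: verify GF(q) has the N distinct nonzero evaluation points the
# polynomial code needs, i.e. q - 1 >= N. The choice x_i = i is one valid
# assignment, not the only one.
def eval_points(q, N):
    if q - 1 < N:
        raise ValueError(f"GF({q}) has only {q - 1} nonzero elements, need {N}")
    return list(range(1, N + 1))   # distinct and nonzero mod q

print(eval_points(7, 5))   # -> [1, 2, 3, 4, 5]: F_7 suffices for N = 5
```

Calling `eval_points(3, 5)` raises, which is precisely the small-field mistake.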
Decoder Cost in Practice
Naive Lagrange decoding is $O(K^2)$ per output scalar, which for a typical product size amounts to a few seconds on a single CPU core, comparable to the master's baseline in an uncoded scheme. For larger $K$ (say in ByzSecAgg, Chapter 11), FFT-based decoding reduces the cost to $O(K \log^2 K)$ per scalar. Production implementations (Apache MXNet with coded primitives, a recent PyTorch-XLA fork) include both algorithms and switch based on $K$.
- Naive Lagrange: $O(K^2)$ per output scalar
- FFT-based: $O(K \log^2 K)$, faster beyond a break-even value of $K$
- Lagrange coefficients can be pre-computed for expected straggler subsets
Quick Check
In the polynomial code with $m = n = 2$, the product polynomial $h(x) = p_A(x)\, p_B(x)$ has degree:
The product has degree $(m - 1) + m(n - 1) = mn - 1 = 3$. Recovery requires $mn = 4$ evaluations.
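The degree count can be verified by multiplying coefficient lists; a small sketch with placeholder block values (the numbers stand in for the blocks, an assumption):

```python
# Quick-check numerically: multiply p_A and p_B as coefficient lists
# (lowest degree first) for m = n = 2 and confirm deg h = mn - 1 = 3.
m = n = 2
pA = [1, 2]          # A_0 + A_1 x,      degree m - 1 = 1
pB = [3, 0, 4]       # B_0 + B_1 x^2,    degree m(n - 1) = 2

h = [0] * (len(pA) + len(pB) - 1)
for i, a in enumerate(pA):
    for j, b in enumerate(pB):
        h[i + j] += a * b          # coefficient of x^(i+j)

assert len(h) - 1 == m * n - 1     # degree 3, so mn = 4 evaluations recover h
```

The four coefficients of `h` sit at exponents $0, 1, 2, 3$, one per block product, matching the quick-check answer.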