Entangled Polynomial Codes
The Entangled Trick
Section 8.1 showed that convolution's recovery threshold is linear in the number of partitions, not quadratic. The construction that achieves this is the entangled polynomial code introduced by Yu, Maddah-Ali, and Avestimehr (2020), which generalizes to the broader class of computations whose output has fewer degrees of freedom than matrix multiplication.
The point is that the standard polynomial code of Chapter 5 "wastes" degrees of freedom by encoding each output block as a separate coefficient of the product polynomial. The entangled variant assigns exponents so that related output blocks share coefficients: the algebraic identity of convolution is exploited directly in the encoding.
This section develops the entangled construction formally, proves the recovery threshold, and characterizes the storage / communication tradeoff. Section 8.3 generalizes further to the Lagrange Coded Computing framework for arbitrary multivariate polynomials.
Definition: Entangled Polynomial Code
Consider a convolution-like bilinear operation where the desired output is a vector (or tensor) whose components are bilinear in two inputs $A = (A_0, \dots, A_{m-1})$ and $B = (B_0, \dots, B_{n-1})$: $c_s = \sum_{i+j=s} A_i B_j$ for $s = 0, \dots, m+n-2$. The entangled polynomial code with $N$ workers (over a field $\mathbb{F}$, $|\mathbb{F}| > N$) consists of:
- Encoding polynomials. $p_A(x) = \sum_{i=0}^{m-1} A_i x^i$ and $p_B(x) = \sum_{j=0}^{n-1} B_j x^j$.
- Worker storage. Worker $k$ stores $\tilde{A}_k = p_A(x_k)$ and $\tilde{B}_k = p_B(x_k)$, for distinct nonzero evaluation points $x_1, \dots, x_N$.
- Worker computation. Worker $k$ computes $\tilde{C}_k = \tilde{A}_k \tilde{B}_k$, evaluating the bilinear operation on the encoded inputs. This is $h(x_k)$, the product polynomial $h(x) = p_A(x)\,p_B(x)$ evaluated at $x_k$.
- Decoder. The master collects any $m+n-1$ evaluations and interpolates the degree-$(m+n-2)$ polynomial $h(x)$ via Lagrange interpolation. Its coefficients are the output blocks $c_0, \dots, c_{m+n-2}$.
The key difference from Chapter 5's standard polynomial code: the exponents in $p_A$ and $p_B$ are $i$ and $j$ respectively, with no "separation" factor $m$ in $p_B$'s exponents. The product polynomial has degree $m+n-2$, not $mn-1$. The convolution structure makes this possible; general matrix multiplication cannot use this trick at the same storage.
Entangled Polynomial-Code Encoding
Complexity: one linear combination of $m$ blocks plus one of $n$ blocks per worker.
Contrast with the standard polynomial code (Chapter 5, Alg. 5.2): $p_B$ uses exponent $jm$ there, and $j$ here. This small change is the entire difference between $K = mn$ and $K = m+n-1$.
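The encoding step can be sketched in a few lines. A minimal sketch with scalar blocks for readability; the function name `entangled_encode` is illustrative, not from the text, and the same arithmetic applies blockwise to matrix blocks:

```python
def entangled_encode(A_blocks, B_blocks, xs):
    """Entangled encoding: evaluate p_A and p_B at each worker's point.

    A_blocks: the m blocks of A (coefficients of p_A).
    B_blocks: the n blocks of B (coefficients of p_B).
    xs: distinct nonzero evaluation points, one per worker.
    Exponents are i and j -- no separation factor, unlike the
    standard polynomial code's exponent j*m for p_B.
    """
    encoded = []
    for x in xs:
        pA = sum(Ai * x**i for i, Ai in enumerate(A_blocks))
        pB = sum(Bj * x**j for j, Bj in enumerate(B_blocks))
        encoded.append((pA, pB))
    return encoded

# Worker k then computes the single product pA * pB and returns it.
print(entangled_encode([1.0, 2.0], [3.0, 4.0, 5.0], [1.0, 2.0]))
```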
Theorem: Entangled Polynomial Code
For convolution-like bilinear operations with inputs partitioned into $m$ and $n$ blocks, the entangled polynomial code with $N \ge m+n-1$ workers over $\mathbb{F}$ ($|\mathbb{F}| > N$) satisfies:
- Correctness. Any $K = m+n-1$ worker responses recover the entire output $c$.
- Storage. Per-worker storage is $1/m$ of $A$ plus $1/n$ of $B$: identical to the standard polynomial code.
- Optimality. $K = m+n-1$ is information-theoretically tight for convolution-like computations at this storage level.
The improvement over the $K = mn$ of general matrix multiplication is achieved without extra storage, exclusively by exploiting the reduced degree structure of the convolution output.
The convolution output has $m+n-1$ distinct coefficients (indexed by the sum $s = i+j$), not $mn$. The entangled encoding aligns the exponents so that each worker's response is one evaluation of a polynomial of degree $m+n-2$. Any $m+n-1$ evaluations interpolate it, matching the theoretical minimum.
The practical impact: for an $(m, n)$-partitioned convolution, the recovery threshold drops from $mn$ (matrix multiplication) to $m+n-1$, a substantial improvement in straggler tolerance at the same storage.
Correctness
$\tilde{C}_k = p_A(x_k)\,p_B(x_k) = h(x_k)$. Expanding $h(x)$ using the convolution algebraic identity: $h(x) = \sum_{s=0}^{m+n-2}\big(\sum_{i+j=s} A_i B_j\big)\, x^s$, a polynomial of degree $m+n-2$ whose $s$-th coefficient is the $s$-th convolution output $c_s$. Lagrange interpolation on $m+n-1$ evaluations recovers the polynomial exactly, and therefore all output coefficients.
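The identity at the heart of this proof, that the coefficients of the product polynomial are exactly the convolution of the block lists, can be checked numerically. A sketch with scalar blocks (the blockwise case behaves identically):

```python
import numpy as np

# Scalar "blocks" for readability.
A = np.array([2.0, -1.0, 3.0])        # m = 3 coefficients of p_A
B = np.array([1.0, 0.5, 4.0, -2.0])   # n = 4 coefficients of p_B

# Coefficients of h(x) = p_A(x) * p_B(x), degree m+n-2 = 5.
h = np.polynomial.polynomial.polymul(A, B)

# The s-th coefficient of h is the s-th convolution output c_s.
c = np.convolve(A, B)
assert np.allclose(h, c)
assert len(h) == len(A) + len(B) - 1   # m + n - 1 = 6 coefficients
```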
Converse
The convolution output has $m+n-1$ distinct components, each requiring at least one information-theoretic response to reconstruct. By a cut-set argument (the §2.4 recipe), no scheme can use fewer than $m+n-1$ responses. Hence $K \ge m+n-1$, matching achievability.
Storage
$\tilde{A}_k$ and $\tilde{B}_k$ each have the size of one block of the respective input: $1/m$ of $A$ and $1/n$ of $B$. Per-worker storage is unchanged from the standard polynomial code.
Example: Entangled Code for $m = 3$, $n = 4$
Construct the entangled polynomial code for a $(3, 4)$-partitioned convolution with $N$ workers. Compare with the standard polynomial code's recovery threshold.
Encoding
$p_A(x) = A_0 + A_1 x + A_2 x^2$ (degree 2). $p_B(x) = B_0 + B_1 x + B_2 x^2 + B_3 x^3$ (degree 3).
Product polynomial
$h(x) = p_A(x)\,p_B(x)$ has degree 5 and $m+n-1 = 6$ distinct coefficients.
Recovery threshold
$K = m+n-1 = 6$. With $N$ workers, the straggler budget is $N - 6$.
Comparison with standard
Standard polynomial code: $K = mn = 12$. The entangled variant tolerates six more stragglers at the same storage, but only for convolution-like structured computations.
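This example can be simulated end to end. A minimal numeric sketch, assuming $N = 8$ workers and scalar blocks (both assumptions are for illustration only); it checks that every subset of $K = 6$ responses recovers all six outputs:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
m, n, N = 3, 4, 8                 # assumed N = 8 workers for illustration
A = rng.standard_normal(m)        # scalar blocks for readability
B = rng.standard_normal(n)
c = np.convolve(A, B)             # ground truth: outputs c_0 .. c_5

xs = np.arange(1.0, N + 1)        # distinct nonzero evaluation points
pA = np.array([np.polynomial.polynomial.polyval(x, A) for x in xs])
pB = np.array([np.polynomial.polynomial.polyval(x, B) for x in xs])
responses = pA * pB               # worker k returns h(x_k)

K = m + n - 1                     # recovery threshold = 6
for survivors in combinations(range(N), K):
    # Interpolate h from the surviving evaluations (Vandermonde solve).
    V = np.vander(xs[list(survivors)], K, increasing=True)
    coeffs = np.linalg.solve(V, responses[list(survivors)])
    assert np.allclose(coeffs, c)  # any 6 responses recover all outputs
```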
Entangled vs. Standard Polynomial-Code Frontier
Plot the storage-vs-recovery-threshold frontier for (i) standard polynomial codes ($K = mn$ at storage $1/m$ of $A$ plus $1/n$ of $B$), and (ii) entangled polynomial codes ($K = m+n-1$ for convolution at the same storage). Both occupy specific points in the two-dimensional tradeoff space; the entangled variant dominates only for structured outputs (convolution, matrix-vector products).
Why the Output Structure Matters
The key difference between matrix multiplication and convolution is algebraic: matrix multiplication produces $mn$ independent output blocks (each block $A_i B_j$ is genuinely distinct), while convolution produces $m+n-1$ output coefficients (each $c_s = \sum_{i+j=s} A_i B_j$ is a sum over the single index $s$). The output structure has fewer degrees of freedom, allowing fewer responses to recover it.
More generally: any bilinear operation whose output can be read off as the coefficients of a degree-$(m+n-2)$ polynomial (in some variable) can be coded with recovery threshold $m+n-1$. Matrix multiplication's $K = mn$ is a "worst case" over bilinear operations; convolution, matrix-vector multiplication, and Kronecker products all fall into the "better" class.
Coded Bilinear Operations: Output Structure vs. Recovery Threshold
| Operation | Output structure | Recovery threshold | Code |
|---|---|---|---|
| Matrix mult. | $mn$ independent blocks | $mn$ | Standard polynomial code (Ch. 5) |
| Matrix-vector | $m$ blocks (vector output) | $m$ | Polynomial code with $n = 1$ |
| Convolution | $m+n-1$ coefficients | $m+n-1$ | Entangled polynomial code (this chapter) |
| Tensor contraction, order 3 | Varies by contraction type | $m+n-1$ or higher | Lagrange Coded Computing (§8.3) |
Common Mistake: Don't Use Entangled Codes for General Matrix Multiplication
Mistake:
Apply the entangled polynomial code to general matrix multiplication expecting $K = m+n-1$.
Correction:
The entangled code is tight only for bilinear operations whose output has at most $m+n-1$ distinct components (convolutions, matrix-vector products, Kronecker products). For general matrix multiplication, which has $mn$ distinct output blocks, applying the entangled code produces an incorrect result: the $mn$ blocks cannot be recovered from evaluations of a degree-$(m+n-2)$ polynomial. Use the standard polynomial code (Chapter 5) for general matrix multiplication.
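The failure mode is easy to see by listing which products land on which coefficient. A small sketch (exponent bookkeeping only, no matrices):

```python
m, n = 3, 4

# Entangled exponents: the product A_i B_j lands on coefficient i + j.
entangled = {}
for i in range(m):
    for j in range(n):
        entangled.setdefault(i + j, []).append((i, j))

# Coefficient 2 mixes A_0 B_2, A_1 B_1, A_2 B_0 -- fine for convolution
# (we want their sum), fatal for matmul (we need each product alone).
print(entangled[2])

# Standard exponents i + j*m are all distinct: mn separate coefficients.
standard = {i + j * m for i in range(m) for j in range(n)}
assert len(standard) == m * n          # 12 distinct slots
assert len(entangled) == m + n - 1     # only 6 slots: collisions
```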
Entangled Codes in Production
Entangled polynomial codes have seen some deployment for convolution-heavy workloads (CNN training, image/audio signal processing in ML pipelines). The main advantage is the linear-in-partitions recovery threshold. The main engineering barrier is that the encoding is more tightly coupled to the operation's algebraic structure: a generic coded-computing framework must detect the operation type (matmul vs. convolution vs. tensor contraction) and select the appropriate code.
Frameworks that integrate coded computing (NVIDIA DALI, research PyTorch extensions) usually offer entangled codes for convolution specifically, and fall back to the standard polynomial code for general matmul. Users specify the operation via a tensor-operation API; the framework picks the right code.
- Entangled code requires convolution-like output structure
- Standard polynomial code still needed for general matmul
- Framework must auto-detect operation type
Key Takeaway
Entangled polynomial codes achieve $K = m+n-1$ for convolution and matrix-vector products: linear in the partitions, vs. the quadratic $mn$ for general matrix multiplication. The gain comes from exploiting the reduced degree structure of the output polynomial. Use entangled codes when the output has convolution-like structure; stick to standard polynomial codes for general matrix multiplication.
Quick Check
Compared to the standard polynomial code for matrix multiplication with partitions $m$ and $n$, the entangled polynomial code for convolution at the same partitions:
Uses the same per-worker storage and achieves $K = m+n-1$ instead of $K = mn$
Uses less per-worker storage but achieves the same $K$
Uses more storage and is never worth it
Is equivalent to MatDot codes
$K = m+n-1$, vs. $K = mn$ for standard. Same per-worker storage: $1/m$ of $A$ plus $1/n$ of $B$.