References & Further Reading
References
- Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, Polynomial Codes: An Optimal Design for High-Dimensional Coded Matrix Multiplication, 2017. [Link]
The headline reference for this chapter. Introduces the polynomial code, proves $K = pq$ achievable, and gives the matching lower bound. Should be read in full.
- K. Lee, C. Suh, and K. Ramchandran, High-Dimensional Coded Matrix Multiplication, 2017
The earlier MDS-coded construction (recovery threshold $p + q - 1$). Important historical context for how polynomial codes improved over the prior state of the art.
- K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, Speeding Up Distributed Machine Learning Using Codes, 2018
The first paper to formally introduce coded computation for ML straggler mitigation. Compared MDS-coded matrix multiplication with uncoded baselines on EC2.
- S. Dutta, M. Fahim, F. Haddadpour, H. Jeong, V. Cadambe, and P. Grover, On the Optimal Recovery Threshold of Coded Matrix Multiplication, 2019
Establishes the cut-set converse for coded matrix multiplication in full generality. Cleaner proof than the original Yu et al., used in §5.3 of this chapter.
- Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, Entangled Polynomial Codes for Coded Distributed Matrix Multiplication, 2020
Generalization of polynomial codes to other operating points on the storage-recovery tradeoff. The MatDot scheme of §5.4 is a special case.
- W.-T. Chang and R. Tandon, On the Upload versus Download Cost for Secure and Private Matrix Multiplication, 2018
Introduces secure (private) coded matrix multiplication. The basis for the $T$-private extension in §5.4.
- M. Aliasgari, O. Simeone, and J. Kliewer, Private and Secure Distributed Matrix Multiplication with Flexible Communication Load, 2020
Detailed analysis of the $T$-private polynomial-code construction, including the explicit $K = pq + 2T$ recovery-threshold result.
- M. Fahim, V. R. Cadambe, and P. Grover, Numerically Stable Polynomially-Coded Computation, 2021
Addresses the floating-point conditioning problem of polynomial codes. Source for §5.4's engineering note on numerical stability.
- S. Li and A. S. Avestimehr, Coded Computing: Mitigating Fundamental Bottlenecks in Large-Scale Distributed Computing and Machine Learning, 2020
Comprehensive survey of coded computing including a full chapter on polynomial codes. Best single reference for the broader field.
- R. Tandon, Q. Lei, A. G. Dimakis, and N. Karampatziakis, Gradient Coding: Avoiding Stragglers in Distributed Learning, 2017. [Link]
Forward reference: gradient coding (Chapter 6 of this book) applies the polynomial-code idea to distributed SGD. Reading this paper after Chapter 5 makes Chapter 6's constructions much more transparent.
- T. Jahani-Nezhad, M. A. Maddah-Ali, and G. Caire, Byzantine-Resilient Secure Aggregation for Federated Learning Based on Ramp Sharing, 2023
CommIT-group result that combines polynomial codes, ramp secret sharing, and vector commitments. Tagged in this chapter as a forward reference; full development in Chapter 11.
Further Reading
Resources for going deeper into coded matrix multiplication and its variants.
Polynomial codes — full survey and connections
Li & Avestimehr, *Coded Computing*, FnT 2020, Ch. 3
The standard reference for polynomial codes in their full generality, with connections to lattice codes, MDS codes, and the larger coded-computing landscape.
MatDot and entangled polynomial codes
Yu, Maddah-Ali, Avestimehr, *Entangled Polynomial Codes*, IEEE T-IT 2020
Generalization of polynomial codes that achieves a family of $(K, \\mu)$ tradeoff points. Relevant for deployments outside the canonical $\\mu = 1/p + 1/q$ regime.
Secure and private MMM, modern survey
Chang & Tandon, 'Privacy in Coded Computing,' IEEE Signal Processing Magazine, 2020
Survey of privacy-preserving coded computing, useful as a bridge to Chapter 11's ByzSecAgg construction.
Production-ready coded-computing libraries
Apache MXNet's `op-coded` extension; PyTorch CodedDist (research fork)
Engineering perspective. Both implement polynomial codes with floating-point reconditioning and demonstrate the practical speedups discussed in §5.1's engineering note.