References & Further Reading

References

  1. Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, Polynomial Codes: An Optimal Design for High-Dimensional Coded Matrix Multiplication, 2017

    The headline reference for this chapter. Introduces the polynomial code, proves that the recovery threshold $K = pq$ is achievable, and gives the matching lower bound. Should be read in full; a minimal sketch of the construction follows this list.

  2. K. Lee, C. Suh, and K. Ramchandran, High-Dimensional Coded Matrix Multiplication, 2017

    The earlier MDS-coded construction (recovery threshold $p + q - 1$). Important historical context for how polynomial codes improved over the prior state of the art.

  3. K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, Speeding Up Distributed Machine Learning Using Codes, 2018

    The first paper to formally introduce coded computation for ML straggler mitigation. Compared MDS-coded matrix multiplication with uncoded baselines on EC2.

  4. S. Dutta, M. Fahim, F. Haddadpour, H. Jeong, V. Cadambe, and P. Grover, On the Optimal Recovery Threshold of Coded Matrix Multiplication, 2019

    Establishes the cut-set converse for coded matrix multiplication in full generality, with a cleaner proof than the original Yu et al. argument; used in §5.3 of this chapter.

  5. Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, Entangled Polynomial Codes for Coded Distributed Matrix Multiplication, 2020

    Generalization of polynomial codes to other operating points on the storage-recovery tradeoff. The MatDot scheme of §5.4 is a special case.

  6. W.-T. Chang and R. Tandon, On the Upload versus Download Cost for Secure and Private Matrix Multiplication, 2018

    Introduces secure (private) coded matrix multiplication. The basis for the $T$-private extension in §5.4.

  7. M. Aliasgari, O. Simeone, and J. Kliewer, Private and Secure Distributed Matrix Multiplication with Flexible Communication Load, 2020

    Detailed analysis of the $T$-private polynomial-code construction, including the explicit $K = pq + 2T$ recovery-threshold result.

  8. M. Fahim, V. R. Cadambe, and P. Grover, Numerically Stable Polynomially-Coded Computation, 2021

    Addresses the floating-point conditioning problem of polynomial codes. Source for §5.4's engineering note on numerical stability.

  9. S. Li and A. S. Avestimehr, Coded Computing: Mitigating Fundamental Bottlenecks in Large-Scale Distributed Computing and Machine Learning, 2020

    Comprehensive survey of coded computing including a full chapter on polynomial codes. Best single reference for the broader field.

  10. R. Tandon, Q. Lei, A. G. Dimakis, and N. Karampatziakis, Gradient Coding: Avoiding Stragglers in Distributed Learning, 2017

    Forward reference: gradient coding (Chapter 6 of this book) applies the polynomial-code idea to distributed SGD. Reading this paper after Chapter 5 makes Chapter 6's constructions much more transparent.

  11. T. Jahani-Nezhad, M. A. Maddah-Ali, and G. Caire, Byzantine-Resilient Secure Aggregation for Federated Learning Based on Ramp Sharing, 2023

    Combines polynomial codes, ramp secret sharing, and vector commitments into the ByzSecAgg construction. Tagged in this chapter as a forward reference; full development in Chapter 11.
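
Since several entries above hinge on the same construction, a worked example may help. Below is a minimal NumPy sketch of the polynomial code from reference 1, under assumed toy dimensions ($p = 2$, $q = 3$, eight workers); the sizes, evaluation points, and variable names are illustrative choices, not anything specified in the cited papers.

```python
# A minimal sketch (illustrative, not from any cited paper's code) of the
# polynomial code from reference 1, over real matrices with NumPy.
import numpy as np

rng = np.random.default_rng(0)
p, q, n_workers = 2, 3, 8            # recovery threshold K = pq = 6
s, r, t = 4, 6, 9                    # A: s x r, B: s x t, target C = A.T @ B

A = rng.standard_normal((s, r))
B = rng.standard_normal((s, t))
A_blocks = np.split(A, p, axis=1)    # p column blocks A_0, ..., A_{p-1}
B_blocks = np.split(B, q, axis=1)    # q column blocks B_0, ..., B_{q-1}

# Encoding: worker i receives the evaluations A~(x_i) = sum_j A_j x_i^j and
# B~(x_i) = sum_k B_k x_i^(p k) at a distinct point x_i.
x = np.linspace(-1.0, 1.0, n_workers)
A_enc = [sum(Aj * xi**j for j, Aj in enumerate(A_blocks)) for xi in x]
B_enc = [sum(Bk * xi**(p * k) for k, Bk in enumerate(B_blocks)) for xi in x]

# Each worker returns A~(x_i).T @ B~(x_i) = sum_{j,k} (A_j.T @ B_k) x_i^(j + p k),
# a matrix polynomial of degree pq - 1; the exponents j + p k are all distinct.
outputs = [a.T @ b for a, b in zip(A_enc, B_enc)]

# Decoding: interpolate from ANY K = pq workers (say workers 0 and 4 straggle).
survivors = [1, 2, 3, 5, 6, 7]
V = np.vander(x[survivors], p * q, increasing=True)   # Vandermonde system
Y = np.stack([outputs[i].ravel() for i in survivors])
coeffs = np.linalg.solve(V, Y)       # row j + p*k recovers (A_j.T @ B_k).ravel()

# Reassemble C block by block and verify against the direct product.
C = np.block([[coeffs[j + p * k].reshape(r // p, t // q) for k in range(q)]
              for j in range(p)])
assert np.allclose(C, A.T @ B)
```

Note that the `np.linalg.solve` call inverts a Vandermonde system; that is exactly the step whose floating-point conditioning degrades as $pq$ grows, which is the problem reference 8 addresses.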

Further Reading

Resources for going deeper into coded matrix multiplication and its variants.

  • Polynomial codes — full survey and connections

    Li & Avestimehr, *Coded Computing*, FnT 2020, Ch. 3

    The standard reference for polynomial codes in their full generality, with connections to lattice codes, MDS codes, and the larger coded-computing landscape.

  • MatDot and entangled polynomial codes

    Yu, Maddah-Ali, Avestimehr, *Entangled Polynomial Codes*, IEEE T-IT 2020

    Generalization of polynomial codes that achieves a family of $(K, \mu)$ tradeoff points. Relevant for deployments outside the canonical $\mu = 1/p + 1/q$ regime.

  • Secure and private matrix multiplication — modern survey

    Chang & Tandon, *Privacy in Coded Computing*, IEEE Signal Processing Magazine, 2020

    Survey of privacy-preserving coded computing, useful as a bridge to Chapter 11's ByzSecAgg construction.

  • Production-ready coded-computing libraries

    Apache MXNet's `op-coded` extension; PyTorch CodedDist (research fork)

    Engineering perspective. Both implement polynomial codes with floating-point reconditioning and demonstrate the practical speedups discussed in §5.1's engineering note.