Chapter Summary

Key Points

  1. Coded MapReduce (LMYA 2018) extends MAN coded caching to distributed analytics. With per-file replication $r$, shuffle communication drops by a factor of $r$, directly analogous to the caching gain $1 + t$ (see the numeric sketch after this list).

  2. Memory-communication-computation tradeoff. Extra compute (redundant map work) substitutes for shuffle bandwidth. The optimal replication is $r^* = \sqrt{T_\text{shuffle}/T_\text{map}}$, which exceeds 1 only when shuffle time dominates map time, so replication buys little in compute-heavy workloads (also illustrated in the sketch below).

  3. Gradient coding (Tandon-Lei-Dimakis-Karampatziakis 2017) tolerates $s$ stragglers at a storage overhead of $s + 1$. The master decodes the full gradient from any $K - s$ workers (see the fractional-repetition sketch after this list).

  4. Coded matrix multiplication (Lee-Lam-Pedarsani-Papailiopoulos-Ramchandran 2017) uses Reed-Solomon / polynomial codes for straggler-tolerant matmul (see the MDS sketch after this list).

  5. Distinct primitives for distinct goals. Coded MapReduce saves bandwidth; gradient coding saves latency. They compose but are not interchangeable.

  6. CommIT contribution: Joudeh-Caire (2024) develops coded gradient methods for federated learning, handling heterogeneous client data and integrating with privacy mechanisms.

  7. Unified coded-computing framework. Coded caching, coded shuffling, coded MapReduce, gradient coding, and coded matmul all share the memory/storage ⇌ communication tradeoff, with different metrics and different combinatorial encodings.
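
Key Points 1 and 2 can be made concrete with a small numeric sketch. The load expressions below (uncoded shuffle load $1 - r/K$, coded load $\frac{1}{r}(1 - r/K)$) are the standard ones from LMYA 2018; the runtime model and the values of `K`, `T_map`, and `T_shuffle` are illustrative assumptions, not part of the summary above.

```python
import numpy as np

K = 10            # number of workers (illustrative)
T_map = 1.0       # time to map one copy of the dataset (assumed units)
T_shuffle = 25.0  # time for one full uncoded shuffle (assumed units)

def shuffle_load(r, K):
    """Normalized shuffle traffic with per-file replication r."""
    uncoded = 1 - r / K            # savings from local availability alone
    coded = (1 / r) * (1 - r / K)  # coded multicasting adds a factor-r gain
    return uncoded, coded

def total_time(r):
    """Toy runtime model: redundant map work plus coded shuffle time."""
    _, coded = shuffle_load(r, K)
    return r * T_map + coded * T_shuffle

# Minimizing total_time over real-valued r gives Key Point 2's optimum.
r_star = np.sqrt(T_shuffle / T_map)
print(f"r* = {r_star:.2f}")

for r in range(1, K + 1):
    uncoded, coded = shuffle_load(r, K)
    print(f"r={r:2d}  uncoded={uncoded:.3f}  coded={coded:.3f}  time={total_time(r):.2f}")
```

With these (assumed) numbers the printed total time is minimized at $r = 5 = \sqrt{25/1}$, matching the closed form.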
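
Key Point 3's any-$(K - s)$ decoding is easiest to see in the fractional repetition scheme of Tandon et al. (2017), the simplest gradient-coding construction. The sketch below assumes $(s + 1)$ divides $K$ and uses random vectors as stand-in partial gradients; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 6   # workers = data partitions
s = 2   # stragglers to tolerate; this scheme needs (s + 1) | K
d = 4   # gradient dimension

# One (random stand-in) partial gradient per data partition.
partial_grads = rng.normal(size=(K, d))
full_gradient = partial_grads.sum(axis=0)

# Fractional repetition: split workers into K/(s+1) groups of s+1; every
# worker in group g stores the same s+1 partitions (storage overhead s+1)
# and sends the *sum* of their partial gradients.
n_groups = K // (s + 1)
group_of_worker = np.repeat(np.arange(n_groups), s + 1)   # worker -> group
group_partitions = np.arange(K).reshape(n_groups, s + 1)  # group -> partitions
worker_msg = np.array([partial_grads[group_partitions[g]].sum(axis=0)
                       for g in group_of_worker])

# Decode from any K - s workers: each group of s+1 workers has at least one
# survivor, and any survivor carries its group's entire partial sum.
survivors = rng.choice(K, size=K - s, replace=False)
decoded = np.zeros(d)
for g in range(n_groups):
    rep = next(w for w in survivors if group_of_worker[w] == g)
    decoded += worker_msg[rep]

assert np.allclose(decoded, full_gradient)
print("recovered full gradient from", len(survivors), "of", K, "workers")
```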

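And for Key Point 4, a minimal sketch of coded matrix-vector multiplication in the spirit of Lee et al. (2017): a real-valued Vandermonde matrix serves as an illustrative stand-in for a Reed-Solomon generator, and all dimensions are assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

n, k = 5, 3   # n workers; any k results suffice, tolerating n - k stragglers
m, d = 6, 4   # A is m x d; m must be divisible by k
A = rng.normal(size=(m, d))
x = rng.normal(size=d)

# Split A into k row blocks and encode with an n x k real Vandermonde
# matrix -- an MDS code, the real-valued analogue of Reed-Solomon.
blocks = A.reshape(k, m // k, d)
G = np.vander(np.arange(1, n + 1), k, increasing=True).astype(float)
coded_blocks = np.einsum('ij,jrd->ird', G, blocks)  # worker i stores row i's mix

# Each worker multiplies its coded block by x; any k responses decode A @ x,
# since any k rows of a Vandermonde matrix are invertible.
results = coded_blocks @ x                         # shape (n, m // k)
fastest = rng.choice(n, size=k, replace=False)     # pretend the rest straggle
Z = np.linalg.solve(G[fastest], results[fastest])  # recover the k block products
assert np.allclose(Z.reshape(m), A @ x)
print("A @ x recovered from workers", sorted(fastest.tolist()))
```
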
Looking Ahead

Chapter 17 covers coded caching with secure delivery, extending the privacy ideas of Chapter 12 to defend against external eavesdroppers. Chapter 18 covers multi-access coded caching, where each user can access multiple caches. These extensions complete the survey of Part IV's coded-caching frontiers before moving to the open research problems of Part V.