Part 4: Extensions and Applications

Chapter 15: Coded Data Shuffling

Advanced · 165 min

Learning Objectives

  • Define the data shuffling problem in distributed machine learning
  • State the Wan–Tuninetti–Caire result: coded shuffling reduces inter-epoch communication by a factor of $1 + Ks$ (see the numerical sketch after this list)
  • Understand the coded-caching analogy: worker memory plays the role of the user cache, and the data shuffle plays the role of the delivery phase
  • Derive the MAN-style coded shuffling scheme
  • Analyze straggler-tolerant gradient coding as a related coded-computing primitive (a toy example follows this list)
  • Connect coded shuffling to distributed ML system design (parameter server, all-reduce)
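To get a quick feel for the $1 + Ks$ factor, here is a minimal numerical sketch, assuming $K$ is the number of workers and $s$ is the fraction of the dataset each worker can store beyond its current working partition. The uncoded baseline of $(K-1)/K$ and the function names are illustrative assumptions, not the chapter's formal model.

```python
# Illustrative sketch: coded shuffling divides the uncoded shuffle
# traffic by the gain factor 1 + K*s (normalizations assumed above).

def uncoded_load(K: int) -> float:
    """Uncoded shuffling: each worker fetches a fresh (K-1)/K of the data."""
    return (K - 1) / K

def coded_load(K: int, s: float) -> float:
    """Coded shuffling: the same traffic divided by the gain 1 + K*s."""
    return uncoded_load(K) / (1 + K * s)

K = 8
for s in (0.1, 0.25, 0.5):
    print(f"K={K}, s={s}: uncoded={uncoded_load(K):.3f}, "
          f"coded={coded_load(K, s):.3f}, gain={1 + K * s:.1f}x")
```

Note how the gain grows linearly in the aggregate storage $Ks$, mirroring the global caching gain from the coded caching chapters.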
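As a preview of the straggler-tolerance objective, below is a toy gradient-coding sketch in the spirit of the cyclic example of Tandon et al., with $K = 3$ workers tolerating one straggler. The random partial gradients and the dictionary layout are made up for illustration; the encoding and decoding coefficients do realize the any-2-of-3 recovery property, as the assertions verify.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.standard_normal((3, 4))   # partial gradients g1, g2, g3 (dim 4)
full = g.sum(axis=0)              # the full gradient the master wants

# Each worker stores two partitions and sends one coded combination.
messages = {
    1: 0.5 * g[0] + g[1],         # worker 1 sends g1/2 + g2
    2: g[1] - g[2],               # worker 2 sends g2 - g3
    3: 0.5 * g[0] + g[2],         # worker 3 sends g1/2 + g3
}

# Decoding coefficients for every pair of surviving workers.
decode = {
    frozenset({1, 2}): {1: 2.0, 2: -1.0},
    frozenset({1, 3}): {1: 1.0, 3: 1.0},
    frozenset({2, 3}): {2: 1.0, 3: 2.0},
}

for straggler in (1, 2, 3):
    alive = frozenset({1, 2, 3}) - {straggler}
    est = sum(c * messages[w] for w, c in decode[alive].items())
    assert np.allclose(est, full)
    print(f"straggler={straggler}: full gradient recovered")
```

Whichever single worker straggles, the master linearly combines the two remaining messages to recover $g_1 + g_2 + g_3$ exactly; the price is that each worker computes two partial gradients instead of one.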

Sections

Prerequisites
