Prerequisites & Notation

Before You Begin

This chapter generalizes the coded-caching framework to distributed computing. Prerequisites: the Maddah-Ali-Niesen (MAN) coded-caching basics and some exposure to MapReduce and stochastic gradient descent (SGD).

  • MAN coded caching (Ch 2)

    Self-check: Can you state the coded multicasting gain 1 + KM/N?

  • Coded data shuffling (Ch 15)

    Self-check: What was the Wan-Tuninetti-Caire gain factor?

  • MapReduce programming model

    Self-check: What is the shuffle phase between map and reduce?

  • Distributed gradient descent (Ch 26)

    Self-check: What is a straggler and how does it limit SGD throughput?

  • Polynomial codes / Reed-Solomon (Ch 9)

    Self-check: Why can a degree-k polynomial be reconstructed from k+1 evaluations?
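The last self-check is the key fact behind polynomial codes: a degree-k polynomial is uniquely determined by any k+1 distinct evaluations, so extra evaluations act as erasure-coded symbols. A minimal sketch via Lagrange interpolation (illustrative only: it uses floats, whereas the polynomial codes of Ch 9 work over finite fields):

```python
def lagrange_interpolate(points, x):
    """Evaluate at x the unique degree-<=k polynomial through the k+1 given points."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)  # Lagrange basis polynomial l_i(x)
        total += term
    return total

# Degree-2 polynomial p(x) = 3x^2 + 2x + 1: any 3 evaluations suffice,
# no matter which 3 workers happen to respond.
p = lambda x: 3 * x**2 + 2 * x + 1
evals = [(0, p(0)), (2, p(2)), (5, p(5))]
recovered = lagrange_interpolate(evals, 1.0)
print(recovered)  # p(1) = 6, up to float rounding
```

This is why a scheme that waits for any k+1 of the workers' evaluations can tolerate the remaining workers straggling.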

Notation for This Chapter

Symbols for coded computing in distributed systems.

Symbol | Meaning | Introduced
K      | Number of workers (analog of users) | s01
Q      | Number of reduce output keys | s01
r      | Computation load: each file mapped at r workers | s01
L      | Coded communication load (fraction of shuffle data) | s02
t      | Gain parameter, analog of caching t = KM/N | s02
τ      | Mean worker completion time (exponential distribution) | s03
s      | Straggler tolerance (gradient coding): tolerate any s stragglers | s03
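As a quick numeric check on the table's gain parameter, here is a one-liner under assumed values (K, M, N below are hypothetical, chosen only so that t comes out to an integer):

```python
# Hypothetical values, not from the text: K users, each caching M of N files.
K, M, N = 10, 3, 30
t = K * M / N   # caching gain parameter t = KM/N
gain = 1 + t    # coded multicasting gain 1 + KM/N from the MAN self-check
print(t, gain)  # 1.0 2.0
```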