Prerequisites & Notation

Before You Begin

This chapter generalizes the coded-caching framework to distributed computing. Prerequisites: the Maddah-Ali-Niesen (MAN) coded-caching basics and some exposure to MapReduce and stochastic gradient descent (SGD).

  • MAN coded caching (Ch 2)

    Self-check: Can you state the coded multicasting gain 1 + KM/N?

  • Coded data shuffling (Ch 15)

    Self-check: What was the Wan-Tuninetti-Caire gain factor?

  • MapReduce programming model

    Self-check: What is the shuffle phase between map and reduce?

  • Distributed gradient descent (Ch 26)

    Self-check: What is a straggler and how does it limit SGD throughput?

  • Polynomial codes / Reed-Solomon (Ch 9)

    Self-check: Why can a degree-k polynomial be reconstructed from k+1 evaluations?
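The last self-check is the key fact behind polynomial codes: a degree-k polynomial is uniquely determined by any k+1 distinct evaluations, so extra evaluations act as erasure-coded symbols. A minimal sketch via Lagrange interpolation (illustrative only: it uses floats, whereas the polynomial codes of Ch 9 work over finite fields):

```python
def lagrange_interpolate(points, x):
    """Evaluate at x the unique degree-<=k polynomial through the k+1 given points."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)  # Lagrange basis polynomial l_i(x)
        total += term
    return total

# Degree-2 polynomial p(x) = 3x^2 + 2x + 1: any 3 evaluations suffice,
# no matter which 3 workers happen to respond.
p = lambda x: 3 * x**2 + 2 * x + 1
evals = [(0, p(0)), (2, p(2)), (5, p(5))]
recovered = lagrange_interpolate(evals, 1.0)
print(recovered)  # p(1) = 6, up to float rounding
```

This is why a scheme that waits for any k+1 of the workers' evaluations can tolerate the remaining workers straggling.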

Notation for This Chapter

Symbols for coded computing in distributed systems.

Symbol | Meaning | Introduced
K      | Number of workers (analog of users) | s01
Q      | Number of reduce output keys | s01
r      | Computation load: each file mapped at r workers | s01
L      | Coded communication load (fraction of shuffle data) | s02
t      | Gain parameter, analog of caching t = KM/N | s02
τ      | Mean worker completion time (exponential distribution) | s03
s      | Straggler tolerance (gradient coding): tolerate any s stragglers | s03
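As a quick numeric check on the table's gain parameter, here is a one-liner under assumed values (K, M, N below are hypothetical, chosen only so that t comes out to an integer):

```python
# Hypothetical values, not from the text: K users, each caching M of N files.
K, M, N = 10, 3, 30
t = K * M / N   # caching gain parameter t = KM/N
gain = 1 + t    # coded multicasting gain 1 + KM/N from the MAN self-check
print(t, gain)  # 1.0 2.0
```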