Exercises

ex-ch16-e01

Easy

Verify the LMYA communication load formula $L(r) = (1/r)(1 - r/K)$ at the endpoints: $r = 1$ (uncoded) and $r = K$ (full replication).
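The endpoint check can be done numerically as well as symbolically; a minimal sketch, using the load formula $L(r) = (1/r)(1 - r/K)$ as stated:

```python
def L(r, K):
    """LMYA communication load at computation redundancy r with K nodes."""
    return (1 / r) * (1 - r / K)

K = 10
# r = 1 (uncoded): load is 1 - 1/K, i.e. each reducer fetches
# almost everything over the network.
assert L(1, K) == 1 - 1 / K
# r = K (full replication): every node maps every file, so no
# shuffle traffic is needed at all.
assert L(K, K) == 0.0
```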

ex-ch16-e02

Easy

State the analog of MAN's subpacketization constraint for coded MapReduce. Why is this a practical limitation?

ex-ch16-e03

Medium

Derive the condition under which coded MapReduce yields a wall-clock speedup over uncoded MapReduce.
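One way to make the break-even condition concrete is a simple linear cost model (an illustrative assumption, not the chapter's model): total time $\approx r \cdot t_{\text{map}} + L(r) \cdot t_{\text{shuffle}}$, with $L(r) = (1/r)(1 - r/K)$. The coded scheme pays $r\times$ more map work to shrink the shuffle, so it wins only when shuffle cost dominates:

```python
K = 100

def total_time(r, t_map, t_shuffle):
    # Assumed linear model: r-fold map replication plus LMYA shuffle load.
    return r * t_map + (1 / r) * (1 - r / K) * t_shuffle

# Shuffle-heavy regime: coding (r = 10) beats uncoded (r = 1).
assert total_time(10, 1.0, 100.0) < total_time(1, 1.0, 100.0)
# Map-heavy regime: the extra map work is not worth it.
assert total_time(10, 1.0, 1.0) > total_time(1, 1.0, 1.0)
```

Under this model the derivation reduces to solving $r \, t_{\text{map}} + L(r) \, t_{\text{shuffle}} < t_{\text{map}} + L(1) \, t_{\text{shuffle}}$ for the ratio $t_{\text{shuffle}} / t_{\text{map}}$.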

ex-ch16-e04

Medium

For gradient coding with $K = 20$ workers and $s = 4$ straggler tolerance, derive the storage overhead factor.
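As a sanity check on the setup (assuming the cyclic placement of Tandon et al., in which worker $i$ holds partitions $i, i{+}1, \dots, i{+}s \bmod K$), the replication structure can be verified directly:

```python
K, s = 20, 4
# Cyclic placement: worker i holds partitions i, i+1, ..., i+s (mod K).
placement = [{(i + j) % K for j in range(s + 1)} for i in range(K)]

# Storage overhead: each worker holds s + 1 partitions instead of 1.
overhead = len(placement[0])

# Sanity check: every partition is replicated s + 1 times, so any s
# stragglers still leave each partition on at least one surviving worker.
for p in range(K):
    assert sum(p in w for w in placement) == s + 1
```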

ex-ch16-e05

Medium

Show that coded matrix multiplication with recovery threshold $k$ is equivalent to a $(K, k)$ MDS code over the rows of the product $\mathbf{A}\mathbf{B}$.

ex-ch16-e06

Medium

For Netflix-scale analytics ($K = 500$ workers, $r = 5$), compute the expected shuffle reduction and map overhead.
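Plugging the given parameters into the load formula $L(r) = (1/r)(1 - r/K)$ gives the orders of magnitude involved (a quick numeric sketch, not a full answer):

```python
K, r = 500, 5

def L(rr):
    return (1 / rr) * (1 - rr / K)

# Shuffle reduction relative to uncoded (r = 1): roughly a factor of r,
# slightly more because the coded scheme also benefits from local storage.
shuffle_reduction = L(1) / L(r)

# Map overhead: every file is mapped r times instead of once.
map_overhead = r
```

With $K = 500$ and $r = 5$: $L(1) = 0.998$, $L(5) = 0.198$, so the shuffle shrinks by a factor of about $5.04$ at the cost of $5\times$ map computation.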

ex-ch16-e07

Hard

Prove the converse: the LMYA load $L(r)$ is optimal. (Sketch.)

ex-ch16-e08

Hard

Design a gradient coding scheme for $K = 5$, $s = 2$ (tolerate 2 stragglers). Specify the $5 \times 5$ encoding matrix.

ex-ch16-e09

Hard

Compare coded data shuffling (Ch 15) and coded MapReduce (Ch 16). What are their respective domains? Can they be combined?

ex-ch16-e10

Medium

Implement a toy $K = 6$, $r = 2$ coded MapReduce simulation in Python (pseudocode). Compare the shuffle message counts under the uncoded and coded schemes.
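As a starting point, here is one possible counting sketch. It assumes the Li et al.-style placement in which each of the $N = \binom{K}{r}$ files is stored on exactly one $r$-subset of nodes, with sizes normalized so that one intermediate value costs one unit; the coded multicasts carry segments of size $1/r$:

```python
from itertools import combinations

K, r = 6, 2
nodes = range(K)
files = list(combinations(nodes, r))   # one file per r-subset; N = C(6, 2) = 15

# Each node reduces one function and needs one intermediate value per file.
# Uncoded shuffle: unicast every value the reducing node did not map itself.
uncoded_units = sum(1 for k in nodes for f in files if k not in f)

# Coded shuffle: within every (r+1)-subset S, each node multicasts one coded
# segment of size 1/r that is simultaneously useful to the other r nodes in S.
coded_units = sum(len(S) * (1 / r) for S in combinations(nodes, r + 1))

# uncoded_units = 60, coded_units = 30: a factor-r reduction in shuffle traffic.
```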

ex-ch16-e11

Medium

Why do Reed-Solomon codes suffice for coded matrix multiplication, rather than requiring more complex code constructions?
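The Reed-Solomon view can be made concrete with a small polynomial-code sketch (over the reals for simplicity, rather than a finite field): split $\mathbf{A}$ into $k$ row blocks, have worker $j$ compute $p(x_j)\mathbf{B}$ where $p(x) = \sum_i \mathbf{A}_i x^i$, and recover $\mathbf{A}\mathbf{B}$ from any $k$ evaluations by inverting a Vandermonde system.

```python
import numpy as np

rng = np.random.default_rng(0)
k, K = 3, 5                                   # recovery threshold k, K workers
A = rng.standard_normal((6, 4))
B = rng.standard_normal((4, 2))

A_blocks = np.split(A, k)                     # three 2x4 row blocks of A
xs = np.arange(1.0, K + 1)                    # distinct evaluation points

# Worker j computes p(x_j) @ B, where p(x) = sum_i A_i x^i.
results = [sum(A_blocks[i] * x**i for i in range(k)) @ B for x in xs]

# Any k of the K workers suffice: solve the k x k Vandermonde system
# for the coefficients A_i @ B, exactly as in Reed-Solomon decoding.
survivors = [0, 2, 4]
V = np.vander(xs[survivors], k, increasing=True)
R = np.array([results[j].reshape(-1) for j in survivors])
C = np.linalg.solve(V, R)                     # flattened blocks A_i @ B
AB = np.vstack([c.reshape(2, 2) for c in C])  # each block of A @ B is 2x2

assert np.allclose(AB, A @ B)
```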

ex-ch16-e12

Hard

The Joudeh-Caire 2024 result extends gradient coding to heterogeneous FL. What goes wrong when the original Tandon et al. scheme is applied directly to non-IID clients?

ex-ch16-e13

Medium

Explain why coded MapReduce's combinatorial structure coincides with the MAN structure, and why this connection is conceptually significant.

ex-ch16-e14

Hard

Design an experiment: suppose you are deploying coded MapReduce in a real datacenter. What metrics would you measure to validate the theoretical predictions?

ex-ch16-e15

Hard

Coded computing "unifies" caching, shuffling, MapReduce, and gradient coding. Identify one open problem that cuts across all four.