Exercises

ex-cc-ch15-01

Easy

Compute the coded shuffling cost for $K = 100$, $s = 0.1$, $D = 1$ PB.
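A quick numeric check for this exercise, plugging the given values into the chapter's coded shuffling cost $K(1-s)D/(1+Ks)$ (variable names below are illustrative):

```python
# Coded vs. uncoded shuffling cost. Uncoded: every worker unicasts its new
# (1 - s)D shard, for K(1 - s)D total. Coded: divide by the multicast
# gain 1 + Ks.
K, s, D_pb = 100, 0.1, 1.0   # workers, storage fraction, dataset size in PB

uncoded_pb = K * (1 - s) * D_pb        # 90 PB of unicast traffic
coded_pb = uncoded_pb / (1 + K * s)    # multicast gain of 1 + Ks = 11
print(f"uncoded: {uncoded_pb:.1f} PB, coded: {coded_pb:.2f} PB, "
      f"gain: {uncoded_pb / coded_pb:.0f}x")
# -> uncoded: 90.0 PB, coded: 8.18 PB, gain: 11x
```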

ex-cc-ch15-02

Easy

State the coded-caching / coded-shuffling analogy.

ex-cc-ch15-03

Easy

Gradient coding with $K = 50$ workers tolerates $r = 4$ stragglers. What's the storage overhead per worker?

ex-cc-ch15-04

Easy

Why does the shuffling gain factor $1 + Ks$ saturate as $K \to \infty$ at fixed $s$?
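One way to explore this numerically: the total coded cost $K(1-s)D/(1+Ks)$ approaches the finite limit $(1-s)D/s$ as $K \to \infty$ at fixed $s$, even though the gain over uncoded unicast keeps growing. A sketch with illustrative values:

```python
# The coded cost K(1-s)D/(1+Ks) tends to (1-s)D/s as K grows at fixed s,
# so adding workers beyond ~1/s buys little further reduction.
s, D = 0.1, 1.0
print("limit:", (1 - s) * D / s)  # -> limit: 9.0
for K in (10, 100, 1000, 10000):
    print(K, round(K * (1 - s) * D / (1 + K * s), 3))
# -> 10 4.5 / 100 8.182 / 1000 8.911 / 10000 8.991
```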

ex-cc-ch15-05

Easy

In coded shuffling, if $s = 1$ (each worker stores the full dataset), what is the communication cost?

ex-cc-ch15-06

Medium

Converse proof sketch. Prove that the coded shuffling rate $K(1-s)/(1+Ks)$ is order-optimal.

ex-cc-ch15-07

Medium

Partial shuffling. If only a fraction $p$ of the dataset is re-shuffled per epoch (the rest stays), what's the cost?
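A natural first guess, to be checked against a proper derivation: only the moved fraction generates traffic, so scale the full-shuffle coded cost by $p$. This is a hedged model, not a proven result; the exercise asks whether coding can do better.

```python
# Assumed model: partial-shuffle cost = p * K(1-s)D/(1+Ks).
# Whether coding across the stationary part improves on this is the
# actual question posed by the exercise.
K, s, D, p = 100, 0.1, 1.0, 0.25   # illustrative values, not from the text
full_cost = K * (1 - s) * D / (1 + K * s)
partial_cost = p * full_cost
print(round(full_cost, 3), round(partial_cost, 3))  # -> 8.182 2.045
```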

ex-cc-ch15-08

Medium

Wall-clock savings. ResNet-50 training at $K = 256$, $s = 0.05$: compute the shuffling time savings per epoch and in total.
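A back-of-envelope template for this exercise. The dataset size, epoch count, and link speed below are assumptions (an ImageNet-scale run), not values given in the text; substitute your own.

```python
# Assumed (not from the exercise): ~150 GB dataset, 90 epochs, and a
# shared 100 Gb/s shuffle link. Only K and s come from the prompt.
K, s = 256, 0.05
D_gb, epochs, link_gbps = 150.0, 90, 100.0

uncoded_gb = K * (1 - s) * D_gb       # per-epoch shuffle traffic, uncoded
coded_gb = uncoded_gb / (1 + K * s)   # multicast gain of 1 + Ks = 13.8

saved_s = (uncoded_gb - coded_gb) * 8 / link_gbps  # GB -> Gb -> seconds
print(f"saved per epoch: {saved_s:.0f} s; "
      f"over {epochs} epochs: {saved_s * epochs / 3600:.1f} h")
```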

ex-cc-ch15-09

Medium

Gradient coding + shuffling. Combine both techniques in a training loop. Do the savings compound?

ex-cc-ch15-10

Medium

Cross-DC training. For federated training across 10 data centers, each with a 10 Gbps interconnect, over a 1 PB dataset: compare the uncoded and coded shuffling times.
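A hedged sketch for this comparison. The exercise does not specify the storage fraction; $s = 0.1$ below is an assumed value for illustration, and the per-DC link is taken as the bottleneck.

```python
# Assumed: s = 0.1 (not given in the exercise), 1 PB = 1e15 bytes, and
# each data center's 10 Gb/s link is the bottleneck for its own fetch.
K, s = 10, 0.1
D_bytes = 1e15        # 1 PB
link_bps = 10e9       # 10 Gb/s per data center

per_dc_uncoded = (1 - s) * D_bytes            # each DC fetches its new shard
per_dc_coded = per_dc_uncoded / (1 + K * s)   # multicast gain 1 + Ks = 2

def hours(nbytes):
    return nbytes * 8 / link_bps / 3600

print(f"uncoded: {hours(per_dc_uncoded):.0f} h, "
      f"coded: {hours(per_dc_coded):.0f} h")
# -> uncoded: 200 h, coded: 100 h
```

With only $K = 10$ data centers at $s = 0.1$, the gain $1 + Ks = 2$ is modest; the sketch makes it easy to see how the assumed $s$ drives the result.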

ex-cc-ch15-11

Hard

Heterogeneous worker storage. Extend coded shuffling to workers with different storage fractions $s_k$. Derive the achievable rate.

ex-cc-ch15-12

Hard

Dynamic assignments. Over $T$ epochs, worker assignments evolve. Under what assumption is the per-epoch rate $K(1-s)D/(1+Ks)$ achievable uniformly?

ex-cc-ch15-13

Challenge

Approximate coded shuffling for federated learning. Design a coded shuffling scheme for federated learning where each worker has local private data. Combine with differential privacy.

ex-cc-ch15-14

Challenge

Coded all-reduce. PyTorch DDP uses all-reduce for gradient aggregation. Can coded techniques speed up all-reduce?

ex-cc-ch15-15

Challenge

Industrial adoption roadmap. What are the main barriers to production adoption of coded shuffling and coded computing? Propose a roadmap.