References & Further Reading

References

  1. H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, Communication-Efficient Learning of Deep Networks from Decentralized Data, 2017. [Link]

    The headline paper that introduced federated learning and the FedAvg algorithm; the foundational reference for this chapter. A minimal sketch of one FedAvg round appears after this list.

  2. X. Li, K. Huang, W. Yang, S. Wang, and Z. Zhang, On the Convergence of FedAvg on Non-IID Data, 2020. [Link]

    Rigorous convergence analysis of FedAvg on both IID and non-IID data. Main reference for §9.2's theorems.

  3. P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, and others, Advances and Open Problems in Federated Learning, 2021.

    The definitive survey of federated learning. Essential background reading for Parts III–V of this book.

  4. K. Bonawitz, H. Eichner, W. Grieskamp, and others, Towards Federated Learning at Scale: System Design, 2019. [Link]

    Google's engineering perspective on production federated learning. Describes the Gboard deployment and the constraints that shape algorithmic choices.

  5. J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, Federated Learning: Strategies for Improving Communication Efficiency, 2016. [Link]

    Pre-FedAvg paper introducing quantization, structured updates, and random masking for communication-efficient FL. Primary reference for §9.3.

  6. D. Alistarh, D. Grubic, J. Li, R. Tomioka, and M. Vojnovic, QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding, 2017. [Link]

    Rigorous convergence analysis of stochastic gradient quantization. Basis for §9.3's quantization theorem; a sketch of the unbiased quantizer appears after this list.

  7. S. U. Stich, J.-B. Cordonnier, and M. Jaggi, Sparsified SGD with Memory, 2018. [Link]

    Top-$K$ sparsification with error feedback, with a proof that the compressed method retains SGD's convergence rate. Main reference for §9.3's sparsification treatment; a sketch of the error-feedback loop appears after this list.

  8. L. Zhu, Z. Liu, and S. Han, Deep Leakage from Gradients, 2019. [Link]

    The landmark gradient-inversion paper. Should be read in full before trusting any "FL is private by design" claim. Primary reference for §9.4; a sketch of the attack's core optimization loop appears after this list.

  9. H. Yin, A. Mallya, A. Vahdat, J. M. Alvarez, J. Kautz, and P. Molchanov, See through Gradients: Image Batch Recovery via GradInversion, 2021. [Link]

    Extends gradient inversion to batch sizes up to 48 on ImageNet, showing that gradient averaging over a batch does not by itself protect privacy.

  10. K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, Practical Secure Aggregation for Privacy-Preserving Machine Learning, 2017.

    Forward reference: the secure-aggregation protocol that Chapter 10 develops. Reading this paper early shapes how one thinks about FL privacy.

  11. T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith, Federated Optimization in Heterogeneous Networks, 2020. [Link]

    Introduces FedProx, a generalization of FedAvg that handles non-IID data and variable local work more robustly. Addresses the client-drift issue of §9.2; a sketch of the proximal local step appears after this list.

  12. S. P. Karimireddy, S. Kale, M. Mohri, S. J. Reddi, S. U. Stich, and A. T. Suresh, SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, 2020. [Link]

    Variance-reduced FL algorithm that corrects client drift with control variates. Among the strongest known convergence guarantees on non-IID data; a sketch of the control-variate step appears after this list.

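To make reference [1] concrete, here is a minimal sketch of FedAvg on synthetic linear-regression clients. The model, the non-IID client data, all constants, and the helper name `local_sgd` are illustrative assumptions, not the paper's setup.

```python
# Minimal FedAvg sketch: data-size-weighted averaging of locally trained
# models. The synthetic non-IID data and all constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.05, epochs=5):
    """A client's local training: a few epochs of full-batch gradient
    descent on its own data, starting from the current global model."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        w -= lr * grad
    return w

# Synthetic non-IID clients: each draws features from a shifted distribution.
d, n_clients = 5, 4
w_true = rng.normal(size=d)
data = []
for k in range(n_clients):
    X = rng.normal(loc=0.3 * k, size=(20, d))
    data.append((X, X @ w_true + 0.01 * rng.normal(size=20)))

w_global = np.zeros(d)
for rnd in range(50):                      # communication rounds
    updates, sizes = [], []
    for X, y in data:                      # full participation for simplicity
        updates.append(local_sgd(w_global, X, y))
        sizes.append(len(y))
    # The FedAvg step: average client models, weighted by local data size.
    weights = np.array(sizes) / sum(sizes)
    w_global = sum(p * u for p, u in zip(weights, updates))

print("distance to w_true:", np.linalg.norm(w_global - w_true))
```

Under heterogeneity the fixed point of this loop drifts slightly away from the joint least-squares solution; that is the client-drift effect that references [11] and [12] target.
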
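For reference [6], a sketch of the unbiased stochastic quantizer at QSGD's core, with the paper's variable-length encoding omitted; the function name and the s = 4 level count are assumptions for illustration.

```python
# Sketch of QSGD-style quantization: each coordinate is stochastically
# rounded to one of s magnitude levels so the quantized vector is unbiased.
import numpy as np

def qsgd_quantize(v, s=4, rng=None):
    """Quantize v to s magnitude levels; E[output] equals v exactly."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v
    scaled = np.abs(v) / norm * s          # each coordinate lands in [0, s]
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part (unbiasedness).
    level = lower + (rng.random(v.shape) < scaled - lower)
    return np.sign(v) * norm * level / s

# Averaging many independent quantizations closely matches the original.
g = np.random.default_rng(1).normal(size=6)
trials = [qsgd_quantize(g, rng=np.random.default_rng(i)) for i in range(10_000)]
print("original:         ", np.round(g, 3))
print("mean quantization:", np.round(np.mean(trials, axis=0), 3))
```
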
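For reference [7], a sketch of top-$K$ sparsification with the error-feedback memory; the quadratic objective and constants are illustrative assumptions.

```python
# Sketch of top-K sparsification with error feedback: only k coordinates of
# each update are transmitted, and the dropped residual is remembered and
# added back on later steps, which is what preserves convergence.
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zeroing the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

d, k, lr = 100, 5, 0.02
w = np.random.default_rng(0).normal(size=d)
memory = np.zeros(d)                  # accumulated compression error

for step in range(2000):
    grad = w                          # gradient of f(w) = ||w||^2 / 2
    corrected = lr * grad + memory    # re-inject previously dropped mass
    sent = top_k(corrected, k)        # only these k coordinates are sent
    memory = corrected - sent         # remember what was dropped
    w -= sent

print("final ||w||:", np.linalg.norm(w))  # driven toward 0 despite 95% sparsity
```
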
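For reference [8], a minimal sketch of the dummy-data optimization at the heart of gradient inversion, in PyTorch on a deliberately tiny linear model. Every constant here is an illustrative assumption; real attacks add image priors and far more careful optimization.

```python
# Sketch of gradient inversion: given one shared gradient, optimize a dummy
# input (and soft label) until its gradient matches the observed one.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(8, 3)
loss_fn = torch.nn.CrossEntropyLoss()

# The "victim" computes a gradient on a secret example and shares it.
x_secret = torch.randn(1, 8)
y_secret = torch.tensor([2])
true_grad = torch.autograd.grad(loss_fn(model(x_secret), y_secret),
                                model.parameters())

# The attacker tunes a dummy example so its gradient matches the shared one.
x_dummy = torch.randn(1, 8, requires_grad=True)
y_dummy = torch.randn(1, 3, requires_grad=True)    # soft-label logits
opt = torch.optim.Adam([x_dummy, y_dummy], lr=0.1)

for step in range(300):
    opt.zero_grad()
    dummy_loss = loss_fn(model(x_dummy), y_dummy.softmax(dim=-1))
    dummy_grad = torch.autograd.grad(dummy_loss, model.parameters(),
                                     create_graph=True)
    mismatch = sum(((dg - tg) ** 2).sum()
                   for dg, tg in zip(dummy_grad, true_grad))
    mismatch.backward()
    opt.step()

# If the attack succeeds, this error shrinks far below the initial distance.
print("reconstruction error:", (x_dummy - x_secret).norm().item())
```
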
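For reference [11], the essence of FedProx is one change to the local objective: a proximal term $\frac{\mu}{2}\|w - w^t\|^2$ that anchors local training to the global model. A sketch, reusing the linear-regression setup from the FedAvg example (again with illustrative constants):

```python
# Sketch of a FedProx local step: identical to FedAvg's local training except
# the gradient includes mu * (w - w_global), which limits client drift.
import numpy as np

def local_prox_step(w_global, X, y, mu=0.1, lr=0.05, epochs=5):
    """Local training on F_k(w) + (mu/2)||w - w_global||^2."""
    w = w_global.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y) + mu * (w - w_global)
        w -= lr * grad
    return w
```

Swapping this in for the plain local step in a FedAvg loop gives FedProx; larger μ trades local progress for less drift, and μ = 0 recovers FedAvg.
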
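For reference [12], a sketch of SCAFFOLD's control-variate correction, again on the linear-regression setup; the names and setup are illustrative, and the server-side maintenance and averaging of control variates is omitted here.

```python
# Sketch of a SCAFFOLD-style local step: the client gradient is corrected by
# (c_global - c_local) so local training points toward the global descent
# direction, reducing client drift.
import numpy as np

def local_scaffold(w_global, c_global, c_local, X, y, lr=0.05, epochs=5):
    """Local training with control-variate correction; returns the new model
    and the updated local control variate (the paper's Option II)."""
    w = w_global.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * (grad - c_local + c_global)
    # Updated local control variate: the average drift direction this round.
    c_new = c_local - c_global + (w_global - w) / (lr * epochs)
    return w, c_new
```
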
Further Reading

Resources for going deeper into federated learning and its challenges.

  • Comprehensive FL survey

    Kairouz, McMahan, et al., *Advances and Open Problems in Federated Learning*, FnT-ML 2021

    The definitive survey; essential reading before Parts III–V of this book. Covers convergence, privacy, fairness, personalization, and heterogeneity at book-length depth.

  • Gradient-inversion attacks: state of the art

    Geiping et al., *Inverting Gradients — How Easy Is It to Break Privacy in Federated Learning?*, NeurIPS 2020

    Extends gradient inversion to trained models and ImageNet-scale inputs, cementing the case that plaintext FL provides no meaningful privacy guarantee on its own.

  • Non-IID federated learning

    Zhao et al., *Federated Learning with Non-IID Data*, arXiv:1806.00582, 2018

    Early empirical study demonstrating FedAvg degradation on non-IID splits. Relevant context for the §9.2 non-IID convergence theorem.

  • System-level FL engineering

    Bonawitz et al., *Towards Federated Learning at Scale: System Design*, MLSys 2019

    Google's production-engineering view. Complements this chapter's theoretical treatment with concrete engineering constraints.