Prerequisites & Notation
Before You Begin
This capstone chapter synthesizes material from across the book and points toward active research frontiers. Brush up on the following before starting.
- Convex optimization and duality (used throughout, especially in online learning) (Review ch01)
Self-check: Can you state the KKT conditions and explain why convex problems have no duality gap under Slater's condition?
- Bayesian MMSE, LMMSE, and posterior computation (Review ch16)
Self-check: Can you write the LMMSE estimator from memory and state when it is optimal?
- Kalman filtering: state-space model and recursive update (Review ch23)
Self-check: Can you write the Kalman gain and state the innovation equation?
- Cramér–Rao bound, Bayesian CRB, and the threshold effect (Review ch24)
Self-check: Can you explain why the CRB is loose in the low-SNR regime?
- Eigenvalues of graph Laplacians, spectral graph theory basics
Self-check: Do you know why the second-smallest Laplacian eigenvalue (algebraic connectivity) controls mixing rates?
- PAC learning and concentration inequalities (Hoeffding, Azuma)
Self-check: Can you state Hoeffding's inequality and explain what a sub-Gaussian tail is?
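As a warm-up for the last self-check, here is a minimal simulation, an illustrative sketch not taken from the book (the sample size, deviation threshold, and trial count are arbitrary choices), comparing the empirical tail probability of a Bernoulli sample mean against Hoeffding's bound $2e^{-2nt^2}$:

```python
import math
import random

# Hoeffding: for i.i.d. X_i in [0, 1] with mean mu,
#   P(|sample mean - mu| >= t) <= 2 * exp(-2 * n * t**2).
# We estimate the left-hand side by Monte Carlo for Bernoulli(1/2) samples.
random.seed(0)

n, t, trials = 200, 0.1, 2000   # illustrative parameters
mu = 0.5
deviations = 0
for _ in range(trials):
    mean = sum(random.random() < mu for _ in range(n)) / n
    deviations += abs(mean - mu) >= t

empirical = deviations / trials
bound = 2 * math.exp(-2 * n * t ** 2)
print(f"empirical tail prob: {empirical:.4f}, Hoeffding bound: {bound:.4f}")
```

The empirical tail probability sits well below the bound, as expected: Hoeffding only uses boundedness of the samples, so it is loose for well-behaved distributions.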
Notation for This Chapter
Symbols introduced in this chapter. The notation bridges statistics, online learning, and distributed computation.
| Symbol | Meaning | Introduced |
|---|---|---|
| $R_T$ | Cumulative regret over horizon $T$: difference between the learner's loss and the best fixed expert's loss | s02 |
| $\ell_t(x_t)$ | Convex loss at round $t$ evaluated at iterate $x_t$ | s02 |
| $\eta$ | Learning rate (step size) in online gradient descent / MWU | s02 |
| $w_{t,i}$ | Weight on expert $i$ at round $t$ in MWU | s02 |
| $N$ | Number of experts / arms (online learning) or nodes (graphs) | s02 |
| $K$ | Number of bandit arms | s02 |
| $W$ | Doubly stochastic gossip/consensus matrix | s03 |
| $\lambda_2$ | Second-largest eigenvalue of $W$ in magnitude; governs consensus rate | s03 |
| $G = (V, E)$ | Undirected graph with node set $V$ and edge set $E$ | s03 |
| $L$ | Graph Laplacian | s03 |
| $n_{\mathrm{stat}}, n_{\mathrm{comp}}$ | Statistical and computational thresholds in a high-dimensional estimation problem | s01 |
| $n, p, s$ | Sample size, ambient dimension, sparsity level | s01 |
| $x_i^{(k)}$ | Estimate at node $i$, iteration $k$ | s03 |
| $\varepsilon$ | Differential privacy parameter (small $\varepsilon$ = strong privacy) | s03 |
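To see how the consensus matrix $W$ and its second-largest eigenvalue $\lambda_2$ interact, the following sketch runs synchronous gossip iterations on a small ring graph. It assumes NumPy is available; the 5-node ring and the specific weights are illustrative choices, not taken from the chapter.

```python
import numpy as np

# Doubly stochastic consensus matrix on a 5-node ring: each node keeps
# half its own value and averages a quarter from each neighbor. The update
# x^(k+1) = W x^(k) preserves the average, and the disagreement
# ||x^(k) - avg|| shrinks geometrically at rate lambda_2 (the
# second-largest eigenvalue of W in magnitude).
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = 0.25
    W[i, (i - 1) % n] = 0.25

# W is symmetric, so eigvalsh applies; sort magnitudes descending.
eigs = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
lam2 = eigs[1]

x = np.arange(n, dtype=float)      # initial node estimates
avg = x.mean()                     # preserved by every gossip round
for k in range(50):
    x = W @ x                      # one synchronous gossip round

print(f"lambda_2 = {lam2:.4f}")
print(f"final disagreement = {np.linalg.norm(x - avg):.2e}")
```

After 50 rounds the disagreement has shrunk by roughly $\lambda_2^{50}$, which is why a smaller $\lambda_2$ (better-connected graph) means faster consensus.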