References & Further Reading
References
- S. M. Ross, A First Course in Probability, Pearson, 9th ed., 2014
Chapter 8 covers probability inequalities with many worked examples. Clear exposition of Markov, Chebyshev, and one-sided bounds.
- G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes, Oxford University Press, 4th ed., 2020
Section 5.6 treats inequalities including Cantelli's one-sided Chebyshev. Chapter 7 covers generating functions and the Chernoff approach.
- G. Caire, Foundations of Stochastic Processes: Lecture Notes, TU Berlin, 2024
Chapter 4, slides 150--161. The primary source for this chapter's treatment of Markov, Chebyshev, Chernoff, Hoeffding, and Bernstein inequalities.
- T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, 2nd ed., 2006
Section 2.6 on Jensen's inequality and its information-theoretic consequences. Chapter 11 on the method of types uses Chernoff-style bounds.
- H. Chernoff, A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations, Annals of Mathematical Statistics, vol. 23, 1952
The original paper introducing the exponential bound and the concept of the Chernoff exponent.
- W. Hoeffding, Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association, vol. 58, 1963
The foundational paper for concentration of sums of bounded independent random variables.
- S. Boucheron, G. Lugosi, and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press, 2013
The modern reference for concentration inequalities. Chapters 2--3 cover sub-Gaussian variables, Hoeffding, Bernstein, and beyond.
- A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, Springer, 2nd ed., 1998
Chapter 2 develops Cramér's theorem, connecting the Chernoff exponent to the exact large deviations rate function.
- M. Mitzenmacher and E. Upfal, Probability and Computing: Randomized Algorithms and Probabilistic Analysis, Cambridge University Press, 2nd ed., 2017
Chapter 4 gives an excellent treatment of Chernoff bounds for Poisson and binomial tails, oriented toward computer science.
- R. G. Gallager, Information Theory and Reliable Communication, Wiley, 1968
Chapter 5 develops the random coding exponent using Chernoff-type bounding.
- D. Tse and P. Viswanath, Fundamentals of Wireless Communication, Cambridge University Press, 2005
Section 5.4 discusses the capacity of fading channels and the Jensen gap between ergodic capacity and AWGN capacity.
- S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004
Chapter 3 provides the background on convex functions and Jensen's inequality used in Section 10.5.
- H. Fischer, A History of the Central Limit Theorem: From Classical to Modern Probability Theory, Springer, 2011
Historical context for Markov's and Chebyshev's contributions to the development of probability theory.
- J. L. W. V. Jensen, Sur les fonctions convexes et les inégalités entre les valeurs moyennes, Acta Mathematica, vol. 30, 1906
Jensen's original paper ("On convex functions and inequalities between mean values") establishing the relationship between convex functions and means.
- G. Caire and S. Shamai (Shitz), On the capacity of some channels with channel state information, IEEE Transactions on Information Theory, vol. 45, 1999
Analyzes capacity of fading channels with various CSI assumptions. Jensen's inequality is central to the analysis.
Further Reading
The following sources are recommended for deeper study of concentration inequalities and their applications.
Modern concentration inequalities
S. Boucheron, G. Lugosi, and P. Massart, *Concentration Inequalities*, Oxford University Press, 2013.
The definitive modern reference, covering entropy methods, transportation inequalities, and the Talagrand concentration phenomenon.
Large deviations theory
A. Dembo and O. Zeitouni, *Large Deviations Techniques and Applications*, 2nd ed., Springer, 1998.
Places the Chernoff bound within the rigorous framework of Cramér's theorem and Sanov's theorem.
Matrix concentration inequalities
J. A. Tropp, *An Introduction to Matrix Concentration Inequalities*, Foundations and Trends in Machine Learning, 2015.
Extends the scalar Chernoff/Hoeffding/Bernstein bounds to sums of random matrices; essential for high-dimensional statistics.
Chernoff bounds in coding theory
R. G. Gallager, *Information Theory and Reliable Communication*, Wiley, 1968, Chapter 5.
Shows how the Chernoff bounding technique yields the random coding error exponent.
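The relative strength of the bounds discussed in this chapter is easy to see numerically. The sketch below (not from any of the sources above; the parameters n, p, a are chosen arbitrarily for illustration) evaluates the Markov, Chebyshev, and Chernoff bounds on the upper tail of a Binomial(n, p) variable and compares them with the exact tail probability.

```python
import math

# Illustrative comparison of tail bounds for X ~ Binomial(n, p).
# We bound P(X >= a) for a above the mean. Parameters are arbitrary.
n, p, a = 100, 0.5, 70
mean = n * p                # E[X] = np
var = n * p * (1 - p)       # Var[X] = np(1-p)

# Exact tail probability from the binomial pmf.
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a, n + 1))

# Markov's inequality (X >= 0): P(X >= a) <= E[X] / a.
markov = mean / a

# Chebyshev's inequality: P(X >= a) <= Var[X] / (a - E[X])^2 for a > E[X].
chebyshev = var / (a - mean) ** 2

# Chernoff bound: P(X >= a) <= exp(-n * D(q || p)) with q = a/n > p,
# where D is the binary Kullback-Leibler divergence.
q = a / n
kl = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
chernoff = math.exp(-n * kl)

print(f"exact     = {exact:.3e}")
print(f"markov    = {markov:.3e}")
print(f"chebyshev = {chebyshev:.3e}")
print(f"chernoff  = {chernoff:.3e}")
```

With these parameters the bounds tighten in the order Markov > Chebyshev > Chernoff > exact: the polynomial bounds are off by orders of magnitude, while the exponential Chernoff bound tracks the true tail much more closely, which is the point made by Chernoff (1952) and developed further in Dembo and Zeitouni.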