References & Further Reading

References

  1. A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, Springer, 2nd ed., 1998

    The standard graduate reference on large deviations theory. Chapters 1--2 cover everything in Sections 20.1--20.2.

  2. T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, 2nd ed., 2006

    Chapter 11 (method of types, Sanov's theorem, Stein's lemma) is the most accessible treatment for engineers.

  3. R. Vershynin, High-Dimensional Probability: An Introduction with Applications to Data Science, Cambridge University Press, 2018

    Chapters 2--4 cover sub-Gaussian/sub-exponential theory and matrix concentration.

  4. S. Boucheron, G. Lugosi, and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press, 2013

    Definitive modern reference on concentration inequalities. Chapters 2--3 cover Hoeffding and Bernstein.

  5. J. A. Tropp, An Introduction to Matrix Concentration Inequalities, Foundations and Trends in Machine Learning, 2015

    Survey paper on matrix Bernstein and related inequalities.

  6. W. Hoeffding, "Probability Inequalities for Sums of Bounded Random Variables," Journal of the American Statistical Association, vol. 58, no. 301, pp. 13--30, 1963

    The original paper establishing Hoeffding's inequality.

  7. H. Cramér, "Sur un nouveau théorème-limite de la théorie des probabilités," 1938

    Cramér's original paper (in French; the title translates as "On a new limit theorem of probability theory") on large deviations for sums of i.i.d. random variables.

  8. I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Cambridge University Press, 2nd ed., 2011

    The definitive treatment of the method of types and its applications to coding theory.

  9. J. A. Bucklew, Introduction to Rare Event Simulation, Springer, 2004

    Connects large deviations to importance sampling and rare-event Monte Carlo.

  10. J. Östman, G. Durisi, and G. Caire, "Finite-Blocklength Bounds on the Maximum Coding Rate of Rician Fading Channels with Applications to Pilot-Assisted Transmission," 2021

    Applies concentration inequalities to derive finite-blocklength bounds for MIMO channels.

  11. G. Caire, Fundamentals of Stochastic Processes: Lecture Notes, TU Berlin, 2024

    Course material; Ch. 5 covers the law of large numbers, the central limit theorem, and large deviations.

  12. J. A. Gubner, Probability and Random Processes for Electrical and Computer Engineers, Cambridge University Press, 2006

    Ch. 8 covers moment generating functions (MGFs) and the Chernoff bound, which are prerequisites for this chapter.

Further Reading

These resources extend the large deviations and concentration machinery to more advanced settings encountered in research.

  • Large deviations for Markov chains and dependent sequences

    Dembo and Zeitouni (1998), Ch. 3

    The Gärtner-Ellis theorem extends Cramér's theorem to non-i.i.d. settings. Essential for analyzing queueing systems and Markov chain Monte Carlo.
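
    As a rough orientation (paraphrased, not quoted from the text): if the scaled cumulant generating function

    $$
    \Lambda(\lambda) \;=\; \lim_{n\to\infty} \frac{1}{n}\log \mathbb{E}\!\left[e^{n\lambda Z_n}\right]
    $$

    exists and is essentially smooth, then $Z_n$ satisfies a large deviation principle with rate function given by the Legendre-Fenchel transform $\Lambda^*(x) = \sup_{\lambda}\,[\lambda x - \Lambda(\lambda)]$, exactly as in Cramér's theorem but without requiring independence.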

  • Sanov's theorem for continuous distributions

    Dembo and Zeitouni (1998), Section 6.2

    The finite-alphabet treatment in this chapter is a preview. The full theory requires weak convergence and a topology on the space of probability measures.
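
    The general statement has the same shape as the finite-alphabet one: for the empirical measure $\hat{P}_n$ of $n$ i.i.d. samples from $P$ and a suitably regular set of distributions $\Gamma$,

    $$
    \frac{1}{n}\log \Pr\!\left(\hat{P}_n \in \Gamma\right) \;\longrightarrow\; -\inf_{Q \in \Gamma} D(Q \,\|\, P),
    $$

    with the infimum of the KL divergence now taken over a set in the space of probability measures rather than over the finite-dimensional probability simplex.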

  • Talagrand's concentration inequality

    Boucheron, Lugosi, and Massart (2013), Ch. 6

    Goes beyond sub-Gaussian theory to concentration of functions of independent variables that are not sums, such as suprema of empirical processes.
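
    The simplest instance of this phenomenon is the bounded differences (McDiarmid) inequality, stated here as a pointer rather than a substitute for Talagrand's result: if changing the $i$-th argument of $f(X_1,\dots,X_n)$ changes its value by at most $c_i$, then

    $$
    \Pr\!\left(\,|f - \mathbb{E}f| \ge t\,\right) \;\le\; 2\exp\!\left(-\frac{2t^2}{\sum_{i=1}^{n} c_i^2}\right),
    $$

    so sub-Gaussian concentration survives even when $f$ is not a sum.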

  • Error exponents for channel coding

    Gallager, *Information Theory and Reliable Communication* (1968), Ch. 5

    Develops the reliability function $E(R)$ for DMCs, showing how the error exponent varies with rate. The sphere-packing and random coding exponents are the pillars of finite-blocklength analysis.
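
    As a pointer to the notation used there: Gallager's random coding bound states that for a DMC $P(y|x)$ there exist codes of rate $R$ with error probability $P_e \le e^{-n E_r(R)}$, where

    $$
    E_r(R) \;=\; \max_{0\le\rho\le 1}\,\max_{Q}\,\bigl[E_0(\rho, Q) - \rho R\bigr],
    \qquad
    E_0(\rho, Q) \;=\; -\log \sum_{y}\Bigl(\sum_{x} Q(x)\,P(y|x)^{1/(1+\rho)}\Bigr)^{1+\rho}.
    $$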

  • Matrix concentration for covariance estimation

    Tropp (2015), Section 6.3

    Directly applicable to pilot design in massive MIMO: determines how many pilots are needed to estimate the spatial covariance to a given accuracy.
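
    To see the scaling concretely, here is a minimal numerical sketch (not from Tropp's text; the exponential correlation model and all parameter values are illustrative assumptions). It estimates a $d \times d$ spatial covariance from $n$ Gaussian samples and compares the spectral-norm error against the $\sqrt{d/n}$ rate that matrix Bernstein-type bounds predict:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    d = 32                                              # number of antennas (illustrative)
    idx = np.arange(d)
    Sigma = 0.9 ** np.abs(idx[:, None] - idx[None, :])  # assumed exponential correlation model

    for n in [64, 256, 1024, 4096]:                     # number of pilot snapshots
        X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
        Sigma_hat = X.T @ X / n                         # empirical covariance (zero-mean samples)
        err = np.linalg.norm(Sigma_hat - Sigma, 2)      # spectral-norm estimation error
        print(f"n={n:5d}  error={err:.3f}  sqrt(d/n)={np.sqrt(d / n):.3f}")
    ```

    Quadrupling $n$ roughly halves the spectral-norm error, matching the predicted rate.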