References & Further Reading
References
- T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, 2nd ed., 2006
Chapter 8 covers differential entropy and Chapter 17 covers the EPI. The treatment is clear and complete, with many worked examples.
- C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 1948
Shannon's original treatment of continuous entropy and the EPI. The EPI proof was incomplete; see Stam (1959) for the first rigorous proof.
- A. J. Stam, Some Inequalities Satisfied by the Quantities of Information of Fisher and Shannon, Information and Control, 1959
First rigorous proof of the EPI using Fisher information. Establishes the de Bruijn identity connecting Fisher information to entropy.
- M. H. M. Costa, A New Entropy Power Inequality, IEEE Transactions on Information Theory, 1985
Proves concavity of entropy power along a Gaussian channel: $N(X + \sqrt{t}Z)$ is concave in $t$, where $Z$ is standard Gaussian and independent of $X$. This strengthening of the EPI is key for broadcast channel converses.
- S. Verdú and D. Guo, A Simple Proof of the Entropy-Power Inequality, IEEE Transactions on Information Theory, 2006
An elegant proof of the EPI via the I-MMSE relationship, connecting the EPI to estimation theory.
- R. G. Gallager, Information Theory and Reliable Communication, Wiley, 1968
Chapter 7 covers the Gaussian channel with excellent engineering intuition. The waterfilling solution is derived cleanly via Lagrange multipliers.
- D. Guo, S. Shamai, and S. Verdú, Mutual Information and Minimum Mean-Square Error in Gaussian Channels, IEEE Transactions on Information Theory, 2005
Establishes the fundamental I-MMSE relationship connecting mutual information to minimum mean-square error estimation.
- A. El Gamal and Y.-H. Kim, Network Information Theory, Cambridge University Press, 2011
Chapter 3 covers the Gaussian channel capacity and EPI in the context of network information theory.
- I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Cambridge University Press, 2nd ed., 2011
Provides a rigorous combinatorial (method-of-types) approach to coding theorems. The continuous extension in the appendix complements this chapter.
- Y. Polyanskiy, H. V. Poor, and S. Verdú, Channel Coding Rate in the Finite Blocklength Regime, IEEE Transactions on Information Theory, 2010
Establishes the normal approximation for the maximal coding rate at finite blocklength, showing that the backoff from capacity scales as $\sqrt{V/n}\,Q^{-1}(\epsilon)$, where $V$ is the channel dispersion and $\epsilon$ the target error probability.
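To make the normal approximation concrete, the following is a minimal numerical sketch (an illustration, not code from the paper) for the AWGN channel. It assumes the capacity $C = \frac{1}{2}\log_2(1+\mathrm{snr})$ and the AWGN dispersion in the form $V = \frac{\mathrm{snr}(\mathrm{snr}+2)}{2(\mathrm{snr}+1)^2}\log_2^2 e$ bits², and omits the $O(\log n / n)$ correction term.

```python
from math import e, log2, sqrt
from statistics import NormalDist

def awgn_normal_approx(snr: float, n: int, eps: float) -> float:
    """Normal-approximation rate (bits/channel use): C - sqrt(V/n) * Q^{-1}(eps)."""
    C = 0.5 * log2(1 + snr)                                      # AWGN capacity
    V = snr * (snr + 2) / (2 * (snr + 1) ** 2) * log2(e) ** 2    # dispersion (assumed form, bits^2)
    q_inv = NormalDist().inv_cdf(1 - eps)                        # Q^{-1}(eps) = Phi^{-1}(1 - eps)
    return C - sqrt(V / n) * q_inv

# The backoff from capacity (0.5 bits/use at snr = 1) shrinks like 1/sqrt(n):
for n in (100, 1_000, 10_000):
    print(n, round(awgn_normal_approx(snr=1.0, n=n, eps=1e-3), 4))
```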
Further Reading
Resources for deeper exploration of continuous information measures.
Connections between information and estimation
D. Guo, S. Shamai, and S. Verdú, 'Mutual Information and Minimum Mean-Square Error in Gaussian Channels,' IEEE Trans. IT, 2005
The I-MMSE relationship $\frac{d}{d\,\mathrm{snr}}\, I(X; \sqrt{\mathrm{snr}}\,X + Z) = \frac{1}{2}\,\mathrm{mmse}(\mathrm{snr})$ elegantly connects information theory to estimation. Used extensively in Chapters 10-12.
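As a quick sanity check of this identity, here is a small sketch (not code from the paper) for a standard Gaussian input, where both sides have closed forms: $I(\mathrm{snr}) = \frac{1}{2}\ln(1+\mathrm{snr})$ nats and $\mathrm{mmse}(\mathrm{snr}) = 1/(1+\mathrm{snr})$.

```python
from math import log

def mutual_info(snr: float) -> float:
    """I(X; sqrt(snr) X + Z) in nats, for standard Gaussian X and Z."""
    return 0.5 * log(1 + snr)

def mmse(snr: float) -> float:
    """MMSE of estimating X from sqrt(snr) X + Z, standard Gaussian X and Z."""
    return 1.0 / (1 + snr)

snr, h = 2.0, 1e-6
dI_dsnr = (mutual_info(snr + h) - mutual_info(snr - h)) / (2 * h)  # central difference
print(dI_dsnr, 0.5 * mmse(snr))  # both are approximately 1/6, as the identity predicts
```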
Entropy power inequality extensions
O. Rioul, 'Information Theoretic Proofs of Entropy Power Inequalities,' IEEE Trans. IT, 2011
A comprehensive survey of EPI proofs and extensions, including matrix versions and connections to convex geometry; a natural next step for readers who want to go deeper.
Rényi entropy and one-shot information theory
Book ITA, Ch. 26 — Finite-Blocklength Information Theory
Rényi entropies play a central role in non-asymptotic (finite-blocklength) information theory, where we cannot rely on the law of large numbers.
Practical quantization and rate-distortion
Book ITA, Ch. 6 — Rate-Distortion Theory
The quantization connection in Section 2.5 is the starting point for rate-distortion theory, which characterizes the optimal tradeoff between compression rate and reconstruction quality.
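As a concrete point of reference (a standard result quoted here for orientation, not a summary of the cited chapter), the rate-distortion function of a Gaussian source with variance $\sigma^2$ under squared-error distortion is $R(D) = \frac{1}{2}\log\frac{\sigma^2}{D}$ for $0 < D \le \sigma^2$ and $R(D) = 0$ otherwise, so each additional bit per sample reduces the achievable distortion by a factor of four (about 6 dB).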