Chapter Summary

Key Points

1. The MGF $M_X(t) = \mathbb{E}[e^{tX}]$ encodes all moments. When it exists in a neighborhood of the origin, it determines the distribution uniquely and yields moments via differentiation: $\mathbb{E}[X^k] = M_X^{(k)}(0)$. Its key limitation is that it may not exist for heavy-tailed distributions.
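
   The moment-extraction identity $\mathbb{E}[X^k] = M_X^{(k)}(0)$ can be checked numerically. A minimal sketch (not from the text), using $\mathrm{Exp}(1)$ with the known MGF $M(t) = 1/(1-t)$ for $t < 1$, so that $\mathbb{E}[X] = 1$ and $\mathbb{E}[X^2] = 2$:

   ```python
   from math import comb

   def mgf_exp1(t):
       """MGF of an Exponential(1) random variable (valid for t < 1)."""
       return 1.0 / (1.0 - t)

   def kth_moment(mgf, k, h=1e-3):
       """Approximate M^(k)(0) with a central finite difference."""
       return sum((-1) ** (k - i) * comb(k, i) * mgf((i - k / 2) * h)
                  for i in range(k + 1)) / h ** k

   print(round(kth_moment(mgf_exp1, 1), 4))  # ≈ 1.0  (E[X])
   print(round(kth_moment(mgf_exp1, 2), 4))  # ≈ 2.0  (E[X^2])
   ```

   The finite-difference step `h` trades truncation error against round-off; for smooth MGFs and low-order moments, `h = 1e-3` is comfortably accurate.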

2. The CF $\phi_X(u) = \mathbb{E}[e^{juX}]$ always exists. It is bounded by $1$ in modulus, uniformly continuous, Hermitian, and non-negative definite. It uniquely determines the distribution via the Fourier inversion formula, and the Lévy continuity theorem connects pointwise convergence of CFs to convergence in distribution.
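
   Two of these properties can be verified numerically. A sketch (not from the text) that computes $\phi_X(u) = \mathbb{E}[e^{juX}]$ for the standard normal by simple midpoint quadrature and checks boundedness plus the known closed form $e^{-u^2/2}$:

   ```python
   import cmath
   import math

   def cf_std_normal(u, lo=-10.0, hi=10.0, n=40000):
       """Approximate E[e^{juX}] for X ~ N(0,1) by a midpoint Riemann sum."""
       dx = (hi - lo) / n
       total = 0j
       for i in range(n):
           x = lo + (i + 0.5) * dx
           pdf = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
           total += cmath.exp(1j * u * x) * pdf * dx
       return total

   for u in (0.0, 0.7, 2.0):
       phi = cf_std_normal(u)
       assert abs(phi) <= 1 + 1e-9                     # |phi(u)| <= 1 always
       assert abs(phi - math.exp(-u * u / 2)) < 1e-5   # matches e^{-u^2/2}
   print("CF checks passed")
   ```

   Truncating the integral at $\pm 10$ is harmless here because the Gaussian tails beyond that range are negligible.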

3. The PGF $G_X(s) = \mathbb{E}[s^X]$ is tailored to counting. For non-negative integer-valued random variables, it encodes the PMF as Taylor coefficients, and factorial moments are obtained by evaluating derivatives at $s = 1$. The compounding theorem $G_S(s) = G_N(G_X(s))$ handles random sums elegantly.
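
   The compounding theorem can be seen concretely in Poisson thinning: a $\mathrm{Poisson}(\lambda)$ number of independent $\mathrm{Bernoulli}(p)$ summands has a $\mathrm{Poisson}(\lambda p)$ total, since $G_N(G_X(s)) = e^{\lambda(1 - p + ps - 1)} = e^{\lambda p (s-1)}$. A small sketch (example values are illustrative):

   ```python
   import math

   lam, p = 3.0, 0.25

   def G_N(s):
       """PGF of N ~ Poisson(lam)."""
       return math.exp(lam * (s - 1))

   def G_X(s):
       """PGF of X ~ Bernoulli(p)."""
       return 1 - p + p * s

   def G_S(s):
       """PGF of the random sum S = X_1 + ... + X_N via compounding."""
       return G_N(G_X(s))

   for s in (0.0, 0.3, 0.9, 1.0):
       assert abs(G_S(s) - math.exp(lam * p * (s - 1))) < 1e-12
   print("G_S matches the Poisson(lam*p) PGF")
   ```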

4. All transforms convert convolution into multiplication. If $X \perp Y$, then $\phi_{X+Y} = \phi_X \cdot \phi_Y$ (and similarly for the MGF and PGF). This is the fundamental algebraic property that makes transform methods powerful for analyzing sums of independent random variables.
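
   For discrete distributions this is literal polynomial arithmetic: the PGF of a fair die is a polynomial, and multiplying two such polynomials convolves their coefficient lists, which are exactly the PMFs. A sketch (not from the text) using the classic two-dice example:

   ```python
   def convolve(p, q):
       """Direct convolution of two PMFs given as coefficient lists,
       i.e. the coefficients of the product of their PGF polynomials."""
       out = [0.0] * (len(p) + len(q) - 1)
       for i, a in enumerate(p):
           for j, b in enumerate(q):
               out[i + j] += a * b
       return out

   die = [0.0] + [1 / 6] * 6        # P(X = k) for k = 0..6
   two_dice = convolve(die, die)     # PMF of X + Y, supported on 2..12

   assert abs(two_dice[7] - 6 / 36) < 1e-12   # P(sum = 7) = 6/36
   assert abs(sum(two_dice) - 1.0) < 1e-12    # still a valid PMF
   print("convolution checks passed")
   ```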

5. The LLN and CLT follow from CF convergence. The LLN: $\phi_{S_n/n}(u) \to e^{j\mu u}$. The CLT: $\phi_{(S_n - n\mu)/(\sigma\sqrt{n})}(u) \to e^{-u^2/2}$. Both proofs use a Taylor expansion of the CF together with the Lévy continuity theorem.
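
   The CLT convergence of CFs is easy to watch numerically: the CF of a standardized sum of $n$ i.i.d. $\mathrm{Bernoulli}(p)$ variables approaches $e^{-u^2/2}$ pointwise. A sketch (parameter choices are illustrative):

   ```python
   import cmath
   import math

   def cf_standardized_bernoulli_sum(u, n, p=0.3):
       """CF of (S_n - n*mu) / (sigma*sqrt(n)) for S_n a sum of n
       iid Bernoulli(p) variables, computed exactly from phi_X."""
       mu, sigma = p, math.sqrt(p * (1 - p))
       t = u / (sigma * math.sqrt(n))
       phi_one = (1 - p) + p * cmath.exp(1j * t)        # CF of one summand
       return phi_one ** n * cmath.exp(-1j * n * mu * t)  # center by n*mu

   u = 1.2
   errs = [abs(cf_standardized_bernoulli_sum(u, n) - math.exp(-u * u / 2))
           for n in (10, 100, 10000)]
   assert errs[0] > errs[1] > errs[2]   # error shrinks as n grows
   assert errs[2] < 1e-2
   print(errs)
   ```

   The error decays like $1/\sqrt{n}$, driven by the third cumulant of the summands, which foreshadows the cumulant view of the CLT below.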

6. Cumulants are additive for independent sums. The CGF $m_X(t) = \log M_X(t)$ generates cumulants. The Gaussian has $\kappa_n = 0$ for $n \geq 3$, so the CLT can be understood as the vanishing of higher cumulants in the normalized sum.
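
   Additivity is immediate from the definition, since CGFs of independent summands add. A numerical sketch (not from the text) using $\mathrm{Poisson}(\lambda)$, all of whose cumulants equal $\lambda$, so the sum of independent $\mathrm{Poisson}(2)$ and $\mathrm{Poisson}(5)$ variables has every cumulant equal to $7$:

   ```python
   import math
   from math import comb

   def cgf_poisson(lam):
       """CGF (log MGF) of Poisson(lam): m(t) = lam * (e^t - 1)."""
       return lambda t: lam * (math.exp(t) - 1)

   def cumulant(cgf, k, h=1e-3):
       """k-th derivative of the CGF at 0 via central differences."""
       return sum((-1) ** (k - i) * comb(k, i) * cgf((i - k / 2) * h)
                  for i in range(k + 1)) / h ** k

   m1, m2 = cgf_poisson(2.0), cgf_poisson(5.0)
   m_sum = lambda t: m1(t) + m2(t)   # CGF of the independent sum: CGFs add

   for k in (1, 2, 3):
       assert abs(cumulant(m_sum, k) - 7.0) < 1e-4   # kappa_k = 2 + 5
   print("cumulants of the sum are additive")
   ```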

7. Cramér's theorem gives the exact exponential decay rate: $\mathbb{P}(S_n > na) \doteq e^{-n m_X^*(a)}$, where $m_X^*$ is the Fenchel-Legendre transform of the CGF. The Chernoff bound provides the upper bound; the tilted-distribution technique gives the matching lower bound.
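
   The rate function $m_X^*(a) = \sup_t\,[ta - m_X(t)]$ can be computed by a simple grid search over $t$. A sketch for $\mathrm{Bernoulli}(1/2)$, where it has the closed form $m^*(a) = a\log a + (1-a)\log(1-a) + \log 2$ on $(0,1)$:

   ```python
   import math

   def cgf_bernoulli_half(t):
       """CGF of Bernoulli(1/2): log E[e^{tX}] = log((1 + e^t)/2)."""
       return math.log((1 + math.exp(t)) / 2)

   def rate_function(a, t_lo=-20.0, t_hi=20.0, steps=40001):
       """Grid-search approximation of sup_t (t*a - m(t))."""
       best = float("-inf")
       for i in range(steps):
           t = t_lo + (t_hi - t_lo) * i / (steps - 1)
           best = max(best, t * a - cgf_bernoulli_half(t))
       return best

   a = 0.8
   closed = a * math.log(a) + (1 - a) * math.log(1 - a) + math.log(2)
   assert abs(rate_function(a) - closed) < 1e-6
   print(rate_function(a))
   ```

   The closed form follows by setting the derivative $a - e^t/(1 + e^t)$ to zero, i.e. the supremum is attained at $t^* = \log\frac{a}{1-a}$.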

8. Branching-process extinction is determined by the mean offspring number $\mu = G'(1)$. The extinction probability $\eta$ is the smallest non-negative solution of $s = G(s)$. If $\mu \leq 1$, then $\eta = 1$ (certain extinction); if $\mu > 1$, then $\eta < 1$ (survival is possible).
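
   The smallest non-negative root of $s = G(s)$ can be found by iterating $s_{k+1} = G(s_k)$ from $s_0 = 0$, since the iterates increase monotonically to the smallest fixed point. A sketch (not from the text) with $\mathrm{Poisson}(\mu)$ offspring, $G(s) = e^{\mu(s-1)}$:

   ```python
   import math

   def extinction_probability(G, tol=1e-12, max_iter=10000):
       """Iterate s -> G(s) from 0; converges to the smallest fixed point."""
       s = 0.0
       for _ in range(max_iter):
           s_next = G(s)
           if abs(s_next - s) < tol:
               return s_next
           s = s_next
       return s

   G_sub = lambda s: math.exp(0.8 * (s - 1))   # subcritical:  mu = 0.8
   G_sup = lambda s: math.exp(2.0 * (s - 1))   # supercritical: mu = 2.0

   eta_sub = extinction_probability(G_sub)
   eta_sup = extinction_probability(G_sup)
   assert abs(eta_sub - 1.0) < 1e-6             # mu <= 1: certain extinction
   assert 0 < eta_sup < 1                        # mu > 1: survival possible
   assert abs(G_sup(eta_sup) - eta_sup) < 1e-9   # eta is a fixed point of G
   print(eta_sup)
   ```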

Looking Ahead

Chapter 10 develops probability inequalities — Markov, Chebyshev, Chernoff, Hoeffding, and Jensen — that provide non-asymptotic bounds complementing the limit theorems of this chapter. Chapter 11 treats convergence modes rigorously (a.s., in probability, in $L^r$, in distribution) and proves the strong law of large numbers. Together, Chapters 9-11 form the analytical toolkit that powers all of information theory and statistical inference.