Chapter Summary

Key Points

1. The MGF $M_X(t) = \mathbb{E}[e^{tX}]$ encodes all moments. When it exists in a neighborhood of the origin, it determines the distribution uniquely and yields moments via differentiation: $\mathbb{E}[X^k] = M_X^{(k)}(0)$. Its key limitation is that it may not exist for heavy-tailed distributions.
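
   The moment-extraction identity $\mathbb{E}[X^k] = M_X^{(k)}(0)$ can be checked numerically. A minimal sketch (not from the text), using $\mathrm{Exp}(1)$ with the known MGF $M(t) = 1/(1-t)$ for $t < 1$, so that $\mathbb{E}[X] = 1$ and $\mathbb{E}[X^2] = 2$:

   ```python
   from math import comb

   def mgf_exp1(t):
       """MGF of an Exponential(1) random variable (valid for t < 1)."""
       return 1.0 / (1.0 - t)

   def kth_moment(mgf, k, h=1e-3):
       """Approximate M^(k)(0) with a central finite difference."""
       return sum((-1) ** (k - i) * comb(k, i) * mgf((i - k / 2) * h)
                  for i in range(k + 1)) / h ** k

   print(round(kth_moment(mgf_exp1, 1), 4))  # ≈ 1.0  (E[X])
   print(round(kth_moment(mgf_exp1, 2), 4))  # ≈ 2.0  (E[X^2])
   ```

   The finite-difference step `h` trades truncation error against round-off; for smooth MGFs and low-order moments, `h = 1e-3` is comfortably accurate.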

2. The CF $\phi_X(u) = \mathbb{E}[e^{juX}]$ always exists. It is bounded by $1$ in modulus, uniformly continuous, Hermitian, and non-negative definite. It uniquely determines the distribution via the Fourier inversion formula, and the Lévy continuity theorem connects pointwise convergence of CFs to convergence in distribution.
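
   Two of these properties can be verified numerically. A sketch (not from the text) that computes $\phi_X(u) = \mathbb{E}[e^{juX}]$ for the standard normal by simple midpoint quadrature and checks boundedness plus the known closed form $e^{-u^2/2}$:

   ```python
   import cmath
   import math

   def cf_std_normal(u, lo=-10.0, hi=10.0, n=40000):
       """Approximate E[e^{juX}] for X ~ N(0,1) by a midpoint Riemann sum."""
       dx = (hi - lo) / n
       total = 0j
       for i in range(n):
           x = lo + (i + 0.5) * dx
           pdf = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
           total += cmath.exp(1j * u * x) * pdf * dx
       return total

   for u in (0.0, 0.7, 2.0):
       phi = cf_std_normal(u)
       assert abs(phi) <= 1 + 1e-9                     # |phi(u)| <= 1 always
       assert abs(phi - math.exp(-u * u / 2)) < 1e-5   # matches e^{-u^2/2}
   print("CF checks passed")
   ```

   Truncating the integral at $\pm 10$ is harmless here because the Gaussian tails beyond that range are negligible.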

3. The PGF $G_X(s) = \mathbb{E}[s^X]$ is tailored to counting. For non-negative integer-valued random variables, it encodes the PMF as Taylor coefficients, and factorial moments are obtained by evaluating derivatives at $s = 1$. The compounding theorem $G_S(s) = G_N(G_X(s))$ handles random sums elegantly.
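
   The compounding theorem can be seen concretely in Poisson thinning: a $\mathrm{Poisson}(\lambda)$ number of independent $\mathrm{Bernoulli}(p)$ summands has a $\mathrm{Poisson}(\lambda p)$ total, since $G_N(G_X(s)) = e^{\lambda(1 - p + ps - 1)} = e^{\lambda p (s-1)}$. A small sketch (example values are illustrative):

   ```python
   import math

   lam, p = 3.0, 0.25

   def G_N(s):
       """PGF of N ~ Poisson(lam)."""
       return math.exp(lam * (s - 1))

   def G_X(s):
       """PGF of X ~ Bernoulli(p)."""
       return 1 - p + p * s

   def G_S(s):
       """PGF of the random sum S = X_1 + ... + X_N via compounding."""
       return G_N(G_X(s))

   for s in (0.0, 0.3, 0.9, 1.0):
       assert abs(G_S(s) - math.exp(lam * p * (s - 1))) < 1e-12
   print("G_S matches the Poisson(lam*p) PGF")
   ```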

4. All transforms convert convolution into multiplication. If $X \perp Y$, then $\phi_{X+Y} = \phi_X \cdot \phi_Y$ (and similarly for the MGF and PGF). This is the fundamental algebraic property that makes transform methods powerful for analyzing sums of independent random variables.
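
   For discrete distributions this is literal polynomial arithmetic: the PGF of a fair die is a polynomial, and multiplying two such polynomials convolves their coefficient lists, which are exactly the PMFs. A sketch (not from the text) using the classic two-dice example:

   ```python
   def convolve(p, q):
       """Direct convolution of two PMFs given as coefficient lists,
       i.e. the coefficients of the product of their PGF polynomials."""
       out = [0.0] * (len(p) + len(q) - 1)
       for i, a in enumerate(p):
           for j, b in enumerate(q):
               out[i + j] += a * b
       return out

   die = [0.0] + [1 / 6] * 6        # P(X = k) for k = 0..6
   two_dice = convolve(die, die)     # PMF of X + Y, supported on 2..12

   assert abs(two_dice[7] - 6 / 36) < 1e-12   # P(sum = 7) = 6/36
   assert abs(sum(two_dice) - 1.0) < 1e-12    # still a valid PMF
   print("convolution checks passed")
   ```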

5. The LLN and CLT follow from CF convergence. The LLN: $\phi_{S_n/n}(u) \to e^{j\mu u}$. The CLT: $\phi_{(S_n - n\mu)/(\sigma\sqrt{n})}(u) \to e^{-u^2/2}$. Both proofs use a Taylor expansion of the CF together with the Lévy continuity theorem.
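
   The CLT convergence of CFs is easy to watch numerically: the CF of a standardized sum of $n$ i.i.d. $\mathrm{Bernoulli}(p)$ variables approaches $e^{-u^2/2}$ pointwise. A sketch (parameter choices are illustrative):

   ```python
   import cmath
   import math

   def cf_standardized_bernoulli_sum(u, n, p=0.3):
       """CF of (S_n - n*mu) / (sigma*sqrt(n)) for S_n a sum of n
       iid Bernoulli(p) variables, computed exactly from phi_X."""
       mu, sigma = p, math.sqrt(p * (1 - p))
       t = u / (sigma * math.sqrt(n))
       phi_one = (1 - p) + p * cmath.exp(1j * t)        # CF of one summand
       return phi_one ** n * cmath.exp(-1j * n * mu * t)  # center by n*mu

   u = 1.2
   errs = [abs(cf_standardized_bernoulli_sum(u, n) - math.exp(-u * u / 2))
           for n in (10, 100, 10000)]
   assert errs[0] > errs[1] > errs[2]   # error shrinks as n grows
   assert errs[2] < 1e-2
   print(errs)
   ```

   The error decays like $1/\sqrt{n}$, driven by the third cumulant of the summands, which foreshadows the cumulant view of the CLT below.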

6. Cumulants are additive for independent sums. The CGF $m_X(t) = \log M_X(t)$ generates cumulants. The Gaussian has $\kappa_n = 0$ for $n \geq 3$, so the CLT can be understood as the vanishing of higher cumulants in the normalized sum.
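
   Additivity is immediate from the definition, since CGFs of independent summands add. A numerical sketch (not from the text) using $\mathrm{Poisson}(\lambda)$, all of whose cumulants equal $\lambda$, so the sum of independent $\mathrm{Poisson}(2)$ and $\mathrm{Poisson}(5)$ variables has every cumulant equal to $7$:

   ```python
   import math
   from math import comb

   def cgf_poisson(lam):
       """CGF (log MGF) of Poisson(lam): m(t) = lam * (e^t - 1)."""
       return lambda t: lam * (math.exp(t) - 1)

   def cumulant(cgf, k, h=1e-3):
       """k-th derivative of the CGF at 0 via central differences."""
       return sum((-1) ** (k - i) * comb(k, i) * cgf((i - k / 2) * h)
                  for i in range(k + 1)) / h ** k

   m1, m2 = cgf_poisson(2.0), cgf_poisson(5.0)
   m_sum = lambda t: m1(t) + m2(t)   # CGF of the independent sum: CGFs add

   for k in (1, 2, 3):
       assert abs(cumulant(m_sum, k) - 7.0) < 1e-4   # kappa_k = 2 + 5
   print("cumulants of the sum are additive")
   ```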

7. Cramér's theorem gives the exact exponential decay rate: $\mathbb{P}(S_n > na) \doteq e^{-n m_X^*(a)}$, where $m_X^*$ is the Fenchel-Legendre transform of the CGF. The Chernoff bound provides the upper bound; the tilted-distribution technique gives the matching lower bound.
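
   The rate function $m_X^*(a) = \sup_t\,[ta - m_X(t)]$ can be computed by a simple grid search over $t$. A sketch for $\mathrm{Bernoulli}(1/2)$, where it has the closed form $m^*(a) = a\log a + (1-a)\log(1-a) + \log 2$ on $(0,1)$:

   ```python
   import math

   def cgf_bernoulli_half(t):
       """CGF of Bernoulli(1/2): log E[e^{tX}] = log((1 + e^t)/2)."""
       return math.log((1 + math.exp(t)) / 2)

   def rate_function(a, t_lo=-20.0, t_hi=20.0, steps=40001):
       """Grid-search approximation of sup_t (t*a - m(t))."""
       best = float("-inf")
       for i in range(steps):
           t = t_lo + (t_hi - t_lo) * i / (steps - 1)
           best = max(best, t * a - cgf_bernoulli_half(t))
       return best

   a = 0.8
   closed = a * math.log(a) + (1 - a) * math.log(1 - a) + math.log(2)
   assert abs(rate_function(a) - closed) < 1e-6
   print(rate_function(a))
   ```

   The closed form follows by setting the derivative $a - e^t/(1 + e^t)$ to zero, i.e. the supremum is attained at $t^* = \log\frac{a}{1-a}$.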

8. Branching-process extinction is determined by the mean offspring number $\mu = G'(1)$. The extinction probability $\eta$ is the smallest non-negative solution of $s = G(s)$. If $\mu \leq 1$, then $\eta = 1$ (certain extinction); if $\mu > 1$, then $\eta < 1$ (survival is possible).
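
   The smallest non-negative root of $s = G(s)$ can be found by iterating $s_{k+1} = G(s_k)$ from $s_0 = 0$, since the iterates increase monotonically to the smallest fixed point. A sketch (not from the text) with $\mathrm{Poisson}(\mu)$ offspring, $G(s) = e^{\mu(s-1)}$:

   ```python
   import math

   def extinction_probability(G, tol=1e-12, max_iter=10000):
       """Iterate s -> G(s) from 0; converges to the smallest fixed point."""
       s = 0.0
       for _ in range(max_iter):
           s_next = G(s)
           if abs(s_next - s) < tol:
               return s_next
           s = s_next
       return s

   G_sub = lambda s: math.exp(0.8 * (s - 1))   # subcritical:  mu = 0.8
   G_sup = lambda s: math.exp(2.0 * (s - 1))   # supercritical: mu = 2.0

   eta_sub = extinction_probability(G_sub)
   eta_sup = extinction_probability(G_sup)
   assert abs(eta_sub - 1.0) < 1e-6             # mu <= 1: certain extinction
   assert 0 < eta_sup < 1                        # mu > 1: survival possible
   assert abs(G_sup(eta_sup) - eta_sup) < 1e-9   # eta is a fixed point of G
   print(eta_sup)
   ```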

Looking Ahead

Chapter 10 develops probability inequalities — Markov, Chebyshev, Chernoff, Hoeffding, and Jensen — that provide non-asymptotic bounds complementing the limit theorems of this chapter. Chapter 11 treats convergence modes rigorously (a.s., in probability, in $L^r$, in distribution) and proves the strong law of large numbers. Together, Chapters 9-11 form the analytical toolkit that powers all of information theory and statistical inference.