The Cumulant Generating Function

Beyond Moments: Why Cumulants?

Moments describe the "shape" of a distribution, but they interact in complicated ways for sums of random variables. The $k$-th moment of a sum involves cross-terms between lower moments of the summands. Cumulants simplify this: the $k$-th cumulant of a sum of independent random variables is simply the sum of the individual $k$-th cumulants. This additivity makes cumulants the natural language for the CLT and for large deviations.
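The additivity claim can be spot-checked symbolically. Below is a sketch using sympy; the exponential and Poisson variables are hypothetical choices made purely for illustration, since any independent pair would do:

```python
import sympy as sp

t = sp.symbols('t')

# Illustrative example distributions (any independent pair works):
#   X ~ Exponential(rate 1):  M_X(t) = 1 / (1 - t)      for t < 1
#   Y ~ Poisson(lambda = 2):  M_Y(t) = exp(2 * (e^t - 1))
M_X = 1 / (1 - t)
M_Y = sp.exp(2 * (sp.exp(t) - 1))

def cumulant(M, n):
    """n-th cumulant: n-th derivative of the CGF log M(t) at t = 0."""
    return sp.diff(sp.log(M), t, n).subs(t, 0)

# Independence makes MGFs multiply, so CGFs -- and hence cumulants -- add.
k3_sum = sp.simplify(cumulant(M_X * M_Y, 3))    # third cumulant of X + Y
k3_parts = cumulant(M_X, 3) + cumulant(M_Y, 3)  # kappa_3(X) + kappa_3(Y)
assert k3_sum == k3_parts == 4   # kappa_3(Exp(1)) = 2, kappa_3(Poi(2)) = 2
```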

Definition: Cumulant Generating Function (CGF)

The cumulant generating function (CGF) of a random variable $X$ is

$$m_X(t) = \log M_X(t) = \log \mathbb{E}[e^{tX}],$$

defined for $t$ in the domain where $M_X(t) < \infty$. The $n$-th cumulant $\kappa_n$ is defined by

$$m_X(t) = \sum_{n=1}^{\infty} \kappa_n \frac{t^n}{n!}.$$

The first few cumulants are $\kappa_1 = \mathbb{E}[X]$ (the mean), $\kappa_2 = \text{Var}(X)$ (the variance), and $\kappa_3 = \mathbb{E}[(X-\mu)^3]$ (the third central moment, i.e. the unnormalized skewness).

For the Gaussian distribution, $m_X(t) = \mu t + \sigma^2 t^2/2$, so $\kappa_1 = \mu$, $\kappa_2 = \sigma^2$, and $\kappa_n = 0$ for all $n \geq 3$. The Gaussian is the unique distribution with only two nonzero cumulants.


Theorem: Properties of the Cumulant Generating Function

Let $m_X(t) = \log M_X(t)$ where $M_X(t) < \infty$ for $|t| < a$. Then:

  1. $m_X(0) = 0$.
  2. $m_X'(0) = \mathbb{E}[X] = \mu$.
  3. $m_X''(0) = \text{Var}(X)$.
  4. $m_X(t)$ is convex on its domain.
  5. If $X \perp Y$, then $m_{X+Y}(t) = m_X(t) + m_Y(t)$.
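Properties 1-4 can be verified symbolically for a concrete case. The sketch below uses sympy with the Gamma(shape 3, rate 1) distribution as an illustrative choice, since its MGF $(1-t)^{-3}$, mean 3, and variance 3 are easy to check by hand:

```python
import sympy as sp

t = sp.symbols('t')

# Illustrative choice: Gamma(shape = 3, rate = 1), with M(t) = (1 - t)^(-3)
# for t < 1, mean 3, and variance 3.
m = sp.log((1 - t) ** -3)   # CGF m(t) = log M(t) = -3 log(1 - t)

assert m.subs(t, 0) == 0                  # property 1: m(0) = 0
assert sp.diff(m, t).subs(t, 0) == 3      # property 2: m'(0) = E[X] = 3
assert sp.diff(m, t, 2).subs(t, 0) == 3   # property 3: m''(0) = Var(X) = 3
# Property 4: m''(t) = 3 / (1 - t)^2 > 0 on t < 1, so m is convex there.
assert sp.simplify(sp.diff(m, t, 2) - 3 / (1 - t) ** 2) == 0
```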

The Gaussian Has the Simplest Cumulant Structure

For the Gaussian $\mathcal{N}(\mu, \sigma^2)$:

$$m_X(t) = \mu t + \frac{\sigma^2}{2} t^2.$$

All cumulants $\kappa_n$ with $n \geq 3$ vanish. In fact, the Gaussian is characterized by this property: by Marcinkiewicz's theorem, it is the only distribution with a polynomial CGF. The CLT can be understood as saying that, as we sum more and more i.i.d. random variables, the higher cumulants ($n \geq 3$) of the standardized sum become negligible compared to $\kappa_1$ and $\kappa_2$, so the distribution approaches a Gaussian.
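This cumulant picture of the CLT can be made quantitative in one line. By additivity and the homogeneity $\kappa_k(aX) = a^k \kappa_k(X)$, the standardized sum $S_n = \frac{1}{\sigma\sqrt{n}} \sum_{i=1}^n (X_i - \mu)$ satisfies

$$\kappa_k(S_n) = \frac{n\,\kappa_k(X)}{(\sigma\sqrt{n})^k} = \frac{\kappa_k(X)}{\sigma^k\, n^{k/2 - 1}} \xrightarrow{\,n \to \infty\,} 0 \qquad (k \geq 3),$$

while $\kappa_1(S_n) = 0$ and $\kappa_2(S_n) = 1$: only the Gaussian cumulants survive in the limit.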

Example: Cumulants of the Poisson Distribution

Find all cumulants of $X \sim \text{Poi}(\lambda)$.
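One route (a sketch that gives the answer away): the Poisson MGF is $M_X(t) = e^{\lambda(e^t - 1)}$, so $m_X(t) = \lambda(e^t - 1) = \lambda \sum_{n \geq 1} t^n/n!$, and matching coefficients against the defining series gives $\kappa_n = \lambda$ for every $n$. A symbolic check with sympy:

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)

# Poisson MGF: M(t) = exp(lam * (e^t - 1)), so the CGF is:
m = lam * (sp.exp(t) - 1)

# Every cumulant of Poi(lam) equals lam, since m(t) = lam * sum_{n>=1} t^n / n!.
for n in range(1, 7):
    kappa_n = sp.diff(m, t, n).subs(t, 0)   # n-th derivative of the CGF at 0
    assert kappa_n == lam
```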

Quick Check

If $X$ and $Y$ are independent with CGFs $m_X$ and $m_Y$, what is the third cumulant of $Z = X + Y$?

$\kappa_3^{(X)} + \kappa_3^{(Y)}$

$\kappa_3^{(X)} \cdot \kappa_3^{(Y)}$

$[\kappa_3^{(X)}]^2 + [\kappa_3^{(Y)}]^2$

$\kappa_3^{(X)} + \kappa_3^{(Y)} + 3\kappa_1^{(X)}\kappa_2^{(Y)} + 3\kappa_2^{(X)}\kappa_1^{(Y)}$