Moment Generating and Characteristic Functions

Transforms as the Right Language

The characteristic function (CF) provides the most elegant and general route to the multivariate Gaussian. It exists for every distribution (unlike the MGF), it characterizes the distribution uniquely, and it makes affine transformation results almost trivial to prove. Moreover, the CF of the Gaussian is itself a Gaussian in the frequency domain — a beautiful parallel to the fact that the Fourier transform of a Gaussian is a Gaussian.

Theorem: MGF and CF of the Multivariate Gaussian

Let $\mathbf{X} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. The moment generating function (which in the Gaussian case exists for all $\mathbf{t}$) is

$$M_{\mathbf{X}}(\mathbf{t}) = \mathbb{E}\!\left[e^{\mathbf{t}^T \mathbf{X}}\right] = \exp\!\left(\mathbf{t}^T \boldsymbol{\mu} + \frac{1}{2}\mathbf{t}^T \boldsymbol{\Sigma}\,\mathbf{t}\right), \quad \mathbf{t} \in \mathbb{R}^n.$$

The characteristic function is

$$\phi_{\mathbf{X}}(\boldsymbol{\omega}) = \mathbb{E}\!\left[e^{j\boldsymbol{\omega}^T \mathbf{X}}\right] = \exp\!\left(j\boldsymbol{\omega}^T \boldsymbol{\mu} - \frac{1}{2}\boldsymbol{\omega}^T \boldsymbol{\Sigma}\,\boldsymbol{\omega}\right), \quad \boldsymbol{\omega} \in \mathbb{R}^n.$$

The CF is obtained from the MGF by replacing $\mathbf{t}$ with $j\boldsymbol{\omega}$. The quadratic form in the exponent is the Fourier-domain representation of the covariance structure.
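As a sanity check, the closed-form CF can be compared against a Monte Carlo estimate of $\mathbb{E}[e^{j\boldsymbol{\omega}^T\mathbf{X}}]$. A minimal NumPy sketch (the particular $\boldsymbol{\mu}$, $\boldsymbol{\Sigma}$, $\boldsymbol{\omega}$, and sample size are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters of a 2-D Gaussian (illustrative values)
mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

def cf_gaussian(omega, mu, Sigma):
    """Closed-form CF: exp(j w^T mu - 0.5 w^T Sigma w)."""
    return np.exp(1j * omega @ mu - 0.5 * omega @ Sigma @ omega)

# Monte Carlo estimate of E[exp(j w^T X)]
X = rng.multivariate_normal(mu, Sigma, size=200_000)
omega = np.array([0.3, -0.7])
cf_mc = np.mean(np.exp(1j * X @ omega))

print(abs(cf_mc - cf_gaussian(omega, mu, Sigma)))  # small Monte Carlo error
```

With a couple hundred thousand samples the two agree to roughly the Monte Carlo standard error, $O(1/\sqrt{N})$.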

The CF Defines the Gaussian Even When the PDF Does Not Exist

When $\boldsymbol{\Sigma}$ is singular ($\det(\boldsymbol{\Sigma}) = 0$), the multivariate Gaussian PDF does not exist on $\mathbb{R}^n$. However, the characteristic function $\phi(\boldsymbol{\omega}) = \exp(j\boldsymbol{\omega}^T\boldsymbol{\mu} - \tfrac{1}{2}\boldsymbol{\omega}^T\boldsymbol{\Sigma}\,\boldsymbol{\omega})$ is always well-defined for any PSD $\boldsymbol{\Sigma}$, and it uniquely determines the distribution. This is the most general definition of the multivariate Gaussian.
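To make the singular case concrete, consider the degenerate vector $\mathbf{X} = (Z, Z)$ with $Z \sim \mathcal{N}(0,1)$: its covariance matrix has rank 1, so no density exists on $\mathbb{R}^2$, yet the CF formula still matches the empirical CF. A sketch (the choice of $\boldsymbol{\omega}$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# Degenerate Gaussian: X = (Z, Z), Z ~ N(0, 1).
# Sigma = [[1, 1], [1, 1]] has rank 1, so no density on R^2,
# but the CF formula remains well-defined.
Z = rng.standard_normal(200_000)
X = np.stack([Z, Z], axis=1)

mu = np.zeros(2)
Sigma = np.array([[1.0, 1.0],
                  [1.0, 1.0]])

omega = np.array([0.4, 0.9])
cf_formula = np.exp(1j * omega @ mu - 0.5 * omega @ Sigma @ omega)
cf_mc = np.mean(np.exp(1j * X @ omega))

print(np.linalg.matrix_rank(Sigma))   # 1: Sigma is singular
print(abs(cf_mc - cf_formula))        # small Monte Carlo error
```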

Example: Sum of Independent Gaussians via MGF

Let $X_1, \ldots, X_n$ be independent with $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$. Use the MGF to find the distribution of $S = \sum_{i=1}^n X_i$.

By independence, the MGF of the sum factors into a product:

$$M_S(t) = \prod_{i=1}^n M_{X_i}(t) = \prod_{i=1}^n \exp\!\left(\mu_i t + \frac{1}{2}\sigma_i^2 t^2\right) = \exp\!\left(t \sum_{i=1}^n \mu_i + \frac{t^2}{2} \sum_{i=1}^n \sigma_i^2\right).$$

This is the MGF of a Gaussian, so by uniqueness $S \sim \mathcal{N}\!\left(\sum_{i=1}^n \mu_i,\; \sum_{i=1}^n \sigma_i^2\right)$.
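A quick numerical check of the example's conclusion, $S \sim \mathcal{N}(\sum_i \mu_i, \sum_i \sigma_i^2)$ (a sketch; the specific $\mu_i$ and $\sigma_i$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# Independent Gaussians X_i ~ N(mu_i, sigma_i^2), arbitrary example values
mus = np.array([1.0, -2.0, 0.5])
sigmas = np.array([1.0, 0.5, 2.0])

n_samples = 500_000
X = rng.normal(mus, sigmas, size=(n_samples, 3))
S = X.sum(axis=1)

# Theory: mean of S is sum(mu_i), variance is sum(sigma_i^2)
print(S.mean(), mus.sum())           # both near -0.5
print(S.var(), (sigmas**2).sum())    # both near 5.25
```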

Definition: Joint Gaussianity via Characteristic Function

A random vector $\mathbf{X} \in \mathbb{R}^n$ is jointly Gaussian if its characteristic function has the form

$$\phi_{\mathbf{X}}(\boldsymbol{\omega}) = \exp\!\left(j\boldsymbol{\omega}^T\boldsymbol{\mu} - \frac{1}{2}\boldsymbol{\omega}^T \boldsymbol{\Sigma}\,\boldsymbol{\omega}\right)$$

for some $\boldsymbol{\mu} \in \mathbb{R}^n$ and PSD matrix $\boldsymbol{\Sigma} \in \mathbb{R}^{n \times n}$.

Equivalently, $\mathbf{X}$ is jointly Gaussian iff every linear combination $\mathbf{a}^T\mathbf{X}$ is a (possibly degenerate) scalar Gaussian.
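The linear-combination characterization is easy to exercise numerically: for jointly Gaussian $\mathbf{X}$, any $\mathbf{a}^T\mathbf{X}$ should be Gaussian with mean $\mathbf{a}^T\boldsymbol{\mu}$ and variance $\mathbf{a}^T\boldsymbol{\Sigma}\,\mathbf{a}$. A sketch with arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(3)

# Arbitrary mean, positive-definite covariance, and weight vector
mu = np.array([0.5, -1.0, 2.0])
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 2.0, 0.5],
                  [0.0, 0.5, 1.5]])
a = np.array([1.0, -2.0, 0.5])

X = rng.multivariate_normal(mu, Sigma, size=500_000)
Y = X @ a  # the scalar linear combination a^T X

# Theory: Y ~ N(a^T mu, a^T Sigma a)
print(Y.mean(), a @ mu)          # both near a^T mu
print(Y.var(), a @ Sigma @ a)    # both near a^T Sigma a
```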

Characteristic function

The function $\phi_{\mathbf{X}}(\boldsymbol{\omega}) = \mathbb{E}[e^{j\boldsymbol{\omega}^T\mathbf{X}}]$. It always exists, uniquely determines the distribution, and is the Fourier transform of the PDF (when the PDF exists).
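In one dimension this Fourier relationship can be checked by direct numerical integration of $\int e^{j\omega x} f(x)\,dx$ against the closed form (a sketch; the parameters and grid are arbitrary):

```python
import numpy as np

# 1-D check that the CF is the Fourier transform of the PDF
# (sign convention E[e^{+j w X}]); mu, sigma, w are arbitrary.
mu, sigma = 0.7, 1.3
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 20_001)
dx = x[1] - x[0]
pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

w = 0.9
cf_numeric = np.sum(np.exp(1j * w * x) * pdf) * dx  # Riemann-sum integral
cf_formula = np.exp(1j * w * mu - 0.5 * sigma**2 * w**2)

print(abs(cf_numeric - cf_formula))  # tiny discretization error
```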

Related: Multivariate Gaussian distribution