Chi-Squared, Wishart, and Related Distributions

What Happens When You Square and Multiply Gaussians

Many quantities in statistics and engineering are quadratic functions of Gaussian vectors: the sample variance, the SNR, the test statistic in hypothesis testing, the eigenvalues of sample covariance matrices. This section develops the distributions that arise from such quadratic operations: the chi-squared for sums of squared Gaussians, the non-central chi-squared when the Gaussians have non-zero means, and the Wishart distribution for the sample covariance matrix.

Definition:

Chi-Squared Distribution

Let $Z_1, \ldots, Z_n$ be i.i.d. $\mathcal{N}(0, 1)$. The random variable

$$Q = \sum_{i=1}^n Z_i^2$$

has the chi-squared distribution with $n$ degrees of freedom, written $Q \sim \chi^2_n$. Its PDF is

$$f_Q(q) = \frac{q^{n/2 - 1} e^{-q/2}}{2^{n/2}\,\Gamma(n/2)}, \quad q > 0.$$

The mean is $\mathbb{E}[Q] = n$ and the variance is $\operatorname{Var}(Q) = 2n$.

The chi-squared with $n$ degrees of freedom is a Gamma distribution with shape $n/2$ and rate $1/2$: $\chi^2_n = \mathrm{Gamma}(n/2, 1/2)$.
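A quick numerical sanity check (a sketch; the seed and sample sizes are arbitrary): summing $n$ squared standard normals reproduces the $\chi^2_n$ moments, and the $\chi^2_n$ PDF coincides with the Gamma PDF once you account for SciPy parameterizing Gamma by scale $= 1/\text{rate} = 2$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m = 5, 200_000                # degrees of freedom, Monte Carlo draws

Z = rng.standard_normal((m, n))
Q = (Z ** 2).sum(axis=1)         # each row gives one chi^2_n draw

emp_mean, emp_var = Q.mean(), Q.var()    # should be near n and 2n

# chi^2_n PDF == Gamma(shape n/2, rate 1/2) PDF; SciPy's scale = 1/rate = 2
q = 3.7
pdf_chi2 = stats.chi2.pdf(q, df=n)
pdf_gamma = stats.gamma.pdf(q, a=n / 2, scale=2.0)
print(emp_mean, emp_var, pdf_chi2, pdf_gamma)
```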

Definition:

Non-Central Chi-Squared Distribution

Let $Z_i \sim \mathcal{N}(\mu_i, 1)$, $i = 1, \ldots, n$, be independent. Then

$$Q = \sum_{i=1}^n Z_i^2 \sim \chi^2_n(\delta),$$

where $\delta = \sum_{i=1}^n \mu_i^2$ is the non-centrality parameter. When $\delta = 0$, this reduces to the central chi-squared.
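The distribution depends on the means only through $\delta$, not the individual $\mu_i$. A simulation sketch (the means and seed are arbitrary; SciPy's `ncx2` takes the non-centrality as its `nc` argument, matching $\delta$ here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, m = 4, 200_000
mu = np.array([1.0, -0.5, 0.0, 2.0])   # illustrative means
delta = float((mu ** 2).sum())         # non-centrality parameter

Z = mu + rng.standard_normal((m, n))   # Z_i ~ N(mu_i, 1), independent
Q = (Z ** 2).sum(axis=1)

emp_mean = Q.mean()                    # theory: E[Q] = n + delta
cdf_emp = (Q <= 10.0).mean()
cdf_theory = stats.ncx2.cdf(10.0, df=n, nc=delta)
print(emp_mean, n + delta, cdf_emp, cdf_theory)
```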

Theorem: Distribution of Quadratic Forms

Let $\mathbf{X} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ in $\mathbb{R}^n$ with $\boldsymbol{\Sigma} \succ 0$. The quadratic form

$$Q = (\mathbf{X} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{X} - \boldsymbol{\mu}) \sim \chi^2_n.$$

More generally, if $\mathbf{A}$ is a symmetric $n \times n$ matrix, then $\mathbf{X}^T \mathbf{A} \mathbf{X}$ is distributed as a weighted sum of independent (possibly non-central) chi-squared random variables, with weights given by the eigenvalues of $\boldsymbol{\Sigma}^{1/2} \mathbf{A} \boldsymbol{\Sigma}^{1/2}$.
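The first claim can be verified by whitening: with the Cholesky factor $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}^T$, the vector $\mathbf{L}^{-1}(\mathbf{X} - \boldsymbol{\mu})$ is standard normal, so the quadratic form is a sum of $n$ squared standard normals. A sketch (the covariance and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, m = 3, 200_000
mu = np.array([1.0, 2.0, -1.0])
B = rng.standard_normal((n, n))
Sigma = B @ B.T + n * np.eye(n)        # an arbitrary positive definite covariance

X = rng.multivariate_normal(mu, Sigma, size=m)
L = np.linalg.cholesky(Sigma)          # Sigma = L L^T
Y = np.linalg.solve(L, (X - mu).T)     # whitening: columns of Y ~ N(0, I)
Q = (Y ** 2).sum(axis=0)               # = (X - mu)^T Sigma^{-1} (X - mu)

cdf_emp = (Q <= 5.0).mean()
cdf_chi2 = stats.chi2.cdf(5.0, df=n)
print(Q.mean(), cdf_emp, cdf_chi2)     # mean near n; CDFs agree
```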

Chi-Squared Distribution Explorer

Explore how the chi-squared PDF changes with the degrees of freedom $n$. Observe that for large $n$, the distribution becomes approximately Gaussian (by the CLT).

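The Gaussian limit noted above can be checked directly: for large $n$, $\chi^2_n \approx \mathcal{N}(n, 2n)$. A sketch (the choice $n = 200$ is arbitrary):

```python
import numpy as np
from scipy import stats

n = 200
sd = np.sqrt(2 * n)                    # chi^2_n has mean n, variance 2n
q = np.linspace(n - 3 * sd, n + 3 * sd, 121)   # grid over +/- 3 std devs

# CLT: the chi^2_n CDF approaches that of N(n, 2n)
gap = np.abs(stats.chi2.cdf(q, df=n) - stats.norm.cdf(q, loc=n, scale=sd)).max()
print(gap)     # small, and shrinks roughly like 1/sqrt(n)
```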

Definition:

Wishart Distribution

Let $\mathbf{X}_1, \ldots, \mathbf{X}_m$ be i.i.d. $\mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})$ random vectors in $\mathbb{R}^n$. The random matrix

$$\mathbf{W} = \sum_{i=1}^m \mathbf{X}_i \mathbf{X}_i^T$$

has the Wishart distribution $\mathcal{W}_n(m, \boldsymbol{\Sigma})$. When $m \geq n$, $\mathbf{W}$ is positive definite almost surely. The Wishart is the matrix analogue of the chi-squared.

The sample covariance matrix $\hat{\boldsymbol{\Sigma}} = \frac{1}{m-1}\sum_{i=1}^m (\mathbf{X}_i - \bar{\mathbf{X}})(\mathbf{X}_i - \bar{\mathbf{X}})^T$ is a scaled Wishart: $(m-1)\hat{\boldsymbol{\Sigma}} \sim \mathcal{W}_n(m-1, \boldsymbol{\Sigma})$.
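A simulation sketch of the definition (the covariance, dimensions, and seed are arbitrary): one draw of the scatter matrix is positive definite when $m \geq n$, and averaging many draws confirms $\mathbb{E}[\mathbf{W}] = m\boldsymbol{\Sigma}$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 8                            # dimension and sample count (m >= n)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# One Wishart draw: scatter matrix of m i.i.d. N(0, Sigma) vectors
X = rng.multivariate_normal(np.zeros(n), Sigma, size=m)
W = X.T @ X                            # W ~ W_n(m, Sigma)
pd = bool(np.all(np.linalg.eigvalsh(W) > 0))   # positive definite since m >= n

# E[W] = m * Sigma, checked over many replications
reps = 20_000
Xs = rng.multivariate_normal(np.zeros(n), Sigma, size=(reps, m))
Ws = np.einsum('rki,rkj->rij', Xs, Xs)         # one scatter matrix per replicate
W_mean = Ws.mean(axis=0)
print(pd, np.abs(W_mean - m * Sigma).max())    # True, and a small deviation
```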


Example: Rayleigh Distribution from Bivariate Gaussian

Let $(X, Y) \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}_2)$. Find the distribution of $R = \sqrt{X^2 + Y^2}$. Since $R^2/\sigma^2 = (X/\sigma)^2 + (Y/\sigma)^2 \sim \chi^2_2$, which is exponential with rate $1/2$, a change of variables gives the Rayleigh PDF $f_R(r) = \frac{r}{\sigma^2}\, e^{-r^2/(2\sigma^2)}$, $r > 0$.
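A simulation sketch of this example (the value of $\sigma$ and the seed are arbitrary): the simulated envelope matches SciPy's `rayleigh` distribution with scale $\sigma$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sigma, m = 1.5, 200_000
X = sigma * rng.standard_normal(m)     # X ~ N(0, sigma^2)
Y = sigma * rng.standard_normal(m)     # Y ~ N(0, sigma^2), independent
R = np.hypot(X, Y)                     # envelope sqrt(X^2 + Y^2)

# Rayleigh with scale sigma has mean sigma * sqrt(pi/2)
emp_mean = R.mean()
cdf_emp = (R <= 2.0).mean()
cdf_ray = stats.rayleigh.cdf(2.0, scale=sigma)
print(emp_mean, sigma * np.sqrt(np.pi / 2), cdf_emp, cdf_ray)
```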

Why This Matters: Rayleigh Fading from the Gaussian Model

In a wireless channel with many scatterers and no line-of-sight component, the received complex baseband coefficient is $h = X + jY$, where $X, Y \sim \mathcal{N}(0, \sigma^2)$ are independent (by the CLT applied to the sum of many scattered paths). The envelope $|h| = \sqrt{X^2 + Y^2}$ is Rayleigh-distributed, and the power $|h|^2$ is exponentially distributed. This is the i.i.d. Rayleigh fading model: the default model for MIMO channels (Book MIMO, Chapter 2) and the starting point for all diversity analysis (Book Telecom, Chapter 10).
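The power claim follows from the chi-squared connection: $|h|^2/\sigma^2 \sim \chi^2_2$, so $|h|^2$ is exponential with mean $2\sigma^2$. A simulation sketch (arbitrary $\sigma$ and seed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
sigma, m = 1.0, 200_000
h = sigma * (rng.standard_normal(m) + 1j * rng.standard_normal(m))
P = np.abs(h) ** 2                     # instantaneous channel power

# |h|^2 = X^2 + Y^2 is exponential with mean 2 * sigma^2
emp_mean = P.mean()
cdf_emp = (P <= 1.0).mean()
cdf_exp = stats.expon.cdf(1.0, scale=2 * sigma ** 2)
print(emp_mean, cdf_emp, cdf_exp)
```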

⚠️Engineering Note

Sample Covariance Matrix and the Wishart in Practice

In massive MIMO, the base station estimates the channel covariance from $m$ pilot observations: $\hat{\boldsymbol{\Sigma}} = \frac{1}{m}\sum_{i=1}^m \mathbf{h}_i\mathbf{h}_i^H$. When $m$ is comparable to $n$ (the number of antennas), the sample covariance is a poor estimate of the true covariance: the eigenvalues spread out (the Marchenko-Pastur effect). Understanding the Wishart distribution is essential for analyzing when and how covariance estimation can be trusted. This topic is developed further in the random matrix theory chapter (Chapter 21).
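The eigenvalue spreading is easy to reproduce (a sketch with real Gaussian data and identity true covariance; the antenna count $n = 64$ is illustrative): every true eigenvalue equals 1, yet for $m \approx n$ the sample eigenvalues spill across roughly the Marchenko-Pastur interval $[(1 - \sqrt{n/m})^2, (1 + \sqrt{n/m})^2]$, tightening only as $m/n$ grows.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 64                                  # number of "antennas" (illustrative)
spread = {}
for m in (64, 256, 4096):               # pilot observation counts
    H = rng.standard_normal((m, n))     # true covariance = I_n
    S = H.T @ H / m                     # sample covariance estimate
    lam = np.linalg.eigvalsh(S)
    spread[m] = (lam.min(), lam.max())  # ~ ((1 - sqrt(n/m))^2, (1 + sqrt(n/m))^2)
    print(m, spread[m])
```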

Quick Check

If $\mathbf{X} \sim \mathcal{N}(\boldsymbol{\mu}, \sigma^2 \mathbf{I}_4)$, what is the distribution of $\|\mathbf{X} - \boldsymbol{\mu}\|^2 / \sigma^2$?

$\chi^2_4$

$\chi^2_3$

$\mathcal{N}(0, 4)$

Exponential with rate $1/2$