The Multivariate Gaussian Distribution
Why the Multivariate Gaussian Is Central
The multivariate Gaussian is the single most important distribution in engineering. There are at least three reasons. First, the Central Limit Theorem (Chapter 11) guarantees that the sum of many small independent effects is approximately Gaussian — and thermal noise, aggregate interference, and quantization error are all such sums. Second, the Gaussian is the maximum-entropy distribution for a given mean and covariance — so when we know only these two statistics, the Gaussian is the most "conservative" (least committal) model. Third, and most remarkably, the Gaussian family is closed under a rich set of operations: linear transformation, marginalization, and conditioning all produce Gaussian results. This closure makes the entire machinery of LMMSE estimation, Kalman filtering, and MIMO capacity analysis tractable.
Definition: Multivariate Gaussian Distribution
Multivariate Gaussian Distribution
A random vector $\mathbf{X} \in \mathbb{R}^n$ has the multivariate Gaussian (or normal) distribution with mean $\boldsymbol{\mu} \in \mathbb{R}^n$ and covariance $\boldsymbol{\Sigma} \succ 0$, written $\mathbf{X} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, if its joint PDF is
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\!\left( -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right),$$
where $\mathbf{x} \in \mathbb{R}^n$ and $|\boldsymbol{\Sigma}|$ denotes the determinant of $\boldsymbol{\Sigma}$.
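The formula above can be evaluated directly. Below is a minimal numpy sketch (the helper name `mvn_pdf` is our own, not from the text) that computes the density from the mean and covariance; at $\mathbf{x} = \boldsymbol{\mu}$ the exponent vanishes and the density equals the normalizing constant $1/((2\pi)^{n/2}|\boldsymbol{\Sigma}|^{1/2})$.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Multivariate Gaussian density: exp(-0.5 (x-mu)^T Sigma^{-1} (x-mu)) / ((2 pi)^{n/2} |Sigma|^{1/2})."""
    n = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)  # quadratic form in the exponent
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

mu = np.zeros(2)
Sigma = np.eye(2)
print(mvn_pdf(mu, mu, Sigma))  # peak of a standard bivariate normal: 1/(2*pi) ≈ 0.1592
```

Using `np.linalg.solve` instead of explicitly inverting $\boldsymbol{\Sigma}$ is the standard numerically safer choice.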
The matrix $\boldsymbol{\Lambda} = \boldsymbol{\Sigma}^{-1}$ is called the precision matrix or information matrix. It appears naturally in the exponent of the Gaussian PDF and plays a central role in graphical models, where a zero entry $\Lambda_{ij} = 0$ indicates conditional independence of $X_i$ and $X_j$ given all other components.
Precision matrix
The inverse of the covariance matrix, $\boldsymbol{\Lambda} = \boldsymbol{\Sigma}^{-1}$. For a multivariate Gaussian, a zero entry $\Lambda_{ij} = 0$ means $X_i$ and $X_j$ are conditionally independent given all remaining variables.
Related: Covariance matrix
Multivariate Gaussian distribution
The distribution $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, fully parameterized by its mean vector and covariance matrix. Uniquely characterized among all distributions by the property that every linear combination of its components is a scalar Gaussian.
Related: Precision matrix, Covariance matrix
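The conditional-independence reading of the precision matrix can be checked numerically. A sketch under an assumed model: for a three-variable Gaussian Markov chain $X_1 \to X_2 \to X_3$ with $\Sigma_{ij} = \rho^{|i-j|}$ (our choice of example, not from the text), the precision matrix is tridiagonal, so $\Lambda_{13} = 0$ even though $X_1$ and $X_3$ are correlated.

```python
import numpy as np

rho = 0.6
# Covariance of a Gaussian Markov chain X1 -> X2 -> X3: Sigma[i,j] = rho^|i-j|
Sigma = np.array([[1.0,     rho,    rho**2],
                  [rho,     1.0,    rho],
                  [rho**2,  rho,    1.0]])
Lambda = np.linalg.inv(Sigma)  # precision matrix

# Sigma[0,2] = rho^2 != 0 (X1 and X3 are correlated), yet Lambda[0,2] = 0:
# X1 and X3 are conditionally independent given X2.
print(np.round(Lambda, 6))
```

The zero pattern of $\boldsymbol{\Lambda}$, not of $\boldsymbol{\Sigma}$, is what encodes the graph structure.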
The Mahalanobis Distance
The exponent in the Gaussian PDF involves the quadratic form
$$d^2(\mathbf{x}) = (\mathbf{x} - \boldsymbol{\mu})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}),$$
known as the Mahalanobis distance (squared) from $\mathbf{x}$ to $\boldsymbol{\mu}$. Contours of constant density are ellipsoids $\{\mathbf{x} : d^2(\mathbf{x}) = c\}$. The shape and orientation of these ellipsoids are determined by the eigenvectors and eigenvalues of $\boldsymbol{\Sigma}$: the principal axes lie along the eigenvectors, and the half-axis lengths are proportional to $\sqrt{\lambda_i}$.
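A short numpy sketch of both facts (the covariance values here are illustrative): the quadratic form gives the squared Mahalanobis distance, and an eigendecomposition of $\boldsymbol{\Sigma}$ yields the contour ellipse's axes.

```python
import numpy as np

mu = np.zeros(2)
Sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])  # positive definite: det = 0.56 > 0

# Squared Mahalanobis distance from x to mu
x = np.array([1.0, -0.5])
d2 = (x - mu) @ np.linalg.solve(Sigma, x - mu)

# Contour d^2(x) = c: principal axes along eigenvectors of Sigma,
# half-axis lengths sqrt(c * lambda_i)
c = 1.0
eigvals, eigvecs = np.linalg.eigh(Sigma)
half_axes = np.sqrt(c * eigvals)
print(d2, half_axes)
```

Note that `np.linalg.eigh` (for symmetric matrices) returns eigenvalues in ascending order, so the last column of `eigvecs` gives the major-axis direction.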
2D Gaussian Contour Explorer
Explore how the correlation coefficient $\rho$ and the variances $\sigma_1^2, \sigma_2^2$ shape the density contours of a bivariate Gaussian. The ellipses tilt as $\rho$ moves away from zero.
Parameters
Historical Note: Gauss, Bravais, and the Bivariate Normal
18th–20th century. The univariate Gaussian distribution was introduced by Abraham de Moivre in 1733 and later developed by Gauss in the context of astronomical error analysis (1809). The bivariate extension, with the correlation coefficient $\rho$ as a parameter, was studied systematically by Auguste Bravais (1846) and later by Francis Galton, who used it to model regression. The general $n$-dimensional Gaussian was formalized in the early 20th century, but its full power was not appreciated until the development of multivariate statistics by Wishart, Hotelling, and Anderson in the 1930s–1950s.
Example: The Bivariate Gaussian PDF
Let $(X, Y) \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})$ with
$$\boldsymbol{\Sigma} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \qquad |\rho| < 1.$$
Write the joint PDF explicitly and identify the level curves.
Compute the precision matrix
$$\boldsymbol{\Sigma}^{-1} = \frac{1}{1 - \rho^2} \begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}.$$
Write the PDF
$$f_{X,Y}(x, y) = \frac{1}{2\pi \sqrt{1 - \rho^2}} \exp\!\left( -\frac{x^2 - 2\rho x y + y^2}{2(1 - \rho^2)} \right).$$
Identify level curves
Setting the exponent equal to $-c/2$ gives the ellipse $\dfrac{x^2 - 2\rho x y + y^2}{1 - \rho^2} = c$. When $\rho = 0$, the ellipses become circles; when $\rho > 0$, the major axis tilts toward the line $y = x$.
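As a sanity check on the worked example, a quick numpy sketch can compare the general matrix form of the PDF against the closed-form bivariate expression; with unit variances and correlation $\rho$, the determinant $|\boldsymbol{\Sigma}| = 1 - \rho^2$ makes the two agree (function names here are our own).

```python
import numpy as np

rho = 0.5
Sigma = np.array([[1.0, rho],
                  [rho, 1.0]])

def pdf_matrix(v):
    """General form: uses Sigma^{-1} and det(Sigma)."""
    quad = v @ np.linalg.solve(Sigma, v)
    return np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

def pdf_closed_form(x, y):
    """Bivariate closed form with unit variances and correlation rho."""
    return (np.exp(-(x**2 - 2 * rho * x * y + y**2) / (2 * (1 - rho**2)))
            / (2 * np.pi * np.sqrt(1 - rho**2)))

v = np.array([0.7, -1.1])
print(pdf_matrix(v), pdf_closed_form(*v))  # the two values agree
```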
Gaussian Contours as $\rho$ Varies
Common Mistake: Singular Covariance Matrix
Mistake:
Assuming the multivariate Gaussian PDF always exists. When $\det \boldsymbol{\Sigma} = 0$, the distribution is supported on a lower-dimensional affine subspace and has no density with respect to Lebesgue measure on $\mathbb{R}^n$.
Correction:
If $\mathrm{rank}(\boldsymbol{\Sigma}) = r < n$, then $n - r$ components of $\mathbf{X}$ are deterministic affine functions of the remaining $r$. One can still define the Gaussian via its characteristic function (see Section 6). In practice, this occurs when measurements are linearly dependent — for instance, when an antenna array has redundant elements.
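A degenerate Gaussian is easy to construct and sample: write $\mathbf{X} = \mathbf{A}\mathbf{Z}$ with a rank-deficient factor $\mathbf{A}$. The sketch below (with an illustrative rank-1 factor of our choosing) shows that $\det \boldsymbol{\Sigma} = 0$, so no density exists on $\mathbb{R}^2$, yet the samples are perfectly well defined and lie on a line.

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[1.0],
              [2.0]])          # rank-1 factor: X = A @ Z lives on the line x2 = 2*x1
Sigma = A @ A.T                 # singular covariance [[1, 2], [2, 4]], det = 0
print(np.linalg.det(Sigma))     # ~0: no density with respect to Lebesgue measure on R^2

Z = rng.standard_normal((1, 1000))
X = A @ Z                       # Gaussian samples confined to a 1-D subspace
print(np.allclose(X[1], 2 * X[0]))  # True: X2 is a deterministic function of X1
```

This is exactly the situation described above: one component of $\mathbf{X}$ is an affine (here linear) function of the other.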
Key Takeaway
The multivariate Gaussian is completely specified by its mean vector and covariance matrix. No distribution with the same first two moments can have more entropy. This is why, in the absence of additional information, the Gaussian is the default model throughout engineering.
Quick Check
A random vector $\mathbf{X} \in \mathbb{R}^5$ has the distribution $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. How many free parameters does this distribution have?
5 (mean) + 25 (covariance) = 30
5 (mean) + 15 (covariance) = 20
5 (mean) + 10 (covariance) = 15
25 (the entire covariance matrix)
The mean vector has 5 entries and the symmetric covariance matrix has $5(5+1)/2 = 15$ free entries.