Variance and Higher Moments
The Spread Around the Mean
The expectation tells us where the center of a distribution is, but says nothing about how spread out the values are. A random variable that is always equal to its mean and one that fluctuates wildly can have the same expectation. The variance quantifies this spread. It is defined as the expected squared deviation from the mean, and it plays a central role in everything from the central limit theorem to the design of communication systems.
Definition: Variance and Standard Deviation
The variance of a random variable $X$ with mean $\mu = \mathbb{E}[X]$ is
$\mathrm{Var}(X) = \mathbb{E}\left[(X - \mu)^2\right].$
The standard deviation is $\sigma_X = \sqrt{\mathrm{Var}(X)}$, which has the same units as $X$.
The variance is always non-negative, and $\mathrm{Var}(X) = 0$ if and only if $X$ is a constant (i.e., $\mathbb{P}(X = c) = 1$ for some constant $c$).
Theorem: Variance Shortcut Formula
For any random variable $X$ with finite second moment:
$\mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2.$
This "computational formula" is almost always easier to use than the definition, because computing $\mathbb{E}[X^2]$ via LOTUS and $\mathbb{E}[X]$ separately is typically simpler than computing $\mathbb{E}[(X - \mu)^2]$ directly.
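As a small illustration (plain Python, with a fair die as the example distribution), both routes give the same number, but the shortcut never has to form the deviations:

```python
# Variance of a fair six-sided die, computed two ways.
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # uniform pmf

mean = sum(x * p for x in outcomes)              # E[X] = 3.5
second_moment = sum(x**2 * p for x in outcomes)  # E[X^2] via LOTUS

var_definition = sum((x - mean) ** 2 * p for x in outcomes)  # E[(X - mu)^2]
var_shortcut = second_moment - mean**2                       # E[X^2] - (E[X])^2

print(var_definition, var_shortcut)  # both 35/12 = 2.9166...
```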
Expand the square
$\mathbb{E}[(X - \mu)^2] = \mathbb{E}[X^2 - 2\mu X + \mu^2].$
Apply linearity
$\mathbb{E}[X^2 - 2\mu X + \mu^2] = \mathbb{E}[X^2] - 2\mu\,\mathbb{E}[X] + \mu^2 = \mathbb{E}[X^2] - 2\mu^2 + \mu^2 = \mathbb{E}[X^2] - \mu^2.$
Theorem: Variance Under Affine Transformation
For constants $a$ and $b$:
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X).$
Adding a constant shifts the mean but does not affect the spread; scaling by $a$ multiplies the variance by $a^2$ (and the standard deviation by $|a|$).
Direct computation
Let $Y = aX + b$, so $\mathbb{E}[Y] = a\mu + b$. Then
$\mathrm{Var}(Y) = \mathbb{E}\left[(aX + b - a\mu - b)^2\right] = \mathbb{E}\left[a^2 (X - \mu)^2\right] = a^2\,\mathrm{Var}(X).$
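The identity also holds exactly for empirical (sample) variances, so a quick simulation confirms it; this sketch uses standard-normal draws and illustrative constants $a = 3$, $b = 7$:

```python
import random

random.seed(0)
a, b = 3.0, 7.0  # arbitrary constants for illustration
xs = [random.gauss(0, 1) for _ in range(200_000)]

def var(data):
    """Population-style sample variance (divide by n)."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / len(data)

vx = var(xs)
vy = var([a * x + b for x in xs])
print(vy / vx)  # a**2 = 9; the shift b has no effect
```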
Definition: Moments and Central Moments
The $k$-th moment of $X$ is $\mathbb{E}[X^k]$ (if it exists). The $k$-th central moment is $\mathbb{E}[(X - \mu)^k]$.
- $k = 1$: the mean (first moment), $\mathbb{E}[X] = \mu$.
- $k = 2$: the second central moment is the variance, $\mathbb{E}[(X - \mu)^2] = \mathrm{Var}(X)$.
- $k = 3$: the third central moment, normalized by $\sigma^3$, gives the skewness $\mathbb{E}[(X - \mu)^3]/\sigma^3$, measuring asymmetry.
- $k = 4$: the fourth central moment, normalized by $\sigma^4$, gives the kurtosis $\mathbb{E}[(X - \mu)^4]/\sigma^4$, measuring tail heaviness ($3$ for the Gaussian; excess kurtosis $=$ kurtosis $-\,3$).
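A minimal sketch of these definitions, estimating the central moments of a standard-normal sample (for which skewness should be near $0$ and kurtosis near $3$):

```python
import random

random.seed(1)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]  # standard-normal sample

m = sum(xs) / n

def central(k):
    """Empirical k-th central moment."""
    return sum((x - m) ** k for x in xs) / n

sigma = central(2) ** 0.5
skewness = central(3) / sigma**3  # ~0 for a symmetric distribution
kurtosis = central(4) / sigma**4  # ~3 for the Gaussian
print(skewness, kurtosis)
```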
Theorem: Variance of a Sum of Independent Random Variables
If $X_1, \dots, X_n$ are independent random variables, then
$\mathrm{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \mathrm{Var}(X_i).$
Unlike linearity of expectation, this property requires independence (or at least uncorrelatedness). When variables are positively correlated, the variance of the sum exceeds the sum of the variances; when negatively correlated, it can be smaller.
Expand the variance
Let $S = \sum_{i=1}^n X_i$ and $\mu_i = \mathbb{E}[X_i]$. Then
$\mathrm{Var}(S) = \mathbb{E}\left[\left(\sum_i (X_i - \mu_i)\right)^2\right] = \sum_i \mathbb{E}\left[(X_i - \mu_i)^2\right] + \sum_{i \neq j} \mathbb{E}\left[(X_i - \mu_i)(X_j - \mu_j)\right].$
Independence kills cross terms
For $i \neq j$, independence gives $\mathbb{E}[(X_i - \mu_i)(X_j - \mu_j)] = \mathbb{E}[X_i - \mu_i]\,\mathbb{E}[X_j - \mu_j] = 0$.
Therefore $\mathrm{Var}(S) = \sum_i \mathbb{E}[(X_i - \mu_i)^2] = \sum_i \mathrm{Var}(X_i)$.
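A simulation sketch of the theorem, using three independent Gaussians with variances $1$, $4$, and $9$ (so the sum should have variance $\approx 14$):

```python
import random

random.seed(2)
n = 200_000

def var(data):
    """Population-style sample variance (divide by n)."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / len(data)

# Each trial sums three independent draws with sd 1, 2, 3.
# Variances add: 1 + 4 + 9 = 14.
sums = [random.gauss(0, 1) + random.gauss(0, 2) + random.gauss(0, 3)
        for _ in range(n)]
print(var(sums))  # close to 14
```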
Example: Variance of the Bernoulli Distribution
Compute $\mathrm{Var}(X)$ for $X \sim \mathrm{Bernoulli}(p)$.
Compute $\mathbb{E}[X^2]$
Since $X \in \{0, 1\}$, we have $X^2 = X$, so $\mathbb{E}[X^2] = \mathbb{E}[X] = p$.
Apply the shortcut
$\mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = p - p^2 = p(1 - p)$.
The variance is maximized at $p = 1/2$ (most uncertain coin) and equals zero at $p = 0$ or $p = 1$ (deterministic outcomes).
Bernoulli Variance
The variance of a Bernoulli RV as a function of $p$. Maximum uncertainty (variance $= 1/4$) occurs at $p = 1/2$.
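A one-liner sweep over $p$ reproduces the shape of this curve and locates its maximum:

```python
# Var(X) = p(1 - p) for Bernoulli(p); scan p on a grid to locate the maximum.
ps = [i / 100 for i in range(101)]
variances = [p * (1 - p) for p in ps]

best_p = ps[variances.index(max(variances))]
print(best_p, max(variances))  # 0.5 and 0.25; the endpoints give 0
```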
Common Mistake: Variance of a Sum Requires Independence
Mistake:
Applying $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ when $X$ and $Y$ are dependent.
Correction:
The general formula is $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$. The covariance term vanishes only when $X$ and $Y$ are uncorrelated (which is implied by, but weaker than, independence).
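The general formula holds exactly for sample moments too, so a short simulation with deliberately correlated variables makes the covariance term visible (here $Y = X + \text{noise}$, a positively correlated pair chosen for illustration):

```python
import random

random.seed(3)
n = 100_000

def mean(d):
    return sum(d) / len(d)

def var(d):
    m = mean(d)
    return sum((x - m) ** 2 for x in d) / len(d)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

xs = [random.gauss(0, 1) for _ in range(n)]
ys = [x + random.gauss(0, 1) for x in xs]  # positively correlated with xs

lhs = var([x + y for x, y in zip(xs, ys)])
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
print(lhs, rhs)  # equal; var(xs) + var(ys) alone would underestimate
```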
Variance and Heavy Tails
The variance measures the spread of a distribution, but it is sensitive only to the "body" of the distribution. For heavy-tailed distributions — where extreme values occur with non-negligible probability — the variance may be infinite or, even when finite, may not adequately capture the risk of extreme outcomes. In such cases, higher moments (if they exist) or quantile-based measures provide more informative summaries. This issue arises in network traffic modeling, where packet inter-arrival times can exhibit heavy-tailed behavior.
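This instability is easy to see in simulation. The sketch below (an illustrative choice, not from the original text) draws from a Pareto distribution with tail index $\alpha = 1.5$, for which the mean is finite but the variance is infinite; the sample variance never settles down as the sample grows:

```python
import random

random.seed(4)

# Pareto with tail index alpha = 1.5: finite mean, infinite variance.
def pareto_sample(alpha, n):
    return [random.paretovariate(alpha) for _ in range(n)]

def var(d):
    m = sum(d) / len(d)
    return sum((x - m) ** 2 for x in d) / len(d)

# The sample variance keeps jumping as rarer and rarer extremes arrive,
# rather than converging to a limit.
for n in (1_000, 10_000, 100_000):
    print(n, var(pareto_sample(1.5, n)))
```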
Noise Power Is Variance
In communication systems, the "noise power" of a zero-mean noise process $N$ is precisely its variance: $P_N = \mathbb{E}[N^2] = \mathrm{Var}(N)$. The signal-to-noise ratio (SNR) is the ratio of signal power to noise variance. Reducing noise variance (e.g., by bandwidth filtering or averaging) is the fundamental mechanism by which receivers improve performance.
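A quick sketch of the averaging mechanism (illustrative numbers: noise variance $4$, averaging $16$ independent samples): by the variance-of-a-sum theorem, the mean of $n$ independent noise samples has variance $\mathrm{Var}(N)/n$.

```python
import random

random.seed(5)
n_trials, n_avg = 20_000, 16
noise_var = 4.0  # zero-mean Gaussian noise; power = variance = 4

def var(d):
    m = sum(d) / len(d)
    return sum((x - m) ** 2 for x in d) / len(d)

single = [random.gauss(0, noise_var ** 0.5) for _ in range(n_trials)]
# Averaging n_avg independent samples divides the noise variance by n_avg.
averaged = [sum(random.gauss(0, noise_var ** 0.5) for _ in range(n_avg)) / n_avg
            for _ in range(n_trials)]

print(var(single), var(averaged))  # about 4.0 and about 0.25
```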
Quick Check
What is $\mathrm{Var}(3X + 5)$ if $\mathrm{Var}(X) = 1$?
14
9
4
25
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$ for any constant $b$; the shift does not change the spread.
Common Mistake: Standard Deviation Has the Same Units as
Mistake:
Reporting the variance in the units of $X$: when $X$ is measured in seconds, saying "the variance is 4 seconds" is incorrect; the correct statement is "the variance is 4 seconds$^2$" or, equivalently, "the standard deviation is 2 seconds."
Correction:
Variance has the units of $X$ squared. Standard deviation restores the original units and is more interpretable for practical purposes.
Key Takeaway
The shortcut formula $\mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$ is almost always easier to compute with than the definition. Remember: the variance of a sum equals the sum of the variances only for independent (or uncorrelated) random variables.