Variance and Higher Moments

The Spread Around the Mean

The expectation tells us where the center of a distribution is, but says nothing about how spread out the values are. A random variable that is always equal to its mean and one that fluctuates wildly can have the same expectation. The variance quantifies this spread. It is defined as the expected squared deviation from the mean, and it plays a central role in everything from the central limit theorem to the design of communication systems.

Definition: Variance and Standard Deviation

The variance of a random variable $X$ with mean $\mu = \mathbb{E}[X]$ is

$$\text{Var}(X) \triangleq \mathbb{E}\!\left[(X - \mu)^2\right] = \sum_{x \in \mathcal{X}} (x - \mu)^2 \, P(x).$$

The standard deviation is $\sigma_X = \sqrt{\text{Var}(X)}$, which has the same units as $X$.

The variance is always non-negative, and $\text{Var}(X) = 0$ if and only if $X$ is a constant (i.e., $\mathbb{P}(X = c) = 1$ for some $c$).
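To make the definition concrete, here is a minimal NumPy sketch that evaluates the variance of a small discrete distribution directly from the definition (the support and pmf values are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical pmf over the support {1, 2, 3, 4}.
values = np.array([1.0, 2.0, 3.0, 4.0])
probs = np.array([0.1, 0.2, 0.3, 0.4])     # must sum to 1

mu = np.sum(values * probs)                # E[X] = 3.0
var = np.sum((values - mu) ** 2 * probs)   # E[(X - mu)^2] = 1.0
sigma = np.sqrt(var)                       # standard deviation = 1.0

print(mu, var, sigma)
```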


Theorem: Variance Shortcut Formula

For any random variable $X$ with finite second moment:

$$\text{Var}(X) = \mathbb{E}[X^2] - \left(\mathbb{E}[X]\right)^2.$$

This "computational formula" is almost always easier to use than the definition, because computing E[X2]\mathbb{E}[X^2] via LOTUS and E[X]\mathbb{E}[X] separately is typically simpler than computing E[(Xμ)2]\mathbb{E}[(X - \mu)^2] directly.

Theorem: Variance Under Affine Transformation

For constants $a, b \in \mathbb{R}$:

$$\text{Var}(aX + b) = a^2 \, \text{Var}(X).$$

Adding a constant shifts the mean but does not affect the spread; scaling by $a$ multiplies the standard deviation by $|a|$.
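A quick numerical sanity check of the affine rule, sketched with NumPy (the distribution and the constants $a$, $b$ below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)  # any distribution works

a, b = 3.0, 7.0
print(np.var(a * x + b))   # empirical Var(aX + b)
print(a**2 * np.var(x))    # a^2 Var(X): agrees up to sampling noise
```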

Definition: Moments and Central Moments

The $k$-th moment of $X$ is $\mathbb{E}[X^k]$ (if it exists). The $k$-th central moment is $\mathbb{E}[(X - \mu)^k]$.

  • $k = 1$: the first moment is the mean.
  • $k = 2$: the second central moment is the variance.
  • $k = 3$: the third central moment, normalized by $\sigma^3$, gives the skewness $\gamma_1 = \mathbb{E}[(X - \mu)^3] / \sigma^3$, measuring asymmetry.
  • $k = 4$: the fourth central moment, normalized by $\sigma^4$, gives the kurtosis $\kappa = \mathbb{E}[(X - \mu)^4] / \sigma^4$, measuring tail heaviness ($\kappa = 3$ for the Gaussian; excess kurtosis $= \kappa - 3$).
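The skewness and kurtosis definitions translate directly into code. Below is a small sketch; `moment_summary` is a hypothetical helper name, and the two test distributions are chosen because their exact values are well known (skewness $0$ and kurtosis $3$ for the Gaussian, skewness $2$ and kurtosis $9$ for the exponential):

```python
import numpy as np

def moment_summary(x):
    """Sample skewness and kurtosis, computed from the definitions above."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    skew = np.mean((x - mu) ** 3) / sigma**3
    kurt = np.mean((x - mu) ** 4) / sigma**4
    return skew, kurt

rng = np.random.default_rng(1)
print(moment_summary(rng.normal(size=1_000_000)))       # ~ (0, 3)
print(moment_summary(rng.exponential(size=1_000_000)))  # ~ (2, 9)
```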

Theorem: Variance of a Sum of Independent Random Variables

If $X_1, \ldots, X_n$ are independent random variables, then

$$\text{Var}\!\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \text{Var}(X_i).$$

Unlike linearity of expectation, this property requires independence (or at least uncorrelatedness). When variables are positively correlated, the variance of the sum exceeds the sum of the variances; when negatively correlated, it can be smaller.
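A simulation sketch contrasting the independent case with an extreme dependent case ($X + X = 2X$), using standard normal samples:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
x = rng.normal(size=n)   # Var(X) = 1
y = rng.normal(size=n)   # independent of x, Var(Y) = 1

print(np.var(x + y))     # ~ 2: Var(X) + Var(Y), since X and Y are independent
print(np.var(x + x))     # ~ 4: Var(2X) = 4 Var(X), not Var(X) + Var(X) = 2
```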


Example: Variance of the Bernoulli Distribution

Compute $\text{Var}(X)$ for $X \sim \text{Bernoulli}(p)$. Since $X$ takes only the values $0$ and $1$, we have $X^2 = X$, hence $\mathbb{E}[X^2] = \mathbb{E}[X] = p$, and the shortcut formula gives $\text{Var}(X) = p - p^2 = p(1-p)$.

[Figure: the Bernoulli variance $\text{Var}(X) = p(1-p)$ as a function of $p$. Maximum uncertainty (variance $1/4$) occurs at $p = 1/2$.]
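A quick empirical check of the $p(1-p)$ formula, sketched with NumPy's binomial sampler using a single trial to draw Bernoulli samples:

```python
import numpy as np

rng = np.random.default_rng(3)
for p in (0.1, 0.5, 0.9):
    x = rng.binomial(1, p, size=1_000_000)   # Bernoulli(p) samples
    print(p, np.var(x), p * (1 - p))         # empirical vs. exact variance
```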

Common Mistake: Variance of a Sum Requires Independence

Mistake:

Applying $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$ when $X$ and $Y$ are dependent.

Correction:

The general formula is $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\,\text{Cov}(X, Y)$. The covariance term vanishes only when $X$ and $Y$ are uncorrelated (which is implied by, but weaker than, independence).
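A sketch verifying the general formula on deliberately correlated variables (the construction of $Y$ from $X$ is arbitrary, chosen only to make $\text{Cov}(X, Y) > 0$):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)   # y is positively correlated with x

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]
print(lhs, rhs)   # agree up to floating-point rounding
```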

Variance and Heavy Tails

The variance measures the spread of a distribution, but it is sensitive only to the "body" of the distribution. For heavy-tailed distributions — where extreme values occur with non-negligible probability — the variance may be infinite or, even when finite, may not adequately capture the risk of extreme outcomes. In such cases, higher moments (if they exist) or quantile-based measures provide more informative summaries. This issue arises in network traffic modeling, where packet inter-arrival times can exhibit heavy-tailed behavior.
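To see this instability concretely, here is a sketch that draws from a Pareto distribution with tail index $1.5$ (finite mean, infinite variance) and tracks the running sample variance, which never settles down:

```python
import numpy as np

rng = np.random.default_rng(5)
# Classic Pareto with x_m = 1 and tail index 1.5: the mean is finite (= 3),
# but the variance is infinite because the tail index is below 2.
x = 1.0 + rng.pareto(1.5, size=10_000_000)

# The running sample variance jumps whenever an extreme value arrives,
# no matter how many samples have already been seen.
for n in (10**3, 10**4, 10**5, 10**6, 10**7):
    print(n, np.var(x[:n]))
```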

🔧 Engineering Note

Noise Power Is Variance

In communication systems, the "noise power" of a zero-mean noise process is precisely its variance: $\sigma^2 = \text{Var}(W) = \mathbb{E}[W^2]$. The signal-to-noise ratio (SNR) is the ratio of signal power to noise variance. Reducing noise variance (e.g., by bandwidth filtering or averaging) is the fundamental mechanism by which receivers improve performance.
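A minimal sketch of the averaging mechanism: averaging $n$ independent zero-mean noise samples reduces the noise variance by a factor of $n$, improving SNR correspondingly (the numbers below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
sigma2 = 4.0   # noise power (variance) per sample
n = 16         # independent samples averaged per measurement

w = rng.normal(0.0, np.sqrt(sigma2), size=(100_000, n))
w_avg = w.mean(axis=1)             # receiver averages n noisy samples

print(np.var(w_avg), sigma2 / n)   # noise power drops from 4 to 0.25
```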

Quick Check

What is $\text{Var}(X + 5)$ if $\text{Var}(X) = 9$?

  • 14
  • 9
  • 4
  • 25

Common Mistake: Standard Deviation Has the Same Units as $X$

Mistake:

Reporting the variance when the units of $X$ are, say, seconds, which leads to statements like "the variance is 4 seconds" when the correct statement is "the variance is 4 seconds$^2$" or, equivalently, "the standard deviation is 2 seconds."

Correction:

Variance has the units of $X$ squared. The standard deviation $\sigma_X = \sqrt{\text{Var}(X)}$ restores the original units and is more interpretable for practical purposes.

Variance

$\text{Var}(X) = \mathbb{E}[(X - \mathbb{E}[X])^2] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$. Measures the expected squared deviation from the mean.

Related: Expectation

Key Takeaway

The shortcut formula $\text{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$ is almost always easier to compute than the definition. Remember: the variance of a sum equals the sum of the variances only for independent (or uncorrelated) random variables.