Independence of Random Variables

Why Independence Is the Central Assumption

Independence is the single most important structural assumption in probability and its applications. The entire machinery of information theory (i.i.d. sources, memoryless channels, random codebooks) is built on independence. When it holds, computations simplify dramatically: joint distributions factor, expectations of products equal products of expectations, and the variance of a sum equals the sum of variances. When it fails, the analysis becomes harder, but the deviations from independence are often the most interesting part of the problem.

Definition:

Independence of Random Variables

Random variables $X$ and $Y$ are independent if for all $x, y \in \mathbb{R}$:

$$F_{X,Y}(x, y) = F_X(x) \cdot F_Y(y).$$

Equivalently:

  • Discrete case: $P_{X,Y}(x_i, y_j) = P_X(x_i) \cdot P_Y(y_j)$ for all $i, j$.
  • Continuous case: $f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y)$ for all $x, y$.

A collection $X_1, X_2, \ldots, X_n$ is (mutually) independent if the joint CDF (or PMF, or PDF) factors into a product of marginals for every subset of the collection.

Pairwise independence does not imply mutual independence. The same subtlety we encountered for events in Chapter 2 carries over to random variables.
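To make the distinction concrete, here is a minimal Python sketch (an illustration, not from the text) of Bernstein's classic construction: two independent fair bits and their XOR. Every pair of variables factors, but the full triple does not; the particular outcome checked at the end is chosen for illustration.

```python
# Bernstein's example: X1, X2 independent fair bits, X3 = X1 XOR X2.
# Each pair is independent, but the three variables are not mutually independent.
from itertools import product

# Joint PMF of (X1, X2, X3): only outcomes consistent with X3 = X1 ^ X2 have mass.
joint = {}
for x1, x2 in product([0, 1], repeat=2):
    joint[(x1, x2, x1 ^ x2)] = 0.25

def marginal(indices):
    """Marginal PMF over the given coordinate indices."""
    pmf = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        pmf[key] = pmf.get(key, 0.0) + p
    return pmf

m = {i: marginal([i]) for i in range(3)}

# Pairwise check: P(Xi = a, Xj = b) = P(Xi = a) P(Xj = b) for every pair.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    pair = marginal([i, j])
    assert all(abs(pair.get((a, b), 0.0) - m[i][(a,)] * m[j][(b,)]) < 1e-12
               for a in [0, 1] for b in [0, 1])

# Mutual independence fails: P(X1=1, X2=1, X3=1) = 0, but the product of marginals is 1/8.
print(joint.get((1, 1, 1), 0.0))               # 0.0
print(m[0][(1,)] * m[1][(1,)] * m[2][(1,)])    # 0.125
```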

Theorem: Functions of Independent RVs Are Independent

If $X$ and $Y$ are independent random variables and $g, h$ are Borel-measurable functions, then $g(X)$ and $h(Y)$ are independent.

Theorem: Product of Expectations

If $X$ and $Y$ are independent and $\mathbb{E}[|X|], \mathbb{E}[|Y|] < \infty$, then

$$\mathbb{E}[XY] = \mathbb{E}[X] \cdot \mathbb{E}[Y].$$

More generally, for any measurable $g, h$ with $\mathbb{E}[|g(X)|], \mathbb{E}[|h(Y)|] < \infty$:

$$\mathbb{E}[g(X)\,h(Y)] = \mathbb{E}[g(X)] \cdot \mathbb{E}[h(Y)].$$
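The product rule is easy to check empirically. The following Monte Carlo sketch is illustrative only; the distributions, the functions $g$ and $h$, the seed, and the sample size are arbitrary choices, not part of the text.

```python
# Monte Carlo sanity check: for independent X and Y,
# E[g(X) h(Y)] should match E[g(X)] E[h(Y)] up to sampling error.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.exponential(scale=1.0, size=n)       # X ~ Exp(1)   (arbitrary choice)
y = rng.normal(loc=2.0, scale=1.0, size=n)   # Y ~ N(2, 1), drawn independently of X

g = lambda t: t**2          # g(X) = X^2     (arbitrary choice)
h = lambda t: np.sin(t)     # h(Y) = sin(Y)  (arbitrary choice)

lhs = np.mean(g(x) * h(y))                 # estimate of E[g(X) h(Y)]
rhs = np.mean(g(x)) * np.mean(h(y))        # estimate of E[g(X)] E[h(Y)]
print(lhs, rhs)   # the two estimates agree to within Monte Carlo noise
```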

Example: Poisson Splitting Property

A coin is tossed $N$ times where $N \sim \text{Poisson}(\lambda)$. Each toss independently lands heads with probability $p$. Let $X$ = number of heads and $Y$ = number of tails. Show that $X$ and $Y$ are independent with $X \sim \text{Poisson}(\lambda p)$ and $Y \sim \text{Poisson}(\lambda(1-p))$.
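A quick way to see what the claim asserts, before proving it, is simulation. The sketch below uses arbitrary parameter choices ($\lambda = 4$, $p = 0.3$, the seed, and the number of trials are mine, not from the text) and checks that the sample means of $X$ and $Y$ match $\lambda p$ and $\lambda(1-p)$, and that a joint probability approximately factors.

```python
# Simulation sketch of the Poisson splitting property.
import numpy as np

rng = np.random.default_rng(1)
lam, p, trials = 4.0, 0.3, 500_000   # illustrative parameter choices

n = rng.poisson(lam, size=trials)    # number of tosses N in each trial
x = rng.binomial(n, p)               # heads, given N tosses
y = n - x                            # tails

print(x.mean(), lam * p)             # sample mean of X vs. lam * p
print(y.mean(), lam * (1 - p))       # sample mean of Y vs. lam * (1 - p)
print(np.corrcoef(x, y)[0, 1])       # close to 0, consistent with independence

# Compare one joint probability with the product of marginals, e.g. P(X=1, Y=2):
joint = np.mean((x == 1) & (y == 2))
prod = np.mean(x == 1) * np.mean(y == 2)
print(joint, prod)                   # approximately equal
```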

Common Mistake: Uncorrelated Does Not Imply Independent

Mistake:

Concluding that $X$ and $Y$ are independent because $\text{Cov}(X, Y) = 0$.

Correction:

Uncorrelatedness means $\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]$, which is a statement about second moments only. Independence is a much stronger condition: it requires the entire joint distribution to factor. A classic counterexample: let $X \sim \mathcal{N}(0,1)$ and $Y = X^2$. Then $\text{Cov}(X, Y) = \mathbb{E}[X^3] = 0$ (by symmetry), but $Y$ is completely determined by $X$.

The one important exception: for jointly Gaussian random variables, uncorrelated does imply independent. This is a special property of the Gaussian distribution, developed in Chapter 8.
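A short numerical illustration of the counterexample (the sample size and seed below are arbitrary choices): the sample correlation between $X$ and $Y = X^2$ is essentially zero, yet conditioning on $|X| > 2$ pins down $Y$ completely.

```python
# X ~ N(0, 1) and Y = X^2: (near) zero correlation, yet fully dependent.
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(1_000_000)
y = x**2

print(np.corrcoef(x, y)[0, 1])          # near 0: uncorrelated up to sampling noise

# Dependence shows up immediately through conditioning:
# knowing |X| > 2 forces Y > 4, while unconditionally P(Y > 4) is small.
print(np.mean(y[np.abs(x) > 2] > 4))    # 1.0
print(np.mean(y > 4))                   # roughly 2 * (1 - Phi(2)), about 0.0455
```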

Independence vs. Uncorrelatedness

Property | Independent | Uncorrelated
Formal condition | $F_{X,Y}(x,y) = F_X(x)\,F_Y(y)$ for all $x, y$ | $\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]$
What it constrains | Entire joint distribution | Second-order moments only
Implies the other? | Independence $\Rightarrow$ uncorrelated | Uncorrelated $\not\Rightarrow$ independent (in general)
Exception | (none) | Jointly Gaussian: uncorrelated $\Leftrightarrow$ independent
$\text{Var}(X+Y) = \text{Var}(X) + \text{Var}(Y)$? | Yes | Yes (both suffice)

Independent random variables

Random variables whose joint CDF (or PMF/PDF) factors as a product of marginals. Independence means knowing the value of one provides no information about the other.

Related: Joint probability density function, Conditional expectation

Quick Check

If $f_{X,Y}(x,y) = e^{-x} e^{-y}$ for $x, y \ge 0$ and zero otherwise, are $X$ and $Y$ independent?

Yes, because the joint PDF factors as $e^{-x} \cdot e^{-y}$.

No, because they are both exponential.

Cannot determine without computing the CDF.

Historical Note: The Formalization of Independence

1933

The concept of independence has been used informally since the earliest work on games of chance. But its rigorous mathematical definition, as the factorization of a joint distribution, was established by Kolmogorov in his 1933 monograph Grundbegriffe der Wahrscheinlichkeitsrechnung. Kolmogorov's formalization made it possible to state precisely when two random variables "have nothing to do with each other" and to derive consequences such as the strong law of large numbers and the central limit theorem.

Key Takeaway

Independence of random variables means the joint distribution factors as a product of marginals. It implies uncorrelatedness, but the converse is false except for jointly Gaussian RVs. Independence is the key structural assumption that makes most of information theory and performance analysis tractable.