The Hypergeometric Distribution

From Counting to Distribution

In Chapter 2 we derived the hypergeometric coefficient $\binom{K}{k}\binom{N-K}{n-k}/\binom{N}{n}$ as the probability of drawing exactly $k$ "successes" from a finite population. Now we study this formula as a full-fledged probability distribution and compare it to the binomial. The key question is: when does sampling without replacement behave like sampling with replacement?

Definition:

Hypergeometric Distribution

Let a population of $N$ items contain $K$ "successes" and $N - K$ "failures." Draw $n$ items without replacement. The number of successes $X$ follows the hypergeometric distribution $X \sim \text{Hyp}(N, K, n)$ with PMF:

$$P(k) = \mathbb{P}(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}, \qquad k \in \{\max(0, n-N+K), \ldots, \min(n, K)\}$$
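The PMF above translates almost directly into code. The following is a minimal sketch using only Python's standard library; the function name `hypergeom_pmf` and the test parameters are illustrative choices, not part of the text.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) for X ~ Hyp(N, K, n); returns 0.0 outside the support."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    lo, hi = max(0, n - N + K), min(n, K)
    if k < lo or k > hi:
        return 0.0
    # Exact integer binomial coefficients; only the final division is floating point.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Illustrative parameters (an assumption for this sketch).
N, K, n = 50, 15, 10
total = sum(hypergeom_pmf(k, N, K, n) for k in range(n + 1))
print(f"sum of PMF over k = 0..{n}: {total:.12f}")
```

The probabilities sum to 1 by Vandermonde's identity, which is a quick sanity check on any implementation.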


Theorem: Mean and Variance of the Hypergeometric Distribution

If $X \sim \text{Hyp}(N, K, n)$, then:

$$\mathbb{E}[X] = n \cdot \frac{K}{N}, \qquad \text{Var}(X) = n \cdot \frac{K}{N} \cdot \frac{N-K}{N} \cdot \frac{N-n}{N-1}$$

The mean is the same as for $\text{Bin}(n, K/N)$: on average, the fraction of successes in the sample matches the fraction in the population. The variance is smaller by the factor $\frac{N-n}{N-1}$, called the finite population correction. Sampling without replacement reduces variability because each draw constrains the next.
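The closed-form mean and variance can be checked numerically by summing over the PMF. The sketch below does this for one illustrative parameter set; the helper name `hyp_moments` and the parameter values are assumptions made for this example.

```python
from math import comb


def hyp_moments(N: int, K: int, n: int) -> tuple[float, float]:
    """Mean and variance of Hyp(N, K, n), computed by brute-force summation."""
    lo, hi = max(0, n - N + K), min(n, K)
    denom = comb(N, n)
    pmf = {k: comb(K, k) * comb(N - K, n - k) / denom for k in range(lo, hi + 1)}
    mean = sum(k * prob for k, prob in pmf.items())
    var = sum((k - mean) ** 2 * prob for k, prob in pmf.items())
    return mean, var


N, K, n = 50, 15, 10  # illustrative values
mean, var = hyp_moments(N, K, n)

p = K / N
closed_mean = n * p
closed_var = n * p * (1 - p) * (N - n) / (N - 1)  # binomial variance times FPC

print(f"mean: summed = {mean:.6f}, closed form = {closed_mean:.6f}")
print(f"var:  summed = {var:.6f}, closed form = {closed_var:.6f}")
```

The two routes should agree up to floating-point rounding. Dropping the last factor in `closed_var` gives the binomial variance for comparison.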


Definition:

Finite Population Correction

The factor $\frac{N - n}{N - 1}$ appearing in the variance of the hypergeometric distribution is called the finite population correction (FPC). It satisfies:

  • FPC $= 1$ when $n = 1$ (single draw: same as binomial).
  • FPC $= 0$ when $n = N$ (full census: no variability).
  • FPC $\to 1$ as $N \to \infty$ with $n$ fixed (large population: approaches binomial).

In survey sampling, a common rule of thumb is to apply the FPC whenever the sample constitutes more than about 5% of the population. For most wireless applications, the "population" of possible events is effectively infinite, so the FPC is essentially 1 and can be ignored.
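The properties of the FPC are easy to confirm numerically. The snippet below is a small sketch; the function name `fpc` and the specific population and sample sizes are chosen only for illustration.

```python
def fpc(N: int, n: int) -> float:
    """Finite population correction (N - n) / (N - 1) for sample size n out of N."""
    if N < 2 or not (1 <= n <= N):
        raise ValueError("need N >= 2 and 1 <= n <= N")
    return (N - n) / (N - 1)


print(fpc(100, 1))                      # single draw
print(fpc(100, 100))                    # full census
print(f"{fpc(1_000, 50):.4f}")          # a sample that is 5% of the population
print(f"{fpc(300_000_000, 1_000):.8f}") # a poll of 1,000 out of 300 million
```

Because the FPC multiplies the variance, the standard deviation shrinks by its square root, which is why a 5% sample changes the standard error by only a few percent.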

Theorem: Hypergeometric-to-Binomial Approximation

Let $X_N \sim \text{Hyp}(N, K_N, n)$ where $K_N / N \to p$ as $N \to \infty$. Then for every fixed $k \in \{0, 1, \ldots, n\}$:

$$\mathbb{P}(X_N = k) \to \binom{n}{k} p^k (1-p)^{n-k}$$

That is, $X_N$ converges in distribution to $\text{Bin}(n, p)$.

When the population is much larger than the sample, whether we replace each drawn item or not makes negligible difference — the composition of the population barely changes between draws. This is why polls of 1,000 people can represent 300 million: the finite population correction is essentially 1.
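To see the convergence numerically, one can track the largest pointwise gap between the two PMFs as $N$ grows with $K/N$ held fixed. This is a sketch under assumed values ($n = 10$, $p = 0.3$, and a handful of population sizes); the helper names are illustrative.

```python
from math import comb


def hyp_pmf(k: int, N: int, K: int, n: int) -> float:
    """Hypergeometric PMF, zero outside the support."""
    if k < max(0, n - N + K) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def bin_pmf(k: int, n: int, p: float) -> float:
    """Binomial PMF."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3
max_gaps = []
for N in (50, 500, 5_000, 50_000):
    K = 3 * N // 10  # integer arithmetic keeps K/N equal to 0.3 exactly
    gap = max(abs(hyp_pmf(k, N, K, n) - bin_pmf(k, n, p)) for k in range(n + 1))
    max_gaps.append(gap)
    print(f"N = {N:>6}: max |Hyp - Bin| = {gap:.2e}")
```

The gap should shrink as $N$ increases, in line with the theorem.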


Hypergeometric vs. Binomial

| Property | $\text{Hyp}(N, K, n)$ | $\text{Bin}(n, p)$ with $p = K/N$ |
| --- | --- | --- |
| Sampling | Without replacement | With replacement |
| Mean | $nK/N$ | $np$ |
| Variance | $np(1-p) \cdot \frac{N-n}{N-1}$ | $np(1-p)$ |
| Support | $\{\max(0, n-N+K), \ldots, \min(n,K)\}$ | $\{0, 1, \ldots, n\}$ |
| Independence of draws | No | Yes |
| Approximation | Approaches Bin as $N \to \infty$ | Exact |

Hypergeometric vs. Binomial PMF

Compare the hypergeometric PMF (sampling without replacement) to the binomial approximation (sampling with replacement). As $N$ grows with $K/N$ fixed, the two distributions become indistinguishable.

Parameters (interactive plot defaults): 50, 15, 10

Example: Quality Control Inspection

A shipment of $N = 100$ components contains $K = 5$ defective ones. An inspector draws $n = 10$ components without replacement. What is the probability that the sample contains exactly 1 defective component? Compare with the binomial approximation.
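One way to work this example is to evaluate both formulas directly. The following sketch uses only the numbers stated in the problem; the variable names are illustrative.

```python
from math import comb

N, K, n, k = 100, 5, 10, 1

# Exact hypergeometric probability of exactly one defective in the sample.
exact = comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Binomial approximation with p = K/N (as if sampling with replacement).
p = K / N
approx = comb(n, k) * p**k * (1 - p) ** (n - k)

# Finite population correction for this sample size.
fpc = (N - n) / (N - 1)

print(f"hypergeometric: {exact:.4f}")
print(f"binomial:       {approx:.4f}")
print(f"FPC:            {fpc:.4f}")
```

The exact probability comes out near 0.339 and the binomial value near 0.315. The sample is 10% of the shipment, so the FPC of $90/99 \approx 0.91$ is far enough from 1 for the approximation to be visibly off.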

Example: Lottery Probability

In a lottery, 6 numbers are drawn without replacement from $\{1, \ldots, 49\}$. What is the probability of matching exactly 3 of your 6 chosen numbers?
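A worked sketch of the computation, treating your 6 chosen numbers as the "successes" so that $N = 49$, $K = 6$, $n = 6$, and $k = 3$:

$$\mathbb{P}(X = 3) = \frac{\binom{6}{3}\binom{43}{3}}{\binom{49}{6}} = \frac{20 \cdot 12341}{13983816} = \frac{246820}{13983816} \approx 0.01765$$

That is roughly 1 chance in 57.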

Why This Matters: Hypergeometric Distribution in Random Access

In grant-free random access for massive IoT, a base station allocates $N$ pilot sequences. If $K$ devices are active and each selects a pilot uniformly at random, the number of devices selecting a given subset of $n$ pilots follows a distribution closely related to the hypergeometric. When $N$ is large relative to $K$, the binomial approximation is accurate, but in overloaded regimes ($K \gg N$), the finite-population effects become significant and the hypergeometric model is more appropriate.

Common Mistake: Forgetting the Support Constraints

Mistake:

Writing $\mathbb{P}(X = k)$ for $k = 0, 1, \ldots, n$ without checking that $k \leq K$ and $n - k \leq N - K$.

Correction:

The hypergeometric PMF is zero outside $\max(0, n - N + K) \leq k \leq \min(n, K)$. Always verify the support before computing, especially when $n$ is close to $N$.
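A small helper that returns the valid range of $k$ makes the check explicit. This is a sketch; the name `hyp_support` and the example parameters are assumptions for illustration.

```python
def hyp_support(N: int, K: int, n: int) -> range:
    """Values of k with nonzero probability under Hyp(N, K, n)."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    return range(max(0, n - N + K), min(n, K) + 1)


# n close to N: only 3 failures exist, so at least 5 of the 8 draws are successes.
print(list(hyp_support(N=10, K=7, n=8)))

# Small sample from a large population: the lower bound is 0, the upper bound is K.
print(list(hyp_support(N=100, K=5, n=10)))
```

Iterating over `hyp_support(N, K, n)` instead of `range(n + 1)` avoids evaluating the formula at values of $k$ where a binomial coefficient would have a negative or oversized lower index.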

Historical Note: Origins of the Hypergeometric Distribution

19th–20th century

The term "hypergeometric" reflects the connection to the hypergeometric series ${}_2F_1(a, b; c; z)$, which Gauss studied systematically in the early 19th century. The probabilities $\mathbb{P}(X = k)$ are proportional to the successive terms of a terminating hypergeometric series. The distribution itself was studied implicitly by Laplace and explicitly by the statistician Karl Pearson around the turn of the 20th century. It is also the exact null distribution in Fisher's "Lady Tasting Tea" experiment, one of the founding examples of hypothesis testing.

Quick Check

When is the finite population correction factor exactly equal to 0?

When $n = 1$

When $N \to \infty$

When $n = N$

When $K = 0$

hypergeometric distribution

The distribution of the number of successes in $n$ draws without replacement from a population of $N$ items containing $K$ successes: $\mathbb{P}(X = k) = \binom{K}{k}\binom{N-K}{n-k}/\binom{N}{n}$.

Related: binomial distribution, finite population correction

finite population correction

The factor $(N - n)/(N - 1)$ by which the variance of the hypergeometric distribution is smaller than that of the corresponding binomial.

Related: hypergeometric distribution

binomial distribution

The distribution of the number of successes in $n$ independent Bernoulli trials with success probability $p$: $\mathbb{P}(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$.

Related: hypergeometric distribution, Poisson distribution

Key Takeaway

The hypergeometric distribution models sampling without replacement and has the same mean as the binomial but a smaller variance by the finite population correction factor $(N-n)/(N-1)$. When the population $N$ is much larger than the sample $n$, the two distributions are practically indistinguishable.