Repeated Independent Trials

Building Distributions from Independent Trials

The most widely used probability distributions in communications — binomial, geometric, negative binomial — arise naturally from repeating a simple binary experiment (a Bernoulli trial) many times, independently. Rather than postulating these distributions axiomatically, we derive them from first principles: a single parameter $p$ (the success probability) generates a whole family of distributions by asking different questions about the repeated experiment.

This derivation is important because it shows exactly which independence assumptions underlie each distribution. When those assumptions fail — bursty errors, correlated channel states — these distributions no longer apply and must be replaced.

Definition: Bernoulli Trial and Bernoulli Sequence

A Bernoulli trial is an experiment with exactly two outcomes: success (S) and failure (F), with $\mathbb{P}(S) = p$ and $\mathbb{P}(F) = 1 - p$ for some $p \in [0, 1]$.

A Bernoulli sequence is an infinite sequence $(X_1, X_2, \ldots)$ of independent, identically distributed Bernoulli trials, where each $X_i = 1$ (success) with probability $p$ and $X_i = 0$ (failure) with probability $1 - p$.

The i.i.d. assumption is the key structural property. In a Bernoulli sequence, the outcome of any trial provides no information about any other trial — neither past nor future. This is the memoryless property at the sequence level.
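Such a sequence is easy to simulate. A minimal sketch, assuming NumPy is available; the seed and parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
p = 0.3   # illustrative success probability
n = 20    # number of trials to draw

# Each trial is an independent Bernoulli(p) draw: 1 = success, 0 = failure.
trials = (rng.random(n) < p).astype(int)
print(trials)                                   # e.g. [0 1 0 0 1 ...]
print("empirical success rate:", trials.mean())
```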

Theorem: Binomial Distribution

In a Bernoulli sequence with parameter $p$, let $S_n = X_1 + \cdots + X_n$ be the number of successes in $n$ trials. Then $S_n$ has the binomial distribution $\text{Bin}(n, p)$ with probability mass function $$\mathbb{P}(S_n = k) = \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n.$$ The mean and variance are $\mathbb{E}[S_n] = np$ and $\text{Var}(S_n) = np(1-p)$.

Any specific sequence of $k$ successes and $n - k$ failures has probability $p^k (1-p)^{n-k}$, and the factor $\binom{n}{k}$ counts the number of such sequences. Since the events $\{S_n = k\}$ for $k = 0, \ldots, n$ are mutually exclusive and exhaustive, the probabilities sum to $(p + (1-p))^n = 1$ by the binomial theorem.
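Both facts are easy to check numerically. A sketch using the counting argument directly, with SciPy's binom for comparison (parameters illustrative):

```python
from math import comb
from scipy.stats import binom

n, p = 10, 0.3  # illustrative parameters

# PMF from the counting argument vs. the library implementation.
for k in range(n + 1):
    direct = comb(n, k) * p**k * (1 - p)**(n - k)
    assert abs(direct - binom.pmf(k, n, p)) < 1e-12

# The probabilities sum to (p + (1-p))^n = 1 by the binomial theorem.
total = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(total)  # 1.0 up to floating-point rounding
```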


Definition: Geometric Distribution

In a Bernoulli sequence with parameter $p > 0$, let $T$ be the waiting time until the first success: $T = \min\{n \geq 1 : X_n = 1\}$. Then $T$ has the geometric distribution $\text{Geom}(p)$ with PMF $$\mathbb{P}(T = k) = (1-p)^{k-1} p, \qquad k = 1, 2, 3, \ldots$$ and mean $\mathbb{E}[T] = 1/p$.

The geometric distribution is the unique discrete distribution with the memoryless property: $\mathbb{P}(T > m + n \mid T > m) = \mathbb{P}(T > n)$. Knowing that no success has occurred in the first $m$ trials does not change the distribution of the remaining waiting time. This mirrors the continuous memoryless property of the exponential distribution.

Theorem: Memoryless Property of the Geometric Distribution

Let $T \sim \text{Geom}(p)$. For any integers $m, n \geq 1$: $$\mathbb{P}(T > m + n \mid T > m) = \mathbb{P}(T > n).$$ Conversely, the geometric distribution is the only discrete distribution on $\{1, 2, 3, \ldots\}$ with this property.
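A quick numerical check of the identity, assuming SciPy; scipy.stats.geom is supported on $\{1, 2, \ldots\}$, matching the convention above, and its survival function gives $\mathbb{P}(T > k)$ directly:

```python
from scipy.stats import geom

p = 0.2        # illustrative success probability
m, n = 3, 5
T = geom(p)

# sf(k) = P(T > k) = (1-p)^k, so both sides equal (1-p)^n.
lhs = T.sf(m + n) / T.sf(m)  # P(T > m+n | T > m)
rhs = T.sf(n)                # P(T > n)
print(lhs, rhs)              # both equal 0.8**5 = 0.32768
```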

Definition: Negative Binomial Distribution

In a Bernoulli sequence with parameter $p$, let $T_r$ be the waiting time until the $r$-th success. Then $T_r$ has the negative binomial distribution $\text{NegBin}(r, p)$ with PMF $$\mathbb{P}(T_r = k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}, \qquad k = r, r+1, r+2, \ldots$$ Mean: $\mathbb{E}[T_r] = r/p$. Variance: $\text{Var}(T_r) = r(1-p)/p^2$.

The name "negative binomial" comes from the fact that the PMF arises from the binomial series with negative exponent. The special case r=1r = 1 is the geometric distribution. The sum of rr independent Geom(p)(p) random variables has the NegBin(r,p)(r, p) distribution.

Example: Binomial Model for Bit Error Count

A BPSK link has bit error probability $p = 10^{-3}$. A packet contains $n = 1000$ bits. Compute (a) the expected number of bit errors, (b) the probability of zero errors, and (c) the probability of more than 2 errors.
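A short worked solution, assuming SciPy (values in the comments are rounded):

```python
from scipy.stats import binom

n, p = 1000, 1e-3
S = binom(n, p)

print("(a) expected errors:", n * p)    # np = 1.0
print("(b) P(zero errors):", S.pmf(0))  # (1-p)^n ≈ 0.368
print("(c) P(more than 2):", S.sf(2))   # 1 - P(0) - P(1) - P(2) ≈ 0.080
```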

Binomial Distribution from Bernoulli Trials

Visualize the $\text{Bin}(n, p)$ PMF and how it evolves as $n$ and $p$ vary. Notice the approach to a Gaussian bell curve as $n$ increases (a central limit theorem preview).

Parameters: $n = 10$, $p = 0.3$.
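The Gaussian approach can also be seen numerically. A sketch, assuming SciPy, comparing the $\text{Bin}(n, p)$ PMF near its mean with the normal density of matching mean and variance:

```python
from scipy.stats import binom, norm

p = 0.3
for n in (10, 100, 1000):
    k = round(n * p)                            # evaluate near the mean
    mu, sigma = n * p, (n * p * (1 - p)) ** 0.5
    print(n, binom.pmf(k, n, p), norm.pdf(k, loc=mu, scale=sigma))
# The two columns agree increasingly well as n grows.
```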

Probability Tree for Binomial Trials

A branching probability tree for $n$ Bernoulli trials, showing how the binomial distribution is built path by path.
Each level of the tree corresponds to one trial. The probability at each leaf is $p^k (1-p)^{n-k}$, where $k$ is the number of successes along the path. The binomial coefficient $\binom{n}{k}$ counts the leaves at height $k$.
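For small $n$ the tree can be enumerated by brute force, confirming that the leaf counts are exactly the binomial coefficients. A sketch with illustrative parameters:

```python
from itertools import product
from math import comb

n, p = 4, 0.3  # small enough to enumerate all 2**n leaves

# Each leaf is one outcome sequence; group leaves by their success count k.
leaves = {}
for path in product([0, 1], repeat=n):
    k = sum(path)
    leaves[k] = leaves.get(k, 0) + 1

for k, count in sorted(leaves.items()):
    assert count == comb(n, k)  # binomial coefficient counts leaves at height k
    print(k, count, count * p**k * (1 - p)**(n - k))
```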

Monty Hall Simulation

Simulate the Monty Hall problem: $n$ rounds, switch vs. stay strategy. Watch the empirical win rate converge to the theoretical probabilities $2/3$ (switch) and $1/3$ (stay). This is an application of Bayes' theorem: after the host reveals a goat, the posterior probability that the unchosen door hides the car increases to $2/3$.

Parameters: $n = 10000$ rounds.
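A minimal simulation sketch in NumPy under the standard rules (the host always opens a goat door); the seed is illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
rounds = 10_000

car = rng.integers(0, 3, size=rounds)     # door hiding the car
choice = rng.integers(0, 3, size=rounds)  # contestant's initial pick

# Staying wins iff the first pick was right. Because the host always
# reveals a goat, switching wins exactly when the first pick was wrong.
print("stay:  ", (choice == car).mean(), "(theory 1/3)")
print("switch:", (choice != car).mean(), "(theory 2/3)")
```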
⚠️ Engineering Note

Packet Error Rate and the Binomial Model

The binomial model for packet errors assumes independent bit errors — a valid approximation when the channel uses interleaving to break up burst correlation. The packet error rate (PER) for a packet of $n$ bits and bit error rate $p_b$ is $$\text{PER} = 1 - (1 - p_b)^n \approx n p_b \quad \text{for } n p_b \ll 1.$$ Without forward error correction (FEC), a $1000$-bit packet at $p_b = 10^{-3}$ has $\text{PER} \approx 0.632$. With a rate-$1/2$ convolutional code ($n \to 2n$ coded bits) and interleaving, the effective $p_b$ after decoding can fall below $10^{-6}$, reducing PER to $\approx 0.002$.
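The numbers above follow directly from the formula. A minimal sketch with a hypothetical per helper; the values match the note:

```python
def per(p_b: float, n: int) -> float:
    """Packet error rate for n bits with independent bit-error probability p_b."""
    return 1 - (1 - p_b) ** n

print(per(1e-3, 1000))  # ≈ 0.632: no FEC
print(per(1e-6, 2000))  # ≈ 0.002: rate-1/2 code, 2n coded bits, post-decoding p_b
print(2000 * 1e-6)      # n * p_b approximation, valid since n * p_b << 1
```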

Practical Constraints

• LTE/5G NR target PER is $\leq 10\%$ at the SINR thresholds specified in TS 38.214
• The independent-error assumption requires interleaver depth $\geq$ the channel coherence time
• HARQ retransmissions can recover from occasional packet errors at the system level

Binomial Distribution

The number of successes in $n$ independent Bernoulli($p$) trials. PMF: $\mathbb{P}(k) = \binom{n}{k} p^k (1-p)^{n-k}$. Mean $np$, variance $np(1-p)$.

Related: Bernoulli Trial and Bernoulli Sequence, Geometric Distribution, Poisson Approximation

Geometric Distribution

Waiting time until the first success in a Bernoulli($p$) sequence. PMF: $\mathbb{P}(T = k) = (1-p)^{k-1} p$. Mean $1/p$. The unique memoryless distribution on $\{1, 2, \ldots\}$.

Related: Bernoulli Trial and Bernoulli Sequence, Memoryless Property, Exponential Distribution

Quick Check

A transmission system has word error rate $p = 0.1$. In $n = 5$ independent transmissions, what is the probability of exactly 2 errors?

$0.1^2 \times 0.9^3$

$5 \times 0.1^2 \times 0.9^3$

$\binom{5}{2} \times 0.1^2 \times 0.9^3 \approx 0.073$

$2 \times 0.1^2 \times 0.9^3$

Key Takeaway

Three distributions, one experiment. Bernoulli trials generate the binomial (how many successes in $n$ trials?), geometric (how long until the first success?), and negative binomial (how long until the $r$-th success?) distributions — all from the same parameter $p$. The geometric distribution is the only discrete memoryless distribution, paralleling the exponential distribution in continuous time. In error probability analysis, these distributions quantify performance under the i.i.d. error model.