Repeated Independent Trials

Building Distributions from Independent Trials

The most widely used probability distributions in communications — binomial, geometric, negative binomial — arise naturally from repeating a simple binary experiment (a Bernoulli trial) many times, independently. Rather than postulating these distributions axiomatically, we derive them from first principles: a single parameter $p$ (the success probability) generates a whole family of distributions by asking different questions about the repeated experiment.

This derivation is important because it shows exactly which independence assumptions underlie each distribution. When those assumptions fail — bursty errors, correlated channel states — these distributions no longer apply and must be replaced.

Definition: Bernoulli Trial and Bernoulli Sequence

A Bernoulli trial is an experiment with exactly two outcomes: success (S) and failure (F), with $\mathbb{P}(S) = p$ and $\mathbb{P}(F) = 1 - p$ for some $p \in [0, 1]$.

A Bernoulli sequence is an infinite sequence $(X_1, X_2, \ldots)$ of independent, identically distributed Bernoulli trials, where each $X_i = 1$ (success) with probability $p$ and $X_i = 0$ (failure) with probability $1 - p$.

The i.i.d. assumption is the key structural property. In a Bernoulli sequence, the outcome of any trial provides no information about any other trial — neither past nor future. This is the memoryless property at the sequence level.
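Such a sequence is easy to simulate. A minimal sketch, assuming NumPy is available; the seed and parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
p = 0.3   # illustrative success probability
n = 20    # number of trials to draw

# Each trial is an independent Bernoulli(p) draw: 1 = success, 0 = failure.
trials = (rng.random(n) < p).astype(int)
print(trials)                                   # e.g. [0 1 0 0 1 ...]
print("empirical success rate:", trials.mean())
```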

Theorem: Binomial Distribution

In a Bernoulli sequence with parameter $p$, let $S_n = X_1 + \cdots + X_n$ be the number of successes in $n$ trials. Then $S_n$ has the binomial distribution $\text{Bin}(n, p)$ with probability mass function $$\mathbb{P}(S_n = k) = \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n.$$ The mean and variance are $\mathbb{E}[S_n] = np$ and $\text{Var}(S_n) = np(1-p)$.

Any specific sequence of $k$ successes and $n - k$ failures has probability $p^k (1-p)^{n-k}$, and the factor $\binom{n}{k}$ counts the number of such sequences. Since the events $\{S_n = k\}$ for $k = 0, \ldots, n$ are mutually exclusive and exhaustive, the probabilities sum to $(p + (1-p))^n = 1$ by the binomial theorem.
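Both facts are easy to check numerically. A sketch using the counting argument directly, with SciPy's binom for comparison (parameters illustrative):

```python
from math import comb
from scipy.stats import binom

n, p = 10, 0.3  # illustrative parameters

# PMF from the counting argument vs. the library implementation.
for k in range(n + 1):
    direct = comb(n, k) * p**k * (1 - p)**(n - k)
    assert abs(direct - binom.pmf(k, n, p)) < 1e-12

# The probabilities sum to (p + (1-p))^n = 1 by the binomial theorem.
total = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(total)  # 1.0 up to floating-point rounding
```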


Definition: Geometric Distribution

In a Bernoulli sequence with parameter $p > 0$, let $T$ be the waiting time until the first success: $T = \min\{n \geq 1 : X_n = 1\}$. Then $T$ has the geometric distribution $\text{Geom}(p)$ with PMF $$\mathbb{P}(T = k) = (1-p)^{k-1} p, \qquad k = 1, 2, 3, \ldots$$ and mean $\mathbb{E}[T] = 1/p$.

The geometric distribution is the unique discrete distribution with the memoryless property: $\mathbb{P}(T > m + n \mid T > m) = \mathbb{P}(T > n)$. Knowing that no success has occurred in the first $m$ trials does not change the distribution of the remaining waiting time. This mirrors the continuous memoryless property of the exponential distribution.

Theorem: Memoryless Property of the Geometric Distribution

Let $T \sim \text{Geom}(p)$. For any integers $m, n \geq 1$: $$\mathbb{P}(T > m + n \mid T > m) = \mathbb{P}(T > n).$$ Conversely, the geometric distribution is the only discrete distribution on $\{1, 2, 3, \ldots\}$ with this property.
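A quick numerical check of the identity, assuming SciPy; scipy.stats.geom is supported on $\{1, 2, \ldots\}$, matching the convention above, and its survival function gives $\mathbb{P}(T > k)$ directly:

```python
from scipy.stats import geom

p = 0.2        # illustrative success probability
m, n = 3, 5
T = geom(p)

# sf(k) = P(T > k) = (1-p)^k, so both sides equal (1-p)^n.
lhs = T.sf(m + n) / T.sf(m)  # P(T > m+n | T > m)
rhs = T.sf(n)                # P(T > n)
print(lhs, rhs)              # both equal 0.8**5 = 0.32768
```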

Definition: Negative Binomial Distribution

In a Bernoulli sequence with parameter $p$, let $T_r$ be the waiting time until the $r$-th success. Then $T_r$ has the negative binomial distribution $\text{NegBin}(r, p)$ with PMF $$\mathbb{P}(T_r = k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}, \qquad k = r, r+1, r+2, \ldots$$ Mean: $\mathbb{E}[T_r] = r/p$. Variance: $\text{Var}(T_r) = r(1-p)/p^2$.

The name "negative binomial" comes from the fact that the PMF arises from the binomial series with negative exponent. The special case r=1r = 1 is the geometric distribution. The sum of rr independent Geom(p)(p) random variables has the NegBin(r,p)(r, p) distribution.

Example: Binomial Model for Bit Error Count

A BPSK link has bit error probability $p = 10^{-3}$. A packet contains $n = 1000$ bits. Compute (a) the expected number of bit errors, (b) the probability of zero errors, and (c) the probability of more than 2 errors.
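A short worked solution, assuming SciPy (values in the comments are rounded):

```python
from scipy.stats import binom

n, p = 1000, 1e-3
S = binom(n, p)

print("(a) expected errors:", n * p)    # np = 1.0
print("(b) P(zero errors):", S.pmf(0))  # (1-p)^n ≈ 0.368
print("(c) P(more than 2):", S.sf(2))   # 1 - P(0) - P(1) - P(2) ≈ 0.080
```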

Binomial Distribution from Bernoulli Trials

Visualize the $\text{Bin}(n, p)$ PMF and how it evolves as $n$ and $p$ vary. Notice the approach to a Gaussian bell curve as $n$ increases (a central limit theorem preview).

Parameters: $n = 10$, $p = 0.3$.
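The Gaussian approach can also be seen numerically. A sketch, assuming SciPy, comparing the $\text{Bin}(n, p)$ PMF near its mean with the normal density of matching mean and variance:

```python
from scipy.stats import binom, norm

p = 0.3
for n in (10, 100, 1000):
    k = round(n * p)                            # evaluate near the mean
    mu, sigma = n * p, (n * p * (1 - p)) ** 0.5
    print(n, binom.pmf(k, n, p), norm.pdf(k, loc=mu, scale=sigma))
# The two columns agree increasingly well as n grows.
```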

Probability Tree for Binomial Trials

A branching probability tree for $n$ Bernoulli trials, showing how the binomial distribution is built path by path.
Each level of the tree corresponds to one trial. The probability at each leaf is $p^k (1-p)^{n-k}$, where $k$ is the number of successes along the path. The binomial coefficient $\binom{n}{k}$ counts the leaves at height $k$.
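For small $n$ the tree can be enumerated by brute force, confirming that the leaf counts are exactly the binomial coefficients. A sketch with illustrative parameters:

```python
from itertools import product
from math import comb

n, p = 4, 0.3  # small enough to enumerate all 2**n leaves

# Each leaf is one outcome sequence; group leaves by their success count k.
leaves = {}
for path in product([0, 1], repeat=n):
    k = sum(path)
    leaves[k] = leaves.get(k, 0) + 1

for k, count in sorted(leaves.items()):
    assert count == comb(n, k)  # binomial coefficient counts leaves at height k
    print(k, count, count * p**k * (1 - p)**(n - k))
```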

Monty Hall Simulation

Simulate the Monty Hall problem: $n$ rounds, switch vs. stay strategy. Watch the empirical win rate converge to the theoretical probabilities $2/3$ (switch) and $1/3$ (stay). This is an application of Bayes' theorem: after the host reveals a goat, the posterior probability that the unchosen door hides the car increases to $2/3$.

Parameters: $n = 10000$ rounds.
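A minimal simulation sketch in NumPy under the standard rules (the host always opens a goat door); the seed is illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
rounds = 10_000

car = rng.integers(0, 3, size=rounds)     # door hiding the car
choice = rng.integers(0, 3, size=rounds)  # contestant's initial pick

# Staying wins iff the first pick was right. Because the host always
# reveals a goat, switching wins exactly when the first pick was wrong.
print("stay:  ", (choice == car).mean(), "(theory 1/3)")
print("switch:", (choice != car).mean(), "(theory 2/3)")
```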
⚠️ Engineering Note

Packet Error Rate and the Binomial Model

The binomial model for packet errors assumes independent bit errors — a valid approximation when the channel uses interleaving to break up burst correlation. The packet error rate (PER) for a packet of $n$ bits and bit error rate $p_b$ is $$\text{PER} = 1 - (1 - p_b)^n \approx n p_b \quad \text{for } n p_b \ll 1.$$ Without forward error correction (FEC), a $1000$-bit packet at $p_b = 10^{-3}$ has $\text{PER} \approx 0.632$. With a rate-$1/2$ convolutional code ($n \to 2n$ coded bits) and interleaving, the effective $p_b$ after decoding can fall below $10^{-6}$, reducing PER to $\approx 0.002$.
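The numbers above follow directly from the formula. A minimal sketch with a hypothetical per helper; the values match the note:

```python
def per(p_b: float, n: int) -> float:
    """Packet error rate for n bits with independent bit-error probability p_b."""
    return 1 - (1 - p_b) ** n

print(per(1e-3, 1000))  # ≈ 0.632: no FEC
print(per(1e-6, 2000))  # ≈ 0.002: rate-1/2 code, 2n coded bits, post-decoding p_b
print(2000 * 1e-6)      # n * p_b approximation, valid since n * p_b << 1
```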

Practical Constraints

• LTE/5G NR target PER is $\leq 10\%$ at the SINR thresholds specified in TS 38.214
• The independent-error assumption requires interleaver depth $\geq$ the channel coherence time
• HARQ retransmissions can recover from occasional packet errors at the system level

Binomial Distribution

The number of successes in $n$ independent Bernoulli($p$) trials. PMF: $\mathbb{P}(k) = \binom{n}{k} p^k (1-p)^{n-k}$. Mean $np$, variance $np(1-p)$.

Related: Bernoulli Trial and Bernoulli Sequence, Geometric Distribution, Poisson Approximation

Geometric Distribution

Waiting time until the first success in a Bernoulli($p$) sequence. PMF: $\mathbb{P}(T = k) = (1-p)^{k-1} p$. Mean $1/p$. The unique memoryless distribution on $\{1, 2, \ldots\}$.

Related: Bernoulli Trial and Bernoulli Sequence, Memoryless Property, Exponential Distribution

Quick Check

A transmission system has word error rate $p = 0.1$. In $n = 5$ independent transmissions, what is the probability of exactly 2 errors?

$0.1^2 \times 0.9^3$

$5 \times 0.1^2 \times 0.9^3$

$\binom{5}{2} \times 0.1^2 \times 0.9^3 \approx 0.073$

$2 \times 0.1^2 \times 0.9^3$

Key Takeaway

Three distributions, one experiment. Bernoulli trials generate the binomial (how many successes in $n$ trials?), geometric (how long until the first success?), and negative binomial (how long until the $r$-th success?) distributions — all from the same parameter $p$. The geometric distribution is the only discrete memoryless distribution, paralleling the exponential distribution in continuous time. In error probability analysis, these distributions quantify performance under the i.i.d. error model.