Finite Probability Spaces and Equally Likely Outcomes

The Classical Model: Symmetry and Equal Likelihood

The classical probability model assigns equal probability to every outcome. It is both the historically oldest probability model (Laplace, 1812) and the one that arises most naturally in information theory and coding: a uniformly chosen codeword, a randomly selected scheduling slot, a uniformly distributed hash input.

The power of the classical model is that it reduces probability to combinatorics: $\mathbb{P}(A) = |A|/|\Omega|$. Its limitation is the assumption of symmetry, which must be justified by the problem's physics or structure. When outcomes are not equally likely (a biased channel, a non-uniform source), one must use the general Kolmogorov framework from Section 1.3.

This section develops the classical model, explores when it applies, and uses it to derive several results that will reappear throughout the book.

Definition: Uniform Probability on a Finite Set

Let $\Omega = \{\omega_1, \ldots, \omega_N\}$ be a finite sample space with $N = |\Omega|$ outcomes. The uniform probability measure on $(\Omega, 2^\Omega)$ assigns
$$\mathbb{P}(\{\omega_i\}) = \frac{1}{N} \quad \text{for all } i = 1, \ldots, N.$$
For any event $A \subseteq \Omega$,
$$\mathbb{P}(A) = \frac{|A|}{|\Omega|} = \frac{|A|}{N}.$$

The triple $(\Omega, 2^\Omega, \mathbb{P})$ with this $\mathbb{P}$ is called the classical probability space or equally likely outcomes model.

We verify Kolmogorov's axioms: non-negativity ($|A|/N \geq 0$) and normalization ($|\Omega|/N = 1$) are immediate. Countable additivity reduces to finite additivity for finite $\Omega$: for disjoint $A_1, \ldots, A_k$, $|\bigcup_i A_i| = \sum_i |A_i|$, so $\mathbb{P}(\bigcup_i A_i) = \sum_i \mathbb{P}(A_i)$. $\checkmark$


Example: Uniform Distribution on a Die

Roll a fair die ($\Omega = \{1,2,3,4,5,6\}$, uniform). (a) Find $\mathbb{P}(\text{even})$. (b) Find $\mathbb{P}(\text{outcome} \leq 3)$. (c) Find $\mathbb{P}(\{\text{even}\} \cap \{\text{outcome} \leq 3\})$. (d) Find $\mathbb{P}(\{\text{even}\} \cup \{\text{outcome} \leq 3\})$.
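Parts (a)-(d) reduce to counting set sizes. A minimal sketch of that computation, treating events as plain Python sets (the names `omega`, `even`, and `prob` are illustrative, not from the text):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}                    # sample space of a fair die
even = {w for w in omega if w % 2 == 0}       # {2, 4, 6}
at_most_3 = {w for w in omega if w <= 3}      # {1, 2, 3}

def prob(event):
    """Classical model: P(A) = |A| / |Omega|."""
    return Fraction(len(event), len(omega))

print(prob(even))                  # (a) 1/2
print(prob(at_most_3))             # (b) 1/2
print(prob(even & at_most_3))      # (c) 1/6  (only outcome 2 is in both)
print(prob(even | at_most_3))      # (d) 5/6  (inclusion-exclusion: 1/2 + 1/2 - 1/6)
```

Note that (d) confirms inclusion-exclusion: $\mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A \cap B)$.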

Example: Random Codebook Selection

Shannon's random coding argument: a binary codebook $\mathcal{C}$ consists of $M = 2^{nR}$ codewords, each chosen independently and uniformly from $\{0,1\}^n$. Compute the probability that two specified codewords $\mathbf{c}_i$ and $\mathbf{c}_j$ ($i \neq j$) are identical.
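Since the two codewords are independent and uniform, the collision probability is $\sum_{\mathbf{w} \in \{0,1\}^n} 2^{-n} \cdot 2^{-n} = 2^{-n}$. A quick numerical check of this answer, comparing the exact value with a Monte Carlo estimate (function names are illustrative):

```python
import random

def collision_prob_exact(n: int) -> float:
    # Sum over all 2^n words w of P(c_i = w) * P(c_j = w) = 2^n * (2^-n)^2
    return 2.0 ** (-n)

def collision_prob_mc(n: int, trials: int = 200_000, seed: int = 1) -> float:
    """Estimate P(c_i = c_j) by drawing independent uniform n-bit words."""
    rng = random.Random(seed)
    hits = sum(rng.getrandbits(n) == rng.getrandbits(n) for _ in range(trials))
    return hits / trials

n = 8
print(collision_prob_exact(n))   # 0.00390625 = 2^-8
print(collision_prob_mc(n))      # close to 2^-8, up to Monte Carlo noise
```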

Classical Model vs. General Probability Space

| Property | Classical (Uniform) Model | General Kolmogorov Model |
|---|---|---|
| Sample space | Finite $\Omega = \{\omega_1,\ldots,\omega_N\}$ | Finite, countable, or uncountable |
| Sigma-algebra | Power set $2^\Omega$ | Any sigma-algebra $\mathcal{F}$ |
| Probability of outcome | $1/N$ (by symmetry) | Arbitrary $p_i \geq 0$ with $\sum_i p_i = 1$ |
| Probability of event | $\lvert A\rvert/N$ (counting) | Sum or integral over $A$ |
| Computing $\mathbb{P}(A)$ | Combinatorics | Integration or summation |
| When applicable | Symmetric/equally likely outcomes | Always (with a specified model) |
| Example | Fair die, random codeword | Rayleigh fading, Gaussian noise |

Why This Matters: Uniform Randomness in Information Theory and Coding

The classical probability model is not just a pedagogical exercise; it is a key structural assumption in information theory. Shannon's channel capacity theorem is proved by showing that a uniformly random codebook achieves a vanishing error probability as the block length $n \to \infty$, for any rate $R < C$. The argument rests on two ingredients: the uniform distribution over codewords and the law of large numbers (which we prove in Chapter 11).

Similarly, in MIMO systems, the capacity-achieving input distribution for an i.i.d. Rayleigh fading channel is Gaussian — but the discrete approximation is a uniform distribution over a QAM constellation. Understanding when "all outcomes equally likely" is a valid model is thus essential for interpreting every capacity formula and every simulation result.

Theorem: Hypergeometric Probability

An urn contains $N$ balls: $K$ are red and $N-K$ are blue. Draw $n$ balls without replacement. The probability that exactly $k$ of the drawn balls are red is
$$\mathbb{P}(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}, \quad k = \max(0,\, n-(N-K)), \ldots, \min(n, K).$$

The denominator counts all equally likely size-$n$ subsets of the $N$ balls. The numerator counts subsets with exactly $k$ red balls: choose $k$ of the $K$ red balls ($\binom{K}{k}$ ways) and $n-k$ of the $N-K$ blue balls ($\binom{N-K}{n-k}$ ways).
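The formula is a direct ratio of binomial coefficients, so it can be evaluated exactly with integer arithmetic. A sketch (the function name and the $N=10$, $K=4$, $n=3$ numbers are illustrative, not from the text):

```python
from math import comb
from fractions import Fraction

def hypergeom_pmf(N: int, K: int, n: int, k: int) -> Fraction:
    """P(X = k): exactly k red balls in n draws without replacement."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return Fraction(0)          # k outside the support
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

# Example: 10 balls, 4 red, draw 3; P(exactly 2 red) = C(4,2)C(6,1)/C(10,3)
print(hypergeom_pmf(10, 4, 3, 2))   # 3/10

# Sanity check: the pmf sums to 1 over the support
print(sum(hypergeom_pmf(10, 4, 3, k) for k in range(4)))   # 1
```

The sum-to-one check is a useful habit: it exercises the support bounds $k = \max(0, n-(N-K)), \ldots, \min(n, K)$ as well as the counting formula itself.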

🎓 CommIT Contribution (2021)

Random Coding and the Probabilistic Method in Information Theory

G. Caire, Foundations and Trends in Communications and Information Theory

The random coding argument — assigning a uniform probability measure to the codebook ensemble and showing that the average error probability vanishes — is the central proof technique of information theory. It was introduced by Shannon in 1948 and refined in the following decades into a systematic methodology. The probabilistic method here is not just a proof tool but a constructive strategy: a randomly chosen code is likely to be good, even if we cannot exhibit the best code explicitly. This connects directly to the counting arguments in this section: random codebooks are uniform distributions over finite sets, and every bound in the proof uses the inclusion-exclusion and union bound from Section 1.3.

random-coding · channel-capacity · information-theory

Common Mistake: The Classical Model Requires Genuine Symmetry

Mistake:

A student argues: "There are two outcomes for a biased coin — Heads or Tails — so each has probability 1/2." Or, more subtly: "The channel either makes an error or it doesn't, so the error probability is 1/2."

Correction:

The classical model is valid only when the outcomes are genuinely symmetric, that is, when the physical mechanism treats all outcomes equally. A biased coin has $\mathbb{P}(H) = p \neq 1/2$; the number of outcomes ($|\Omega| = 2$) does not determine their probabilities. For a channel, the error probability $p_e$ is a property of the channel and the signaling scheme, not something we can assign by counting. The classical model applies when: (a) a random experiment is explicitly designed to be fair (shuffled cards, fair dice, uniform random codebook), or (b) physical symmetry enforces equal likelihood (crystal symmetry, isotropic noise).

🔧Engineering Note

Uniform Random Scheduling in Wireless Systems

In multi-user wireless systems, a simple and fair scheduling policy is round-robin or uniform random selection: at each time slot, the base station selects one of $K$ users uniformly at random to serve. Under this classical model, each user is served with probability $1/K$ per slot.

The probability that a given user is NOT served for $n$ consecutive slots is $(1 - 1/K)^n \leq e^{-n/K}$ (using $1-x \leq e^{-x}$). By the union bound over $K$ users, the probability that any user waits more than $n = 2K \ln K$ slots is at most $K \cdot e^{-n/K} = K \cdot K^{-2} = 1/K \to 0$.

This elementary calculation — uniform distribution, union bound, exponential bound — illustrates the practical reach of the tools in this chapter.
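The chain of bounds above is easy to check numerically. A minimal sketch (function name and the sample values of $K$ are illustrative) that evaluates the union bound at the threshold $n = 2K \ln K$:

```python
import math

def prob_any_user_starved(K: int, n: int) -> float:
    """Union bound on P(some user is unserved for n consecutive slots)."""
    p_one_user = (1 - 1 / K) ** n      # exact per-user probability
    return min(1.0, K * p_one_user)    # union bound over K users

for K in (10, 100, 1000):
    n = math.ceil(2 * K * math.log(K))           # the n = 2K ln K threshold
    print(K, n, prob_any_user_starved(K, n))     # bound at most 1/K, shrinking in K
```

Since $(1-1/K)^n \leq e^{-n/K} \leq K^{-2}$ at this threshold, each printed bound is at most $1/K$, confirming the vanishing-probability claim.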

Practical Constraints
  • Uniform random scheduling is throughput-optimal only under symmetric channels

  • Proportional fair scheduling (Chapter 11 of Book FSI) adapts weights based on channel quality

Quick Check

A packet of 20 chips contains 5 defective and 15 good ones. An inspector randomly selects 4 chips without replacement. What is the probability that exactly 2 are defective?

$\binom{5}{2}\binom{15}{2}/\binom{20}{4}$

$\binom{5}{2}(1/4)^2(3/4)^2$

$(5/20)(4/19)$

$\binom{20}{4}/\bigl(\binom{5}{2}\binom{15}{2}\bigr)$

Key Takeaway

The classical model $\mathbb{P}(A) = |A|/|\Omega|$ converts probability to counting, but requires genuine symmetry. It underpins Shannon's random coding argument, uniform scheduling analysis, and birthday-type collision bounds. Whenever a problem says "uniformly at random" or "equally likely," the classical model applies and the calculation reduces to combinatorics.