Chapter Summary

Key Points

1. The probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is the fundamental object. The sample space $\Omega$ lists all outcomes; the sigma-algebra $\mathcal{F}$ specifies which subsets can be assigned a probability; the probability measure $\mathbb{P}$ assigns numbers in $[0,1]$ to events in $\mathcal{F}$. For finite or countably infinite $\Omega$, use the power set for $\mathcal{F}$. For uncountable $\Omega$ (for example $\mathbb{R}$ or an interval), use the Borel sigma-algebra.
2. Kolmogorov's three axioms generate all of probability theory. Non-negativity, normalization, and countable additivity imply the complementation rule, monotonicity, inclusion-exclusion, the union bound, continuity of probability, and the Borel-Cantelli lemmas. Every formula in probability is a derived consequence of these three rules.
3. The union bound is the most-used inequality in communications. For any events $A_1, A_2, \ldots$: $\mathbb{P}\left(\bigcup_k A_k\right) \leq \sum_k \mathbb{P}(A_k)$. It bounds the block error probability by the sum of pairwise error probabilities, bounds the collision probability in random codebooks, and more generally bounds the probability that at least one of a collection of bad events occurs.
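4. The four sampling paradigms cover the standard counting problems of combinatorial probability. Drawing $k$ items from $n$, ordered/unordered $\times$ with/without replacement gives counts $n^k$ (ordered, with replacement), $n!/(n-k)!$ (ordered, without replacement), $\binom{n+k-1}{k}$ (unordered, with replacement), and $\binom{n}{k}$ (unordered, without replacement). The birthday problem, the hypergeometric distribution, and Shannon's random coding argument all reduce to one of these four cases.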
5. The classical model $\mathbb{P}(A) = |A|/|\Omega|$ requires genuine symmetry. It applies when outcomes are physically equivalent (fair dice, random codeword selection, uniform scheduling) and converts probability to counting. When outcomes are not equally likely, use the general Kolmogorov model.
6. Sigma-algebras exclude non-measurable sets, not physical events. In practice, every event defined by finitely many inequalities on continuous random variables is a Borel set and hence measurable. The sigma-algebra structure matters theoretically (Vitali's non-measurable set) and computationally (integrals of PDFs are taken over Borel sets), but it does not obstruct any physical calculation.
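
Key Points 3 and 4 lend themselves to a quick numerical check. The Python sketch below is an illustration under stated assumptions, not part of the chapter's formal development: the function names (`sampling_counts`, `enumerated_counts`, `birthday_collision`, `birthday_union_bound`) are hypothetical, and the birthday setting assumes $k$ independent draws, each uniform over $n$ values (the classical model). It compares the four closed-form counts against brute-force enumeration and compares the exact birthday collision probability with the union bound taken over the $\binom{k}{2}$ pairwise events "draw $i$ equals draw $j$", each of probability $1/n$.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)
from math import comb, perm


def sampling_counts(n: int, k: int) -> dict:
    """Closed-form counts for drawing k items from n in the four paradigms."""
    return {
        "ordered_with": n ** k,                # n^k
        "ordered_without": perm(n, k),         # n! / (n-k)!
        "unordered_with": comb(n + k - 1, k),  # C(n+k-1, k)
        "unordered_without": comb(n, k),       # C(n, k)
    }


def enumerated_counts(n: int, k: int) -> dict:
    """The same four counts, obtained by listing every outcome (small n, k only)."""
    items = range(n)
    return {
        "ordered_with": sum(1 for _ in product(items, repeat=k)),
        "ordered_without": sum(1 for _ in permutations(items, k)),
        "unordered_with": sum(1 for _ in combinations_with_replacement(items, k)),
        "unordered_without": sum(1 for _ in combinations(items, k)),
    }


def birthday_collision(n: int, k: int) -> float:
    """Exact P(at least two of k uniform draws from n values coincide)."""
    if k > n:
        return 1.0  # pigeonhole: a repeat is certain
    # Classical model: |no repeats| / |all ordered draws| = perm(n, k) / n^k
    return 1.0 - perm(n, k) / n ** k


def birthday_union_bound(n: int, k: int) -> float:
    """Union bound over the C(k, 2) pairwise collision events, capped at 1."""
    return min(1.0, comb(k, 2) / n)


if __name__ == "__main__":
    print(sampling_counts(5, 3))
    print(enumerated_counts(5, 3))
    for k in (5, 10, 23):
        print(k, birthday_collision(365, k), birthday_union_bound(365, k))
```

For small $k$ the union bound tracks the exact collision probability closely; as $k$ grows the pairwise events overlap more and the bound loosens, which is the usual price of ignoring intersections.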

Looking Ahead

Chapter 2 builds the conditional structure on top of this axiom system: conditional probability $\mathbb{P}(A \mid B)$, the law of total probability, Bayes' theorem, and statistical independence. These concepts are the probabilistic engine of Bayesian inference, channel capacity proofs, and every detection and estimation algorithm in Book FSI.

Chapter 3 connects the counting tools developed here to reliability theory and the probabilistic method, including Shannon's random coding argument, the most celebrated application of "expected value over a random codebook".

Finite Probability Spaces and Equally Likely Outcomes Exercises