The Probabilistic Method
Existence Proofs Without Explicit Construction
The probabilistic method is one of the most powerful techniques in combinatorics. The idea is disarmingly simple: to prove that a combinatorial object with some property exists, we define a random experiment over all objects and show that the expected value of some quantity makes it impossible for every object to violate the property.
The argument never constructs the object — it merely proves one must exist. This is existence proof via probability. Paul Erdős pioneered this method in the 1940s and 1950s, and Shannon used it in 1948 to prove the channel coding theorem — perhaps its most consequential application.
Theorem: The Probabilistic Method
Let $\Omega$ be a finite set and $X$ a random variable uniformly distributed over $\Omega$. If $\mathbb{E}[f(X)] < c$ for some function $f : \Omega \to \mathbb{R}$, then there exists at least one $\omega \in \Omega$ such that $f(\omega) < c$.
Equivalently: if $\mathbb{E}[f(X)] \le c$, then $\min_{\omega \in \Omega} f(\omega) \le c$, and in particular some $\omega$ satisfies $f(\omega) \le c$.
The average value of $f$ is below $c$; therefore $f$ cannot be at least $c$ everywhere: at least one point must lie below the average. This trivial observation, applied to the right random experiment, yields non-trivial existence results.
If $f(\omega) \ge c$ for every $\omega \in \Omega$, what is the minimum possible value of $\mathbb{E}[f(X)]$?
The minimum is $c$. Contrapose: assume every $\omega$ has $f(\omega) \ge c$. Then $\mathbb{E}[f(X)] \ge c$, contradicting any hypothesis that $\mathbb{E}[f(X)] < c$.
Proof by contradiction
Suppose for contradiction that $f(\omega) \ge c$ for every $\omega \in \Omega$. Then
$$\mathbb{E}[f(X)] = \frac{1}{|\Omega|} \sum_{\omega \in \Omega} f(\omega) \ge c.$$
This contradicts $\mathbb{E}[f(X)] < c$. Therefore there exists $\omega \in \Omega$ with $f(\omega) < c$.
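The averaging argument can be seen concretely in a small experiment: randomly 2-color the edges of the complete graph $K_8$ and count monochromatic triangles. Each triangle is monochromatic with probability $2/2^3 = 1/4$, so the expected count is $\binom{8}{3}/4 = 14$, and some coloring attains at most the average. A minimal Python sketch (all names illustrative):

```python
import random
from itertools import combinations

def mono_triangles(n, color):
    """Count triangles whose three edges all received the same color."""
    return sum(
        1 for a, b, c in combinations(range(n), 3)
        if color[(a, b)] == color[(b, c)] == color[(a, c)]
    )

def random_coloring(n):
    """Color each edge of K_n red (0) or blue (1) by a fair coin flip."""
    return {e: random.randrange(2) for e in combinations(range(n), 2)}

n, trials = 8, 200
counts = [mono_triangles(n, random_coloring(n)) for _ in range(trials)]
avg = sum(counts) / trials
# E[count] = C(8,3) / 4 = 14; the minimum cannot exceed the average.
assert min(counts) <= avg
```

The assertion is exactly the probabilistic method: no coloring is exhibited in advance, yet one at or below the average must occur.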
Key Takeaway
The probabilistic method converts a bound on the expectation $\mathbb{E}[f(X)]$ into an existence guarantee. The random experiment is a proof device, not a description of how to find the good object.
Historical Note: Erdős and the Birth of the Probabilistic Method
Paul Erdős (1913–1996) introduced the probabilistic method in a 1947 paper showing that Ramsey numbers satisfy $R(k, k) > 2^{k/2}$: large graphs can simultaneously avoid both a clique of size $k$ and an independent set of size $k$, which seems impossible to construct but is easy to prove via a random coloring argument. Erdős had no explicit construction; he only showed that a random coloring succeeds with positive probability. Over the following decades, he and his collaborators applied this technique to dozens of seemingly intractable combinatorial problems. The method has since become foundational in coding theory, cryptography, and algorithm design.
Definition: Random Codebook
A binary random codebook of rate $R$ and blocklength $n$ is generated by independently choosing $2^{nR}$ codewords $X^n(1), \ldots, X^n(2^{nR})$, each component drawn i.i.d.\ from $\mathrm{Bernoulli}(1/2)$.
The codebook is a random object. A fixed codebook is any specific realization of this process. Shannon's argument shows that the expected error probability over the random ensemble is small whenever $R < C$ (the channel capacity), proving a good codebook exists.
This definition will be revisited and made precise in Book ITA. Here we emphasize the probabilistic existence argument only.
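The generation process in the definition is easy to sketch in code; the function name and parameters below are illustrative, not from Book ITA:

```python
import random

def random_codebook(n, R, seed=0):
    """Draw 2^(nR) codewords of length n, each bit i.i.d. Bernoulli(1/2)."""
    rng = random.Random(seed)
    M = 2 ** round(n * R)      # number of messages (assumes nR is an integer)
    return [[rng.randrange(2) for _ in range(n)] for _ in range(M)]

book = random_codebook(n=16, R=0.25)   # 2^4 = 16 codewords of 16 bits each
assert len(book) == 16 and all(len(c) == 16 for c in book)
```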
Theorem: Shannon's Random Coding Existence Argument (Preview)
Consider a binary symmetric channel (BSC) with crossover probability $p$. The capacity is $C = 1 - H(p)$ bits per channel use, where $H(p) = -p \log_2 p - (1-p) \log_2 (1-p)$ is the binary entropy. For any rate $R < C$ and $\epsilon > 0$, there exists a code of rate $R$ and blocklength $n$ (sufficiently large) such that the maximum probability of error is at most $\epsilon$.
Proof sketch (probabilistic method): Choose a random codebook of rate $R < C$. The expected probability of error over the random ensemble satisfies $\mathbb{E}[P_e] \to 0$ as $n \to \infty$. Since the expected error is small, there exists a codebook whose error probability is at most the ensemble average.
A random code is almost as good as the best possible code when $n$ is large. The probabilistic method guarantees the existence of good codes without telling us which specific codebook is good. Finding efficient constructive codes that match Shannon's bound (polar codes, LDPC codes) took another 60 years.
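The capacity expression $C = 1 - H(p)$ is straightforward to evaluate numerically; a small sketch:

```python
from math import log2

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of the binary symmetric channel: C = 1 - H(p)."""
    return 1.0 - binary_entropy(p)

assert bsc_capacity(0.0) == 1.0            # noiseless channel
assert abs(bsc_capacity(0.5)) < 1e-12      # pure-noise channel
print(round(bsc_capacity(0.11), 3))        # ≈ 0.5 bits per channel use
```

Note the endpoints: a noiseless BSC carries one full bit per use, while at $p = 1/2$ the output is independent of the input and the capacity vanishes.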
If $\mathbb{E}[P_e] \le \epsilon$, can $P_e > 2\epsilon$ hold for every codebook?
No. By Markov's inequality (a quantitative form of the probabilistic method), at least half of all codebooks achieve $P_e \le 2\epsilon$.
Random codebook
Draw $2^{nR}$ codewords i.i.d.\ uniformly from $\{0,1\}^n$. Transmit codeword $X^n(m)$ over the BSC. The received sequence is $Y^n = X^n(m) \oplus Z^n$, where $Z^n$ has i.i.d.\ $\mathrm{Bernoulli}(p)$ components.
Decoding rule
Use typical-set decoding: declare $\hat{m} = m'$ if $X^n(m')$ is the unique codeword whose normalized Hamming distance to $Y^n$ lies within $\delta$ of $p$, for some small $\delta > 0$.
Error probability bound via union bound
An error occurs if the correct codeword is not typical with $Y^n$ (a probability that decays exponentially in $n$ by the law of large numbers) or some incorrect codeword is typical with $Y^n$. By the union bound over the $2^{nR} - 1$ wrong codewords:
$$P_e \le \Pr[X^n(m) \text{ not typical with } Y^n] + \sum_{m' \ne m} \Pr[X^n(m') \text{ typical with } Y^n].$$
Expectation over random codebook
Taking the expectation over the random codebook, the individual wrong-codeword probabilities become $\Pr[X^n(m') \text{ typical with } Y^n] \le 2^{-n(1 - H(p) - \delta')} = 2^{-n(C - \delta')}$, since the wrong codewords are independent of the channel noise. With $2^{nR}$ codewords:
$$\mathbb{E}[P_e] \le 2^{-n\alpha} + 2^{nR} \cdot 2^{-n(C - \delta')}$$
for some $\alpha > 0$. Choosing $R < C - \delta'$ makes both terms decay exponentially to 0. By the probabilistic method, a good codebook exists.
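The whole argument can be checked empirically at toy scale: draw a fresh random codebook per trial, send a random message through a BSC, and decode to the nearest codeword. Minimum-distance decoding stands in for typical-set decoding here, and all parameter values are illustrative:

```python
import random

def simulate_random_code(n=14, R=0.25, p=0.05, trials=2000, seed=1):
    """Monte Carlo estimate of the ensemble-average error probability of a
    random code on a BSC(p), using minimum-Hamming-distance decoding."""
    rng = random.Random(seed)
    M = 2 ** round(n * R)                    # 2^(nR) messages
    errors = 0
    for _ in range(trials):
        book = [rng.getrandbits(n) for _ in range(M)]  # fresh random codebook
        m = rng.randrange(M)                 # message to transmit
        noise = sum((rng.random() < p) << i for i in range(n))
        y = book[m] ^ noise                  # BSC output: flip bits w.p. p
        m_hat = min(range(M), key=lambda k: bin(book[k] ^ y).count("1"))
        errors += (m_hat != m)
    return errors / trials

pe = simulate_random_code()
print(pe)    # typically a few percent at these toy settings
```

Here $R = 0.25$ is well below $C = 1 - H(0.05) \approx 0.71$, so even at this tiny blocklength the average error is small; pushing $R$ toward and past $C$ makes it climb.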
Connection to Book ITA
The argument above is a sketch of Shannon's 1948 random coding proof. Book ITA (Information Theory and Applications) develops this rigorously: it defines typical sequences and their counting properties (method of types), states and proves the channel coding theorem in full generality for memoryless channels, and analyzes the error exponent (the rate at which $P_e \to 0$ as $n \to \infty$). The key insight here is the probabilistic method: the existence of a good code follows from the smallness of the average error over random codebooks, without constructing one.
Shannon's Random Coding: Expected Error vs. Rate and Blocklength
[Interactive figure: the expected probability of error for a random BSC codebook as a function of rate $R$ and blocklength $n$. For $R < C$ the error decays to zero; for $R > C$ it approaches 1. The phase transition at $R = C$ is the capacity threshold.]
Example: Tournament Scheduling via the Probabilistic Method
A tournament on $n$ players is a complete directed graph: every pair of players has exactly one directed edge (pointing from the winner of their match). A Hamiltonian path in a tournament is an ordering $\pi(1), \ldots, \pi(n)$ of all players such that each player beats the next. Prove that every tournament contains a Hamiltonian path (a dominant ranking exists).
Random permutation
Choose a uniformly random permutation $\pi$ of the players. Let $V$ be the number of violations: pairs of consecutive players $(\pi(i), \pi(i+1))$ where $\pi(i+1)$ beats $\pi(i)$.
Expected violations
For each consecutive pair $(\pi(i), \pi(i+1))$, the probability that $\pi(i+1)$ beats $\pi(i)$ is $1/2$ (by symmetry of the random permutation). There are $n - 1$ pairs, so
$$\mathbb{E}[V] = \frac{n-1}{2}.$$
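This expectation is easy to confirm by simulation over any fixed tournament; the helper names below are illustrative:

```python
import random

def violations(perm, beats):
    """Count consecutive pairs where the later player beats the earlier one."""
    return sum(beats[perm[i + 1]][perm[i]] for i in range(len(perm) - 1))

def random_tournament(n, rng):
    """Orient each pair of players by a fair coin flip."""
    beats = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = rng.random() < 0.5
            beats[i][j], beats[j][i] = w, not w
    return beats

rng = random.Random(0)
n, trials = 9, 4000
beats = random_tournament(n, rng)
total = 0
for _ in range(trials):
    perm = list(range(n))
    rng.shuffle(perm)
    total += violations(perm, beats)
avg = total / trials
print(avg)   # close to (n - 1) / 2 = 4
```

Note the expectation is $(n-1)/2$ for any fixed tournament: the randomness lives entirely in the permutation.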
Existence of Hamiltonian path
Since $\mathbb{E}[V] = (n-1)/2$, there exists a permutation with at most $(n-1)/2$ violations. But we need $V = 0$, so the averaging argument alone does not finish the proof. We proceed differently: insert the players one at a time, placing each new player immediately before the first player in the current path that they beat (or at the end if they beat no one). This greedy insertion maintains a Hamiltonian path at every step, so every tournament contains one.
Remark: The probabilistic argument shows only that the average permutation has $(n-1)/2$ violations; here an explicit construction completes the proof. A genuinely probabilistic tournament result is Szele's: the expected number of Hamiltonian paths in a uniformly random tournament is $n!/2^{n-1}$, so some tournament has at least that many.
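The greedy insertion construction can be sketched in a few lines of Python (names illustrative); each new player goes in front of the first current player they beat, preserving the path property:

```python
def hamiltonian_path(beats):
    """Build a Hamiltonian path in a tournament by greedy insertion:
    place each new player before the first current player they beat,
    or at the end if they beat no one already on the path."""
    path = [0]
    for v in range(1, len(beats)):
        for i, u in enumerate(path):
            if beats[v][u]:
                path.insert(i, v)
                break
        else:
            path.append(v)
    return path

# 3-cycle tournament: 0 beats 1, 1 beats 2, 2 beats 0.
beats = [[False, True, False],
         [False, False, True],
         [True, False, False]]
path = hamiltonian_path(beats)
# Every consecutive player beats the next.
assert all(beats[path[i]][path[i + 1]] for i in range(len(path) - 1))
print(path)  # → [2, 0, 1]
```

Correctness of the insertion: if the new player $v$ goes before the first player $u_i$ they beat, then every earlier player on the path beats $v$, so both new edges point the right way.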
Common Mistake: The Probabilistic Method Is Not Constructive
Mistake:
After proving that a good codebook exists via the probabilistic method, one might think we now have a practical coding scheme. In fact, we only know the codebook exists — we have no efficient algorithm to find it.
Correction:
The probabilistic method provides an existence guarantee only. Practical coding schemes (turbo codes, LDPC codes, polar codes) are constructed using algebraic or iterative methods and proven separately to achieve capacity. Book ITA covers polar codes (Erdal Arıkan, 2009), the first capacity-achieving codes with an efficient construction and polynomial-time encoder/decoder.
Quick Check
The probabilistic method proves existence of a good object by showing:
The expected value of some quality measure is below a threshold
The probability of a good object existing equals 1
An explicit construction algorithm succeeds with high probability
The worst-case object has quality below the threshold
If $\mathbb{E}[f(X)] < c$, then $f(\omega)$ cannot be at least $c$ for every $\omega$, so at least one good object exists. This is the core of the method.
Why We Do Not Use Random Codes in Practice
Shannon's random coding argument proves that good codes exist, but a uniformly random codebook of rate $R$ and blocklength $n$ has $2^{nR}$ codewords each of length $n$ bits, so the codebook storage alone requires $n \cdot 2^{nR}$ bits. At rate $R = 1/2$ and $n = 1024$ (a modest blocklength for 5G NR), storing the codebook would require $1024 \cdot 2^{512} \approx 10^{157}$ bits, far beyond any imaginable storage. Practical codes (polar, LDPC, convolutional) have algebraic structure so the encoder and decoder require only near-linear operations and storage.
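The storage arithmetic can be checked directly with Python's arbitrary-precision integers (the parameter values are illustrative):

```python
def codebook_bits(n, R):
    """Storage for an unstructured random codebook: 2^(nR) codewords, n bits each."""
    return n * 2 ** round(n * R)

total = codebook_bits(1024, 0.5)   # 1024 * 2^512 = 2^522 bits
print(len(str(total)))             # 158 decimal digits
```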
- Polar codes (3GPP NR control channels) achieve capacity with $O(n \log n)$ encoder and decoder complexity
- LDPC codes (3GPP NR data channels) are capacity-approaching with sparse parity-check matrices and a decoder that costs $O(n)$ per iteration
- 5G NR uses polar codes for PBCH and PDCCH, and LDPC codes for PDSCH/PUSCH (TS 38.212)