The Probabilistic Method
Existence Proofs Without Explicit Construction
The probabilistic method is one of the most powerful techniques in combinatorics. The idea is disarmingly simple: to prove that a combinatorial object with some property exists, we define a random experiment over all objects and show that the expected value of some quantity makes it impossible for every object to violate the property.
The argument never constructs the object — it merely proves one must exist. This is existence proof via probability. Paul Erdős pioneered this method in the 1940s and 1950s, and Shannon used it in 1948 to prove the channel coding theorem — perhaps its most consequential application.
Theorem: The Probabilistic Method
Let $\Omega$ be a finite set and $X$ a random variable uniformly distributed over $\Omega$. If $\mathbb{E}[f(X)] < c$ for some function $f : \Omega \to \mathbb{R}$, then there exists at least one $\omega \in \Omega$ such that $f(\omega) < c$.
Equivalently: if $\mathbb{E}[f(X)] \le c$, then $\min_{\omega \in \Omega} f(\omega) \le c$, and in particular some $\omega$ satisfies $f(\omega) \le c$.
The average value of $f$ is below $c$; therefore $f$ cannot be at least $c$ everywhere: at least one point must lie below the average. This trivial observation, applied to the right random experiment, yields non-trivial existence results.
If $f(\omega) \ge c$ for every $\omega \in \Omega$, what is the minimum possible value of $\mathbb{E}[f(X)]$?
The minimum is $c$. Contrapose: assume every $\omega$ has $f(\omega) \ge c$. Then $\mathbb{E}[f(X)] \ge c$, contradicting any hypothesis that $\mathbb{E}[f(X)] < c$.
Proof by contradiction
Suppose for contradiction that $f(\omega) \ge c$ for every $\omega \in \Omega$. Then
$$\mathbb{E}[f(X)] = \frac{1}{|\Omega|} \sum_{\omega \in \Omega} f(\omega) \ge c.$$
This contradicts $\mathbb{E}[f(X)] < c$. Therefore there exists $\omega \in \Omega$ with $f(\omega) < c$.
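The averaging argument can be seen concretely in a small experiment: randomly 2-color the edges of the complete graph $K_8$ and count monochromatic triangles. Each triangle is monochromatic with probability $2/2^3 = 1/4$, so the expected count is $\binom{8}{3}/4 = 14$, and some coloring attains at most the average. A minimal Python sketch (all names illustrative):

```python
import random
from itertools import combinations

def mono_triangles(n, color):
    """Count triangles whose three edges all received the same color."""
    return sum(
        1 for a, b, c in combinations(range(n), 3)
        if color[(a, b)] == color[(b, c)] == color[(a, c)]
    )

def random_coloring(n):
    """Color each edge of K_n red (0) or blue (1) by a fair coin flip."""
    return {e: random.randrange(2) for e in combinations(range(n), 2)}

n, trials = 8, 200
counts = [mono_triangles(n, random_coloring(n)) for _ in range(trials)]
avg = sum(counts) / trials
# E[count] = C(8,3) / 4 = 14; the minimum cannot exceed the average.
assert min(counts) <= avg
```

The assertion is exactly the probabilistic method: no coloring is exhibited in advance, yet one at or below the average must occur.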
Key Takeaway
The probabilistic method converts a bound on the expectation $\mathbb{E}[f(X)]$ into an existence guarantee. The random experiment is a proof device, not a description of how to find the good object.
Historical Note: Erdős and the Birth of the Probabilistic Method
Paul Erdős (1913–1996) introduced the probabilistic method in a 1947 paper showing that Ramsey numbers satisfy $R(k, k) > 2^{k/2}$: large graphs can simultaneously avoid both a clique of size $k$ and an independent set of size $k$, which seems impossible to construct but is easy to prove via a random coloring argument. Erdős had no explicit construction; he only showed that a random coloring succeeds with positive probability. Over the following decades, he and his collaborators applied this technique to dozens of seemingly intractable combinatorial problems. The method has since become foundational in coding theory, cryptography, and algorithm design.
Definition: Random Codebook
A binary random codebook of rate $R$ and blocklength $n$ is generated by independently choosing $2^{nR}$ codewords $X^n(1), \ldots, X^n(2^{nR})$, each component drawn i.i.d.\ from $\mathrm{Bernoulli}(1/2)$.
The codebook is a random object. A fixed codebook is any specific realization of this process. Shannon's argument shows that the expected error probability over the random ensemble is small whenever $R < C$ (the channel capacity), proving a good codebook exists.
This definition will be revisited and made precise in Book ITA. Here we emphasize the probabilistic existence argument only.
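The generation process in the definition is easy to sketch in code; the function name and parameters below are illustrative, not from Book ITA:

```python
import random

def random_codebook(n, R, seed=0):
    """Draw 2^(nR) codewords of length n, each bit i.i.d. Bernoulli(1/2)."""
    rng = random.Random(seed)
    M = 2 ** round(n * R)      # number of messages (assumes nR is an integer)
    return [[rng.randrange(2) for _ in range(n)] for _ in range(M)]

book = random_codebook(n=16, R=0.25)   # 2^4 = 16 codewords of 16 bits each
assert len(book) == 16 and all(len(c) == 16 for c in book)
```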
Theorem: Shannon's Random Coding Existence Argument (Preview)
Consider a binary symmetric channel (BSC) with crossover probability $p$. The capacity is $C = 1 - H(p)$ bits per channel use, where $H(p) = -p \log_2 p - (1-p) \log_2 (1-p)$ is the binary entropy. For any rate $R < C$ and $\epsilon > 0$, there exists a code of rate $R$ and blocklength $n$ (sufficiently large) such that the maximum probability of error is at most $\epsilon$.
Proof sketch (probabilistic method): Choose a random codebook of rate $R < C$. The expected probability of error over the random ensemble satisfies $\mathbb{E}[P_e] \to 0$ as $n \to \infty$. Since the expected error is small, there exists a codebook whose error probability is at most the ensemble average.
A random code is almost as good as the best possible code when $n$ is large. The probabilistic method guarantees the existence of good codes without telling us which specific codebook is good. Finding efficient constructive codes that match Shannon's bound (polar codes, LDPC codes) took another 60 years.
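The capacity expression $C = 1 - H(p)$ is straightforward to evaluate numerically; a small sketch:

```python
from math import log2

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of the binary symmetric channel: C = 1 - H(p)."""
    return 1.0 - binary_entropy(p)

assert bsc_capacity(0.0) == 1.0            # noiseless channel
assert abs(bsc_capacity(0.5)) < 1e-12      # pure-noise channel
print(round(bsc_capacity(0.11), 3))        # ≈ 0.5 bits per channel use
```

Note the endpoints: a noiseless BSC carries one full bit per use, while at $p = 1/2$ the output is independent of the input and the capacity vanishes.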
If $\mathbb{E}[P_e] \le \epsilon$, can $P_e > 2\epsilon$ hold for every codebook?
No. By Markov's inequality (a quantitative form of the probabilistic method), at least half of all codebooks achieve $P_e \le 2\epsilon$.
Random codebook
Draw $2^{nR}$ codewords i.i.d.\ uniformly from $\{0,1\}^n$. Transmit codeword $X^n(m)$ over the BSC. The received sequence is $Y^n = X^n(m) \oplus Z^n$, where $Z^n$ has i.i.d.\ $\mathrm{Bernoulli}(p)$ components.
Decoding rule
Use typical-set decoding: declare $\hat{m} = m'$ if $X^n(m')$ is the unique codeword whose normalized Hamming distance to $Y^n$ lies within $\delta$ of $p$, for some small $\delta > 0$.
Error probability bound via union bound
An error occurs if the correct codeword is not typical with $Y^n$ (a probability that decays exponentially in $n$ by the law of large numbers) or some incorrect codeword is typical with $Y^n$. By the union bound over the $2^{nR} - 1$ wrong codewords:
$$P_e \le \Pr[X^n(m) \text{ not typical with } Y^n] + \sum_{m' \ne m} \Pr[X^n(m') \text{ typical with } Y^n].$$
Expectation over random codebook
Taking the expectation over the random codebook, the individual wrong-codeword probabilities become $\Pr[X^n(m') \text{ typical with } Y^n] \le 2^{-n(1 - H(p) - \delta')} = 2^{-n(C - \delta')}$, since the wrong codewords are independent of the channel noise. With $2^{nR}$ codewords:
$$\mathbb{E}[P_e] \le 2^{-n\alpha} + 2^{nR} \cdot 2^{-n(C - \delta')}$$
for some $\alpha > 0$. Choosing $R < C - \delta'$ makes both terms decay exponentially to 0. By the probabilistic method, a good codebook exists.
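The whole argument can be checked empirically at toy scale: draw a fresh random codebook per trial, send a random message through a BSC, and decode to the nearest codeword. Minimum-distance decoding stands in for typical-set decoding here, and all parameter values are illustrative:

```python
import random

def simulate_random_code(n=14, R=0.25, p=0.05, trials=2000, seed=1):
    """Monte Carlo estimate of the ensemble-average error probability of a
    random code on a BSC(p), using minimum-Hamming-distance decoding."""
    rng = random.Random(seed)
    M = 2 ** round(n * R)                    # 2^(nR) messages
    errors = 0
    for _ in range(trials):
        book = [rng.getrandbits(n) for _ in range(M)]  # fresh random codebook
        m = rng.randrange(M)                 # message to transmit
        noise = sum((rng.random() < p) << i for i in range(n))
        y = book[m] ^ noise                  # BSC output: flip bits w.p. p
        m_hat = min(range(M), key=lambda k: bin(book[k] ^ y).count("1"))
        errors += (m_hat != m)
    return errors / trials

pe = simulate_random_code()
print(pe)    # typically a few percent at these toy settings
```

Here $R = 0.25$ is well below $C = 1 - H(0.05) \approx 0.71$, so even at this tiny blocklength the average error is small; pushing $R$ toward and past $C$ makes it climb.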
Connection to Book ITA
The argument above is a sketch of Shannon's 1948 random coding proof. Book ITA (Information Theory and Applications) develops this rigorously: it defines typical sequences and their counting properties (method of types), states and proves the channel coding theorem in full generality for memoryless channels, and analyzes the error exponent (the rate at which $P_e \to 0$ as $n \to \infty$). The key insight here is the probabilistic method: the existence of a good code follows from the smallness of the average error over random codebooks, without constructing one.
Shannon's Random Coding: Expected Error vs. Rate and Blocklength
[Interactive figure: the expected probability of error for a random BSC codebook as a function of rate $R$ and blocklength $n$. For $R < C$ the error decays to zero; for $R > C$ it approaches 1. The phase transition at $R = C$ is the capacity threshold.]
Example: Tournament Scheduling via the Probabilistic Method
A tournament on $n$ players is a complete directed graph: every pair of players has exactly one directed edge (pointing from the winner of their match). A Hamiltonian path in a tournament is an ordering $\pi(1), \ldots, \pi(n)$ of all players such that each player beats the next. Prove that every tournament contains a Hamiltonian path (a dominant ranking exists).
Random permutation
Choose a uniformly random permutation $\pi$ of the players. Let $V$ be the number of violations: pairs of consecutive players $(\pi(i), \pi(i+1))$ where $\pi(i+1)$ beats $\pi(i)$.
Expected violations
For each consecutive pair $(\pi(i), \pi(i+1))$, the probability that $\pi(i+1)$ beats $\pi(i)$ is $1/2$ (by symmetry of the random permutation). There are $n - 1$ pairs, so
$$\mathbb{E}[V] = \frac{n-1}{2}.$$
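This expectation is easy to confirm by simulation over any fixed tournament; the helper names below are illustrative:

```python
import random

def violations(perm, beats):
    """Count consecutive pairs where the later player beats the earlier one."""
    return sum(beats[perm[i + 1]][perm[i]] for i in range(len(perm) - 1))

def random_tournament(n, rng):
    """Orient each pair of players by a fair coin flip."""
    beats = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = rng.random() < 0.5
            beats[i][j], beats[j][i] = w, not w
    return beats

rng = random.Random(0)
n, trials = 9, 4000
beats = random_tournament(n, rng)
total = 0
for _ in range(trials):
    perm = list(range(n))
    rng.shuffle(perm)
    total += violations(perm, beats)
avg = total / trials
print(avg)   # close to (n - 1) / 2 = 4
```

Note the expectation is $(n-1)/2$ for any fixed tournament: the randomness lives entirely in the permutation.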
Existence of Hamiltonian path
Since $\mathbb{E}[V] = (n-1)/2$, there exists a permutation with at most $(n-1)/2$ violations. But we need $V = 0$, so the averaging argument alone does not finish the proof. We proceed differently: insert the players one at a time, placing each new player immediately before the first player in the current path that they beat (or at the end if they beat no one). This greedy insertion maintains a Hamiltonian path at every step, so every tournament contains one.
Remark: The probabilistic argument shows only that the average permutation has $(n-1)/2$ violations; here an explicit construction completes the proof. A genuinely probabilistic tournament result is Szele's: the expected number of Hamiltonian paths in a uniformly random tournament is $n!/2^{n-1}$, so some tournament has at least that many.
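The greedy insertion construction can be sketched in a few lines of Python (names illustrative); each new player goes in front of the first current player they beat, preserving the path property:

```python
def hamiltonian_path(beats):
    """Build a Hamiltonian path in a tournament by greedy insertion:
    place each new player before the first current player they beat,
    or at the end if they beat no one already on the path."""
    path = [0]
    for v in range(1, len(beats)):
        for i, u in enumerate(path):
            if beats[v][u]:
                path.insert(i, v)
                break
        else:
            path.append(v)
    return path

# 3-cycle tournament: 0 beats 1, 1 beats 2, 2 beats 0.
beats = [[False, True, False],
         [False, False, True],
         [True, False, False]]
path = hamiltonian_path(beats)
# Every consecutive player beats the next.
assert all(beats[path[i]][path[i + 1]] for i in range(len(path) - 1))
print(path)  # → [2, 0, 1]
```

Correctness of the insertion: if the new player $v$ goes before the first player $u_i$ they beat, then every earlier player on the path beats $v$, so both new edges point the right way.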
Common Mistake: The Probabilistic Method Is Not Constructive
Mistake:
After proving that a good codebook exists via the probabilistic method, one might think we now have a practical coding scheme. In fact, we only know the codebook exists — we have no efficient algorithm to find it.
Correction:
The probabilistic method provides an existence guarantee only. Practical coding schemes (turbo codes, LDPC codes, polar codes) are constructed using algebraic or iterative methods and proven separately to achieve capacity. Book ITA covers polar codes (Erdal Arıkan, 2009), the first capacity-achieving codes with an efficient construction and polynomial-time encoder/decoder.
Quick Check
The probabilistic method proves existence of a good object by showing:
The expected value of some quality measure is below a threshold
The probability of a good object existing equals 1
An explicit construction algorithm succeeds with high probability
The worst-case object has quality below the threshold
If $\mathbb{E}[f(X)] < c$, then $f(\omega)$ cannot be at least $c$ for every $\omega$, so at least one good object exists. This is the core of the method.
Why We Do Not Use Random Codes in Practice
Shannon's random coding argument proves that good codes exist, but a uniformly random codebook of rate $R$ and blocklength $n$ has $2^{nR}$ codewords each of length $n$ bits, so the codebook storage alone requires $n \cdot 2^{nR}$ bits. At rate $R = 1/2$ and $n = 1024$ (a modest blocklength for 5G NR), storing the codebook would require $1024 \cdot 2^{512} \approx 10^{157}$ bits, far beyond any imaginable storage. Practical codes (polar, LDPC, convolutional) have algebraic structure so the encoder and decoder require only near-linear operations and storage.
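The storage arithmetic can be checked directly with Python's arbitrary-precision integers (the parameter values are illustrative):

```python
def codebook_bits(n, R):
    """Storage for an unstructured random codebook: 2^(nR) codewords, n bits each."""
    return n * 2 ** round(n * R)

total = codebook_bits(1024, 0.5)   # 1024 * 2^512 = 2^522 bits
print(len(str(total)))             # 158 decimal digits
```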
- Polar codes (3GPP NR control channels) achieve capacity with $O(n \log n)$ encoder and decoder complexity
- LDPC codes (3GPP NR data channels) are capacity-approaching with sparse parity-check matrices and a decoder that costs $O(n)$ per iteration
- 5G NR uses polar codes for PBCH and PDCCH, and LDPC codes for PDSCH/PUSCH (TS 38.212)