The Channel Coding Theorem — Achievability

The Random Coding Argument

We now prove that any rate $R < C$ is achievable. The proof is one of the most elegant arguments in mathematics: we show that a randomly generated codebook, with high probability, achieves vanishing error probability. No clever construction is needed; randomness itself provides enough structure.

The argument has three steps: (1) generate the codebook randomly, (2) decode by joint typicality, (3) show that the error probability vanishes for rates below $I(X;Y)$. The packing lemma from Chapter 3 does the heavy lifting in step 3.
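These three steps can be exercised end to end in a toy Monte Carlo sketch for a binary symmetric channel, where joint typicality of $(x^n, y^n)$ reduces to the Hamming distance falling near $np$. The function name and all parameters below are illustrative, not from the text; at such short block lengths the error probability is not yet small, but it shrinks as $n$ grows at fixed $R < C$.

```python
import random

def simulate_random_code(n=20, R=0.2, p=0.1, delta=0.1, trials=200, seed=0):
    """Toy sketch of the random coding argument over a BSC(p).

    Step 1: codebook of 2^{nR} i.i.d. Bernoulli(1/2) codewords.
    Step 2: joint typicality decoding -- for a BSC, (x^n, y^n) are
            jointly typical iff their Hamming distance lies in
            [n(p - delta), n(p + delta)].
    Step 3: estimate the error probability by Monte Carlo.
    """
    rng = random.Random(seed)
    M = 2 ** int(n * R)                       # 2^{nR} messages
    book = [[rng.randint(0, 1) for _ in range(n)] for _ in range(M)]
    lo, hi = n * (p - delta), n * (p + delta)
    errors = 0
    for _ in range(trials):
        m = rng.randrange(M)                  # uniform message
        y = [b ^ (rng.random() < p) for b in book[m]]   # BSC output
        # Decoder: collect all codewords jointly typical with y^n.
        typical = [k for k, x in enumerate(book)
                   if lo <= sum(a != b for a, b in zip(x, y)) <= hi]
        if typical != [m]:                    # fail unless unique and correct
            errors += 1
    return errors / trials
```

The decoder declares an error whenever the typical set around $y^n$ does not contain exactly the transmitted codeword, mirroring the error events $\mathcal{E}_1$ (true pair atypical) and $\mathcal{E}_2$ (a wrong codeword typical) in the proof.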

Random Coding and the Packing Argument

A visual walk through the random coding proof: codewords are scattered in sequence space, the decoder draws a "typical shell" around the received output, and we check whether any wrong codeword falls within the shell. For rates below capacity, the shell contains only the correct codeword.

Theorem: Channel Coding Theorem — Achievability

For any $R < C = \max_{P_X} I(X;Y)$ and any $\epsilon > 0$, there exists an $(R, n)$-code $\mathcal{C}_n$ with $P_{e,\max}(\mathcal{C}_n) \leq \epsilon$ for all sufficiently large $n$.

Think of the codebook as $2^{nR}$ points scattered in $\mathcal{X}^n$. When codeword $x^n(m)$ is transmitted, the output $y^n$ is "near" it in the sense of joint typicality. As long as no other codeword $x^n(m')$ is also jointly typical with $y^n$, the decoder succeeds. The packing lemma guarantees this for $R < I(X;Y)$: there is enough "room" in $\mathcal{Y}^n$ for $2^{nI(X;Y)}$ non-confusable codewords.
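The counting behind this picture can be made concrete for a binary symmetric channel, whose capacity is $C = 1 - H_2(p)$. A minimal sketch (the helper name `bsc_capacity` is ours, not from the text):

```python
import math

def bsc_capacity(p):
    """Capacity of a binary symmetric channel: C = 1 - H2(p) bits/use."""
    if p in (0.0, 1.0):
        return 1.0
    h2 = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # binary entropy
    return 1.0 - h2

C = bsc_capacity(0.1)
print(round(C, 3))   # prints 0.531
```

For $p = 0.1$ this gives $C \approx 0.531$, so roughly $2^{0.531\,n}$ codewords can be packed non-confusably into $\mathcal{Y}^n$.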


The Power of Random Coding

The random coding argument is sometimes called the "greatest trick in information theory." We never construct a specific good code — we show that a randomly drawn code works with high probability. This is profoundly useful for proving existence but does not directly yield practical codes.

The central challenge of coding theory in the seventy years since has been to find explicit codes that approach capacity: turbo codes (1993), LDPC codes (rediscovered in 1996), and polar codes (2009) each represent a major step toward this goal. See Book telecom, Ch. 12 and Chapter 11 of this book for these constructions.

Common Mistake: Random Coding is Not a Code Construction

Mistake:

Treating the random coding argument as a practical code design method. In practice, one cannot generate and store a random codebook of size $2^{nR}$ for reasonable block lengths: the number of codewords grows exponentially in $n$.
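A quick back-of-the-envelope computation shows how fast a literal random codebook becomes infeasible (the rate $R = 0.5$ here is an arbitrary illustration):

```python
# Codebook size 2^{nR} at rate R = 0.5 -- storing or exhaustively
# searching such a table is hopeless beyond tiny block lengths n.
R = 0.5
for n in (100, 1000, 10000):
    M = 2 ** int(n * R)
    print(f"n = {n:5d}: |codebook| = 2^{int(n * R)}, "
          f"about {len(str(M))} decimal digits")
```

Already at $n = 100$ the codebook has about $10^{15}$ entries; at $n = 1000$ it dwarfs the number of atoms in the observable universe.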

Correction:

Random coding is a proof technique that establishes the existence of good codes. Practical codes use structure (algebraic, graphical, or recursive) to enable efficient encoding and decoding. The gap between the random coding bound and practical code performance has been nearly closed by modern codes (LDPC, polar).
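To make the contrast concrete, here is the classic Hamming(7,4) code, one of the simplest structured (algebraic) codes: encoding is a fixed parity rule, computable on the fly, rather than a lookup into an exponentially large random table. (The code is standard; this particular systematic layout is one common choice.)

```python
def hamming74_encode(d):
    """Systematic Hamming(7,4) encoder: 4 data bits -> 7-bit codeword.

    Structure replaces storage: three XOR parity checks encode any
    message in O(n), with no codebook held in memory at all.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [d1, d2, d3, d4, p1, p2, p3]
```

All $2^4$ codewords are at pairwise Hamming distance at least 3, so any single bit flip can be corrected; this is exactly the kind of efficiently decodable structure that random codebooks lack.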

Historical Note: Shannon's 1948 Proof


Shannon's original 1948 paper introduced the random coding argument, though in a somewhat different form than presented here. The cleanest version of the proof, using joint typicality decoding, was developed later by Wolfowitz (1961) and refined by Csiszár and Körner (1981). The Orlitsky–Roche strong typicality framework, which we follow in this book, provides the most versatile version of the argument.

It is worth pausing to appreciate the audacity of Shannon's approach: he proved that reliable communication is possible by showing that a code chosen at random works, without ever exhibiting a specific code. This probabilistic method, borrowed from combinatorics (Erdős), was radical in the engineering context of 1948.

Quick Check

In the achievability proof, which step relies on the independence of the wrong codewords from the channel output?

Bounding $\Pr(\mathcal{E}_1)$: the true pair must be jointly typical

Bounding $\Pr(\mathcal{E}_2)$ via the packing lemma: wrong codewords are independent of $Y^n$

The expurgation step: removing bad codewords

Key Takeaway

The achievability proof shows that rates below $C$ are achievable via random coding and joint typicality decoding. The key ingredients are: (1) the AEP guarantees the true pair is jointly typical, (2) the packing lemma bounds the probability of confusion with wrong codewords, and (3) expurgation converts average error to maximal error. This proof pattern (random coding for achievability) will be reused throughout the book.