Chapter 9 Summary

Key Points

1. The channel coding theorem establishes $C = \max_{P_X} I(X;Y)$ as the maximum reliable communication rate over a DMC. Rates below $C$ are achievable; rates above $C$ are not.

2. The achievability proof uses random coding (i.i.d. codewords drawn from the capacity-achieving distribution) and joint typicality decoding. The packing lemma bounds the probability of confusing wrong codewords, and expurgation converts average error to maximal error.

3. The converse uses Fano's inequality to convert $P_e \to 0$ into an entropy bound, the chain rule to decompose $I(M; Y^n)$ into single-letter terms, and the memoryless property to bound each term by $C$. This "Fano → chain rule → single-letter" pattern is the universal converse technique.

4. BSC capacity is $1 - \mathcal{H}_2(p)$, BEC capacity is $1 - \epsilon$, and additive noise channels over $\mathbb{F}_q$ have capacity $\log q - H(Z)$. For symmetric channels, the uniform input distribution is optimal.

5. Feedback does not increase DMC capacity, because the memoryless property limits each channel use to at most $C$ bits regardless of encoding strategy. However, feedback can simplify coding and improve error exponents.

6. The capacity-cost function $C(B) = \max_{P_X : \mathbb{E}[b(X)] \leq B} I(X;Y)$ handles input constraints. The Blahut-Arimoto algorithm computes capacity via alternating optimization.
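The alternating optimization in point 6 is short enough to sketch directly. Below is a minimal sketch (assuming a NumPy environment; all parameters illustrative) of Blahut-Arimoto for the unconstrained capacity $C = \max_{P_X} I(X;Y)$, checked against the known BSC value $1 - \mathcal{H}_2(p)$ from point 4:

```python
import numpy as np

def blahut_arimoto(W, tol=1e-12, max_iter=10_000):
    """Capacity (in bits) of a DMC with channel matrix W[x, y] = P(y|x)."""
    n_x = W.shape[0]
    p = np.full(n_x, 1.0 / n_x)              # start from the uniform input
    for _ in range(max_iter):
        q = p @ W                             # induced output distribution q(y)
        # D[x] = sum_y W(y|x) log2( W(y|x) / q(y) )  (KL divergence per input)
        ratio = np.where(W > 0, W / q, 1.0)
        D = np.sum(W * np.log2(ratio), axis=1)
        p_new = p * np.exp2(D)                # multiplicative update
        p_new /= p_new.sum()
        if np.max(np.abs(p_new - p)) < tol:
            p = p_new
            break
        p = p_new
    q = p @ W
    ratio = np.where(W > 0, W / q, 1.0)
    return float(p @ np.sum(W * np.log2(ratio), axis=1)), p

# BSC with crossover 0.1: capacity should equal 1 - H2(0.1)
p_err = 0.1
W = np.array([[1 - p_err, p_err],
              [p_err, 1 - p_err]])
C, p_opt = blahut_arimoto(W)
H2 = -p_err * np.log2(p_err) - (1 - p_err) * np.log2(1 - p_err)
print(C, 1 - H2)   # both ≈ 0.531; optimal input is uniform, as point 4 predicts
```

For the symmetric BSC the uniform input is already a fixed point, so the loop converges immediately; for asymmetric channels the multiplicative update does the work.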

Looking Ahead

Chapter 10 extends the channel coding theorem to the continuous-alphabet AWGN channel, where the capacity takes the celebrated form $C = \frac{1}{2}\log(1 + \text{SNR})$. The proof requires the maximum-entropy property of the Gaussian distribution and introduces the water-filling power allocation for parallel channels, a technique that becomes central in MIMO and OFDM systems.
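The water-filling allocation previewed here can be illustrated numerically. A minimal sketch under assumed noise levels: bisection finds the water level $\mu$ such that $\sum_k \max(\mu - N_k, 0) = P$, and each sub-channel receives the power above its noise floor:

```python
import numpy as np

def water_fill(noise, P, iters=100):
    """Split total power P over parallel channels with noise levels `noise`:
    p_k = max(mu - N_k, 0), with the water level mu found by bisection."""
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + P     # mu is bracketed in [lo, hi]
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - noise, 0.0).sum() > P:
            hi = mu                           # too much water: lower the level
        else:
            lo = mu
    return np.maximum(0.5 * (lo + hi) - noise, 0.0)

noise = [1.0, 2.0, 5.0]                       # assumed noise variances
P = 3.0
p = water_fill(noise, P)
rate = 0.5 * np.log2(1.0 + p / np.array(noise)).sum()
print(p, rate)   # p ≈ [2, 1, 0]: the noisiest channel gets no power
```

With these numbers the water level settles at $\mu = 3$, so the weakest sub-channel (noise 5) is left dry, which is exactly the qualitative behavior the chapter ahead derives.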

Random coding

A proof technique where a codebook is generated by drawing codewords independently from a chosen distribution. The expected error probability over the random ensemble is analyzed, and existence of at least one good code follows. Random coding is the standard achievability tool in information theory.

Related: Channel coding theorem, The Packing Lemma, Joint typicality decoding
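The glossary entry above can be made concrete with a toy Monte-Carlo experiment. This is a minimal sketch (all parameters illustrative), assuming a BSC and minimum-Hamming-distance decoding as a stand-in for joint typicality decoding:

```python
import numpy as np

rng = np.random.default_rng(0)

n, R = 60, 0.25            # blocklength and rate (illustrative)
M = int(2 ** (n * R))      # number of messages = 2^{nR}
p = 0.05                   # BSC crossover; capacity 1 - H2(0.05) ≈ 0.71 >> R

# Random codebook: i.i.d. Bernoulli(1/2) codewords
# (the capacity-achieving input distribution for the BSC)
codebook = rng.integers(0, 2, size=(M, n))

def transmit_and_decode(msg):
    noise = rng.random(n) < p                 # BSC bit flips
    y = codebook[msg] ^ noise
    dists = np.sum(codebook != y, axis=1)     # min-distance decoding
    return int(np.argmin(dists))

trials = 200
errors = sum(transmit_and_decode(int(rng.integers(M))) != int(m)
             for m in [rng.integers(M)] for _ in range(0))  # placeholder
errors = 0
for _ in range(trials):
    msg = int(rng.integers(M))
    if transmit_and_decode(msg) != msg:
        errors += 1
print(errors / trials)   # small, since the rate is well below capacity
```

Averaged over the random codebook, the error probability vanishes with $n$ for any $R < C$; a single draw already performs well at this blocklength.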

Expurgation

A technique for converting a code with small average error probability into one with small maximal error probability, by discarding the worst-performing codewords. Keeping the best half reduces the rate by $1/n$ (negligible for large $n$), and every surviving codeword has error probability at most twice the original average.

Related: Random coding, Average vs. maximal error
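The counting behind this entry is easy to verify directly. A sketch with hypothetical per-message error probabilities: by Markov's inequality, fewer than half the codewords can exceed twice the average, so the best half all satisfy the bound:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-message error probabilities of some length-n code
pe = rng.uniform(0.0, 0.2, size=1024)
avg = pe.mean()

# Expurgation: keep the best half (rate drops from R to R - 1/n,
# since 2^{nR}/2 = 2^{n(R - 1/n)} codewords remain)
kept = np.sort(pe)[: len(pe) // 2]

# Markov: strictly fewer than half of the codewords can have
# pe > 2*avg, so every kept codeword satisfies pe <= 2*avg.
print(kept.max() <= 2 * avg)   # True
```

The inequality holds for any list of nonnegative numbers, which is why expurgation works regardless of how the original code was constructed.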

Capacity-cost function

The maximum achievable rate for a DMC subject to an input cost constraint: $C(B) = \max_{P_X : \mathbb{E}[b(X)] \leq B} I(X;Y)$. It is non-decreasing and concave in the cost budget $B$.

Related: Channel capacity, Blahut-Arimoto algorithm, Ergodic Capacity with Full CSI (Water-Filling)

🎓 CommIT Contribution (1948)

The Channel Coding Theorem

C. E. Shannon, Bell System Technical Journal

Shannon's 1948 paper established the channel coding theorem, proving that reliable communication at rates up to $C = \max_{P_X} I(X;Y)$ is achievable, and rates above capacity are not. The proof introduced random coding as a proof technique and laid the foundation for all of coding theory. The presentation in this chapter follows the proof structure used in Prof. Caire's ITA course at TU Berlin, which emphasizes the packing lemma and the reusable proof pattern of "random coding + Fano's inequality."


Why This Matters: From DMC Capacity to Wireless System Design

The DMC channel coding theorem is the theoretical foundation for all wireless communication systems. The capacity formula $C = \max_{P_X} I(X;Y)$ generalizes to:

  • AWGN channels: $C = \frac{1}{2}\log(1 + \text{SNR})$ (Chapter 10)
  • Fading channels: $C = \mathbb{E}[\log(1 + |H|^2\,\text{SNR})]$ (Chapter 13)
  • MIMO channels: $C = \log\det(\mathbf{I} + \text{SNR}\,\mathbf{H}\mathbf{H}^H)$ (Book mimo)
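The three formulas above can be evaluated numerically. A minimal sketch (base-2 logs, i.e. bits per channel use; SNR, fading model, and antenna count are illustrative assumptions), with a sanity check that the MIMO formula with $\mathbf{H} = \mathbf{I}$ reduces to parallel AWGN channels:

```python
import numpy as np

snr = 3.0

# AWGN: C = (1/2) log2(1 + SNR)
c_awgn = 0.5 * np.log2(1 + snr)

# Ergodic fading: C = E[log2(1 + |H|^2 SNR)]; Rayleigh fading => |H|^2 ~ Exp(1)
rng = np.random.default_rng(0)
h2 = rng.exponential(1.0, size=100_000)
c_fading = np.mean(np.log2(1 + h2 * snr))

# MIMO: C = log2 det(I + SNR * H H^H); with H = I_2 this is 2 * log2(1 + SNR)
H = np.eye(2)
c_mimo = np.log2(np.linalg.det(np.eye(2) + snr * H @ H.conj().T))

print(c_awgn, c_fading, c_mimo)   # c_mimo = 2 * log2(4) = 4.0 bits
```

Note that by Jensen's inequality the ergodic fading capacity lies below $\log_2(1 + \mathbb{E}[|H|^2]\,\text{SNR}) = 2$ bits here, a point Chapter 13 develops.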

The proof techniques, random coding for achievability and Fano's inequality for the converse, carry over directly, with the packing and covering lemmas adapted for continuous alphabets. See Book telecom, Ch. 14-15 for the capacity of fading and MIMO channels, and the practical coding schemes that approach these limits.