Three Extensions: Feedback, Source-Channel Coding, and Capacity Computation
This section covers three important extensions of the channel coding theorem. First, we prove the perhaps surprising result that feedback does not increase the capacity of a DMC, even though the encoder can adapt its transmission based on what the decoder has already received. Second, we state the source-channel coding theorem, connecting source coding and channel coding via the separation principle (already proved in Chapter 7, Section 4, for the general case). Third, we introduce the capacity-cost function and the Blahut-Arimoto algorithm for computing capacity numerically.
Definition: Channel Coding with Feedback
A $(2^{nR}, n)$-code with feedback for a DMC $(\mathcal{X}, p(y \mid x), \mathcal{Y})$ consists of:
- A message $W$, uniform on $\{1, 2, \ldots, 2^{nR}\}$
- A sequence of encoding functions $x_i : \{1, \ldots, 2^{nR}\} \times \mathcal{Y}^{i-1} \to \mathcal{X}$, so that $X_i = x_i(W, Y^{i-1})$
- A decoding function $g : \mathcal{Y}^n \to \{1, \ldots, 2^{nR}\}$, producing the estimate $\hat{W} = g(Y^n)$
The key difference from the no-feedback case: the encoder can use the past channel outputs $Y^{i-1} = (Y_1, \ldots, Y_{i-1})$ to choose the current input $X_i$. The channel output is fed back to the encoder with one time step of delay.
Theorem: Feedback Does Not Increase DMC Capacity
The feedback capacity of a discrete memoryless channel equals the non-feedback capacity:

$$C_{\mathrm{FB}} = C = \max_{p(x)} I(X; Y)$$
The converse proof from Section 4 already covers the feedback case! The key step $p(y_i \mid x_i, y^{i-1}, w) = p(y_i \mid x_i)$ holds because the channel is memoryless: $Y_i$ depends on $(W, Y^{i-1})$ only through $X_i$, regardless of whether $X_i$ was chosen using feedback or not.
The achievability direction is trivial: any code without feedback is also a valid code with feedback (just ignore the feedback). So the rates achievable with feedback are at least as large as those achievable without.
The point is that feedback cannot help because the channel is memoryless. The output $Y_i$ carries at most $C$ bits of new information per use, and no encoding trick can change this. What feedback can do is simplify coding and improve error exponents, but not capacity.
Converse (same as without feedback)
The converse proof from Section 4 applies almost verbatim. The only step that could potentially change is the bound $H(Y_i \mid Y^{i-1}, W) = H(Y_i \mid X_i)$.
With feedback, $X_i$ is a function of both $W$ and $Y^{i-1}$. But the memoryless property still gives:

$$H(Y_i \mid Y^{i-1}, W) = H(Y_i \mid Y^{i-1}, W, X_i) = H(Y_i \mid X_i)$$

The first equality holds because $X_i$ is a deterministic function of $(W, Y^{i-1})$; the second because, given $X_i$, the memoryless channel output $Y_i$ is independent of the past.
So: $I(W; Y^n) = \sum_{i=1}^n \left[ H(Y_i \mid Y^{i-1}) - H(Y_i \mid Y^{i-1}, W) \right] \leq \sum_{i=1}^n \left[ H(Y_i) - H(Y_i \mid X_i) \right] = \sum_{i=1}^n I(X_i; Y_i) \leq nC$.
The rest of the converse follows identically: by Fano's inequality, $nR \leq I(W; Y^n) + n\epsilon_n \leq nC + n\epsilon_n$, so $R \leq C$.
Achievability
Any code without feedback works with feedback (ignore the feedback signal). Since rates up to $C$ are achievable without feedback, they are also achievable with feedback.
When Feedback Does Help
Although feedback does not increase DMC capacity, it provides significant practical benefits:
- Simplified coding: The Schalkwijk-Kailath scheme for the Gaussian channel with feedback achieves capacity using a simple linear strategy: no codebook search, no joint typicality decoding.
- Improved error exponents: With feedback, the error probability can decay doubly exponentially in the block length ($P_e \sim 2^{-2^{O(n)}}$), compared to single-exponentially ($P_e \sim 2^{-nE(R)}$) without feedback. Burnashev's exponent gives the optimal error exponent with feedback.
- ARQ protocols: On the BEC, feedback enables simple retransmission of erased packets, achieving capacity with minimal coding complexity.
- Channels with memory: For channels with memory (ISI channels, fading channels), feedback can increase capacity. The DMC result does not generalize.
- Multiuser channels: For MACs, broadcast channels, and interference channels, feedback can strictly increase the capacity region.
Historical Note: The Schalkwijk-Kailath Scheme
In 1966, Schalkwijk and Kailath devised a remarkably simple feedback scheme for the Gaussian channel. The encoder sends a refinement of its estimate of the decoder's error at each step, using the feedback to track the decoder's state. After $n$ steps, the error probability decays doubly exponentially in the block length. This is astronomically faster than the single-exponential decay without feedback.
The scheme works because feedback allows the encoder to "zoom in" on the message: after the first transmission, the encoder knows what the decoder received and can send a correction. After the second transmission, it corrects the correction, and so on; each step squares the effective SNR.
Definition: The Capacity-Cost Function
Many channels have input cost constraints. A cost function $b : \mathcal{X} \to [0, \infty)$ assigns a cost $b(x)$ to each input symbol, with the per-codeword cost $b(x^n) = \frac{1}{n} \sum_{i=1}^n b(x_i)$.
The capacity-cost function $C(B)$ is the supremum of achievable rates for codes satisfying $b(x^n(w)) \leq B$ for all messages $w$:

$$C(B) = \max_{p(x) :\, \mathbb{E}[b(X)] \leq B} I(X; Y)$$
Properties:
- $C(B)$ is non-decreasing in $B$ (more budget = more capacity)
- $C(B)$ is concave in $B$ (diminishing returns)
- $C(B) = C$ for $B \geq B_{\max}$, where $B_{\max}$ is the budget beyond which the constraint is inactive
The capacity-cost function is the channel coding analogue of the rate-distortion function: both are optimizations over input/test-channel distributions subject to an expected-value constraint. The water-filling solution for parallel Gaussian channels (Chapter 10) is a special case where the cost is power.
Example: BSC with Hamming Weight Constraint
The BSC($p$) with Hamming weight constraint: $b(x) = x$, budget $B = \delta \leq 1/2$ (at most a fraction $\delta$ of input bits can be 1). Find $C(\delta)$.
Compute output distribution
Let $P(X = 1) = \alpha$ with $\alpha \leq \delta$. Then:

$$P(Y = 1) = \alpha(1 - p) + (1 - \alpha)p = \alpha * p$$

where $\alpha * p$ denotes binary convolution, so $I(X; Y) = H(Y) - H(Y \mid X) = H_2(\alpha * p) - H_2(p)$.
Since $H_2$ is concave and increasing on $[0, 1/2]$, $I(X; Y)$ is maximized by taking $\alpha$ as large as possible: $\alpha = \delta$.
Capacity-cost function

$$C(\delta) = H_2(\delta * p) - H_2(p), \qquad \delta \leq 1/2$$

For $\delta \geq 1/2$, the constraint is inactive and we recover the unconstrained BSC capacity $C = 1 - H_2(p)$.
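The closed form $C(\delta) = H_2(\delta * p) - H_2(p)$ is easy to check numerically. A small sketch (the function names are illustrative, not from any library) that verifies monotonicity, concavity, and saturation at the unconstrained capacity:

```python
import numpy as np

def H2(x):
    """Binary entropy in bits, safe at the endpoints 0 and 1."""
    x = np.asarray(x, dtype=float)
    xs = np.clip(x, 1e-300, 1 - 1e-16)
    h = -xs * np.log2(xs) - (1 - xs) * np.log2(1 - xs)
    return np.where((x <= 0) | (x >= 1), 0.0, h)

def bsc_capacity_cost(delta, p):
    """C(delta) = H2(delta * p) - H2(p), saturating for delta >= 1/2."""
    delta = min(delta, 0.5)                   # budget beyond 1/2 is inactive
    conv = delta * (1 - p) + (1 - delta) * p  # binary convolution delta * p
    return float(H2(conv) - H2(p))

p = 0.1
C_sat = bsc_capacity_cost(0.5, p)             # should equal 1 - H2(p)
values = [bsc_capacity_cost(d, p) for d in (0.1, 0.2, 0.3, 0.4, 0.5)]
```

At $\delta = 1/2$ the convolution $\delta * p$ hits exactly $1/2$, where $H_2 = 1$, so $C(\delta)$ saturates at $1 - H_2(p)$.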
Definition: The Blahut-Arimoto Algorithm
The Blahut-Arimoto algorithm computes $C = \max_{p(x)} I(X; Y)$ by alternating maximization. Define:

$$F(p, q) = \sum_{x, y} p(x)\, p(y \mid x) \log \frac{q(x \mid y)}{p(x)}$$

where $q(x \mid y)$ is an auxiliary backward channel matrix. The key identity is $I(X; Y) = \max_q F(p, q)$, with the maximum attained when $q$ is the true posterior $p(x \mid y)$.
Iteration (starting from an initial $p^{(0)}$, with fixed Lagrange multiplier $\lambda$ for the cost constraint):
- E-step: Update the backward channel:

$$q^{(t)}(x \mid y) = \frac{p^{(t)}(x)\, p(y \mid x)}{\sum_{x'} p^{(t)}(x')\, p(y \mid x')}$$

- M-step: Update the input distribution:

$$p^{(t+1)}(x) = \frac{\exp\left( \sum_y p(y \mid x) \log q^{(t)}(x \mid y) - \lambda b(x) \right)}{\sum_{x'} \exp\left( \sum_y p(y \mid x') \log q^{(t)}(x' \mid y) - \lambda b(x') \right)}$$
Convergence: $F(p^{(t)}, q^{(t)}) \to C$ monotonically as $t \to \infty$, where $C = \max_{p(x)} I(X; Y)$ (taking $\lambda = 0$ for the unconstrained problem).
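To make the iteration concrete, here is a minimal NumPy sketch of the unconstrained case ($\lambda = 0$); the function name and interface are my own, not a standard API, and the channel matrix is assumed strictly positive to keep the logarithms simple:

```python
import numpy as np

def blahut_arimoto(P, max_iter=1000, tol=1e-12):
    """Capacity of a DMC via Blahut-Arimoto.

    P[x, y] = p(y|x), rows sum to 1, all entries assumed > 0.
    Returns (capacity in bits, capacity-achieving input distribution).
    """
    nx = P.shape[0]
    p = np.full(nx, 1.0 / nx)                      # start from uniform p(x)
    for _ in range(max_iter):
        joint = p[:, None] * P                     # p(x) p(y|x)
        q = joint / joint.sum(axis=0)              # E-step: backward channel q(x|y)
        c = np.exp((P * np.log(q)).sum(axis=1))    # M-step numerator, natural logs
        p_new = c / c.sum()
        if np.abs(p_new - p).max() < tol:
            p = p_new
            break
        p = p_new
    py = p @ P                                     # output distribution p(y)
    I = float((p[:, None] * P * np.log2(P / py)).sum())  # I(X;Y) in bits
    return I, p

# Sanity check on the BSC(0.1): capacity should be 1 - H2(0.1)
eps = 0.1
P = np.array([[1 - eps, eps], [eps, 1 - eps]])
C, p_star = blahut_arimoto(P)
```

The base of the logarithm inside the iteration does not matter (it cancels in the normalization); only the final mutual information is reported in bits.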
The Blahut-Arimoto algorithm converges to the global maximum because: (1) $F(p, q)$ is concave in $p$ for fixed $q$, (2) $F(p, q)$ is concave in $q$ for fixed $p$, and (3) the constraint sets are convex and compact. Alternating maximization under these conditions provably converges.
The same algorithmic structure computes the rate-distortion function: simply swap the roles of encoder and decoder. The Blahut-Arimoto algorithm is a special case of the EM algorithm, predating it by several years.
Blahut-Arimoto algorithm
An iterative algorithm for computing channel capacity (or the rate-distortion function) by alternating between optimizing the input distribution and the backward channel. Converges to the global optimum due to the concavity of mutual information.
Related: Channel capacity, Rate-Distortion Function, EM algorithm
Blahut-Arimoto Algorithm for Channel Capacity
Complexity: $O(|\mathcal{X}| \cdot |\mathcal{Y}|)$ per iteration; typically converges in $O(1/\epsilon)$ iterations.

For the unconstrained case (no cost), set $\lambda = 0$ and the M-step simplifies. For the constrained case, $\lambda$ is a parameter that traces out the capacity-cost curve $C(B)$ parametrically: each $\lambda$ gives a $(B, C(B))$ pair.
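As a sketch of this parametric sweep (helper names are my own; $\lambda$ is in natural-log units here), one can run the cost-penalized iteration for several values of $\lambda$ on the BSC with Hamming-weight cost $b(x) = x$. Since on the BSC every input distribution with $P(X=1) = \alpha$ achieves exactly $I = H_2(\alpha * p) - H_2(p)$, every returned $(B, C)$ pair should land on the closed-form curve from the worked example:

```python
import numpy as np

def ba_constrained(P, b, lam, n_iter=500):
    """Constrained Blahut-Arimoto at a fixed multiplier lam.

    P[x, y] = p(y|x) (assumed > 0), b[x] = cost of symbol x.
    Returns (B, C): expected cost and mutual information in bits at
    the fixed point -- one point on the capacity-cost curve.
    """
    p = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n_iter):
        joint = p[:, None] * P
        q = joint / joint.sum(axis=0)                      # backward channel
        c = np.exp((P * np.log(q)).sum(axis=1) - lam * b)  # cost-penalized M-step
        p = c / c.sum()
    py = p @ P
    I = float((p[:, None] * P * np.log2(P / py)).sum())
    return float(p @ b), I

def H2(x):
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

eps = 0.1
P = np.array([[1 - eps, eps], [eps, 1 - eps]])
b = np.array([0.0, 1.0])              # Hamming-weight cost b(x) = x
curve = [ba_constrained(P, b, lam) for lam in (0.0, 0.5, 1.0, 2.0)]
# lam = 0 recovers the unconstrained point (B = 1/2, C = 1 - H2(eps));
# larger lam yields smaller budgets B and smaller C(B).
```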
Blahut-Arimoto Algorithm Convergence
Watch the Blahut-Arimoto algorithm converge to the capacity-achieving input distribution for different DMC channels. The plot shows the input distribution and the mutual information at each iteration.
Parameters: channel type (0=BSC, 1=BEC, 2=Z-channel, 3=Custom 4x4), crossover/erasure probability, maximum number of Blahut-Arimoto iterations.
Capacity-Cost Function
Explore the capacity-cost function C(B) for the BSC with Hamming weight constraint. Observe the concavity and the saturation at the unconstrained capacity.
Parameters: channel noise parameter $p$.
Quick Check
Why does feedback not increase the capacity of a DMC?
Because the encoder already knows the message and does not need the channel output
Because the memoryless property means $Y_i$ carries at most $C$ bits of new information regardless of how $X_i$ was chosen
Because feedback introduces a cycle that reduces the effective rate
Correct! The converse bound $I(W; Y^n) \leq nC$ holds whether or not the encoder uses feedback. The channel's memoryless property is the key: each use conveys at most $C$ bits, period.
ARQ: Feedback in Practice
Automatic Repeat reQuest (ARQ) is the most common use of feedback in practical communication systems. On the BEC, the encoder simply retransmits erased packets. On general channels, Hybrid ARQ (HARQ) combines error correction with retransmission: the decoder attempts to decode and requests retransmission only if decoding fails.
While ARQ does not increase the capacity of a DMC, it achieves capacity with much simpler codes. Without ARQ, achieving near-capacity performance requires long block lengths and powerful codes (LDPC, polar). With ARQ, shorter codes and simpler decoders suffice because errors can be corrected by retransmission.
In 5G NR, HARQ with incremental redundancy is used on the downlink and uplink, with the feedback channel carrying ACK/NACK signals.
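A toy simulation of the BEC point above (all names illustrative): with stop-and-wait ARQ, each packet takes a Geometric($1 - \epsilon$) number of channel uses, so throughput approaches the BEC capacity $1 - \epsilon$:

```python
import random

def arq_throughput(eps, n_packets=200_000, seed=0):
    """Stop-and-wait ARQ over a BEC(eps): retransmit each packet until
    it arrives unerased. Returns packets delivered per channel use."""
    rng = random.Random(seed)
    uses = 0
    for _ in range(n_packets):
        while True:
            uses += 1                  # one channel use per (re)transmission
            if rng.random() >= eps:    # packet got through (not erased)
                break
    return n_packets / uses

eps = 0.25
t = arq_throughput(eps)                # should be close to 1 - eps = 0.75
```

No coding is needed at all: the feedback (ACK/NACK) alone achieves capacity on the erasure channel, which is why ARQ is so attractive in practice.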
Computing Capacity in Practice
The Blahut-Arimoto algorithm is the standard tool for computing DMC capacity numerically. It converges geometrically (linear convergence rate) and typically requires 20-50 iterations for 6 digits of accuracy.
For continuous-alphabet channels (Gaussian, fading), the optimization is infinite-dimensional and Blahut-Arimoto does not directly apply. Instead:
- For AWGN: the capacity has a closed-form expression (Chapter 10).
- For fading: capacity involves an expectation over the fading distribution.
- For MIMO: the water-filling solution provides an efficient computation.
Modern extensions include accelerated Blahut-Arimoto (using Nesterov momentum) and neural-network-based capacity estimation for channels without closed-form expressions.
Key Takeaway
Feedback does not increase DMC capacity ($C_{\mathrm{FB}} = C$) because the memoryless property limits each channel use to at most $C$ bits, regardless of how the input was chosen. The capacity-cost function $C(B)$ handles input constraints and is concave in $B$. The Blahut-Arimoto algorithm computes capacity by alternating optimization, a precursor to the EM algorithm.