The Broadcast Channel Model
From Uplink to Downlink
In Chapter 14, we studied the multiple access channel β the uplink scenario where multiple transmitters communicate with a single receiver. Now we turn to the dual problem: the broadcast channel (BC), modeling the downlink where a single transmitter sends independent messages to multiple receivers.
Intuitively, the BC should be the "easy" direction: the transmitter knows all messages and can design the transmitted signal jointly. Surprisingly, the BC is harder than the MAC from an information-theoretic standpoint. The capacity region for the general (non-degraded) BC remains unknown to this day! The key difficulty is that each receiver sees a different channel output, and the transmitter must simultaneously serve receivers with different channel qualities.
The breakthrough insight, due to Cover (1972) and Bergmans (1973), is that for degraded broadcast channels β where one receiver's output is a degraded version of the other's β the capacity region can be completely characterized. The achievability technique is superposition coding, one of the most elegant ideas in information theory.
Definition: Discrete Memoryless Broadcast Channel (DM-BC)
Discrete Memoryless Broadcast Channel (DM-BC)
A two-user discrete memoryless broadcast channel (DM-BC) consists of:
- A finite input alphabet ,
- Two finite output alphabets and ,
- A conditional PMF for each .
The channel is memoryless: when used times,
The transmitter has two independent messages: uniform over for receiver 1, and uniform over for receiver 2. The encoder maps the pair to a codeword .
Receiver observes and produces an estimate . A rate pair is achievable if there exists a sequence of codes such that
as . The capacity region is the closure of all achievable rate pairs.
Notice the asymmetry with the MAC: in the MAC, two independent encoders send to one decoder. In the BC, one encoder sends to two independent decoders. The encoder sees both messages, but each decoder sees only its own channel output.
Broadcast Channel (BC)
A multiuser channel model with one transmitter sending independent messages to two or more receivers. Each receiver observes a (potentially different) noisy version of the transmitted signal and must decode its own message.
Related: Multiple Access Channel (MAC), Degraded Broadcast Channel
Historical Note: Cover's 1972 Broadcast Channel Paper
1970sThomas Cover introduced the broadcast channel in his seminal 1972 paper "Broadcast channels," which appeared in the IEEE Transactions on Information Theory. Cover posed the problem of determining the capacity region and proposed superposition coding as a candidate achievability scheme. He showed that for degraded channels, superposition coding achieves the capacity region, and conjectured that this could be extended more broadly.
What makes Cover's contribution remarkable is the conceptual leap: rather than treating interference as noise (the naive approach), he proposed layering the information intended for different users at different "resolutions." This idea β encoding the weak user's message as a coarse description and the strong user's message as a refinement β became the foundation for all subsequent work on broadcast channels.
The full converse for the degraded Gaussian case required the entropy power inequality, established by Bergmans (1973). The general (non-degraded) BC capacity region remains one of the great open problems in information theory.
Definition: Physically and Stochastically Degraded Broadcast Channel
Physically and Stochastically Degraded Broadcast Channel
A broadcast channel is physically degraded if
forms a Markov chain, i.e., . This means receiver 2's output is obtained by passing receiver 1's output through an additional noisy channel. Receiver 1 is the strong user and receiver 2 is the weak user.
More generally, a broadcast channel is stochastically degraded if there exists a channel such that
for all , i.e., the marginal can be factored through even if the physical channel does not have this structure.
The capacity region of a stochastically degraded BC equals that of the corresponding physically degraded BC with marginals and .
The distinction between physical and stochastic degradedness is important: many channels that are not physically degraded are stochastically degraded. For the capacity region, only the marginals and matter, not the joint .
Degraded Broadcast Channel
A broadcast channel where forms a Markov chain, meaning receiver 2's observation is a noisier version of receiver 1's. The capacity region is completely known for degraded BCs via superposition coding.
Related: Broadcast Channel (BC), Superposition Coding
What Degradedness Means Operationally
The Markov chain has a powerful operational implication: anything that receiver 2 can compute from , receiver 1 can also compute from (at least in the information-theoretic sense).
Intuitively, if we think of as a high-resolution photograph and as a blurry copy obtained by further degrading , then any feature visible in the blurry copy is certainly visible in the sharp original. This means the strong user can always decode the weak user's message β a property that superposition coding exploits.
Example: Degraded Binary Symmetric Broadcast Channel
Consider a broadcast channel where , , and , where and are independent. Show that this is a degraded BC and find the crossover probabilities for both users.
Verify the Markov chain
By construction, depends on only through , so is a Markov chain. This is a physically degraded broadcast channel.
Crossover probability for user 1
User 1 sees a BSC with crossover probability :
Crossover probability for user 2
User 2 sees . Since and are independent Bernoulli, is Bernoulli with parameter
where denotes binary convolution (not multiplication). Since , we have , confirming that user 2 has a worse channel.
Interpretation
User 1's channel is a ; user 2's channel is a . Since (for ), user 2 is indeed the weaker user. The capacity of user 1's channel exceeds that of user 2's: .
Example: The Gaussian Broadcast Channel is Stochastically Degraded
Consider the Gaussian BC: and , where and are independent with . Show that this channel is stochastically degraded.
Construct the degrading channel
We need to find a channel such that has the same marginal distribution as given .
Let where , independent of everything else. Then:
Verify the marginal
Since and are independent Gaussians, .
So , which matches the distribution of . The channel is stochastically degraded.
Note on physical degradation
The original channel is not physically degraded because and are independent β is not obtained by adding noise to . However, the stochastically degraded model has the same marginals, so the capacity regions are identical.
This is a general and important principle: for capacity analysis, we can always replace a stochastically degraded BC with an equivalent physically degraded one.
Quick Check
Consider a broadcast channel where and with , , and , all independent. Under what condition is this channel stochastically degraded with user 1 stronger?
always
, i.e.,
The channel is always degraded regardless of parameters
User 1 is stronger when , i.e., , which gives . The channel can be reduced to the standard form by normalizing user 2's output.
Common Mistake: Not All Broadcast Channels Are Degraded
Mistake:
Assuming that because one receiver has higher noise variance, the broadcast channel must be degraded.
Correction:
Degradedness requires the Markov chain , which means that user 2's observation can be simulated from user 1's. This holds for the Gaussian BC (stochastically) and many other channels, but not in general. For example, consider a channel where and for a non-invertible function . Even if has larger variance, user 2 might observe a different "projection" of that cannot be obtained from .
The general (non-degraded) BC capacity region is unknown and requires more sophisticated techniques such as Marton coding.
Definition: Broadcast Channel Code
Broadcast Channel Code
An broadcast code for the DM-BC consists of:
-
Encoder: A mapping ,
-
Decoder 1: A mapping ,
-
Decoder 2: A mapping .
The average probability of error is
where is uniform over the message set.
Naive Approaches and Their Limitations
Before developing superposition coding, let us consider two naive strategies and see why they are suboptimal:
Time-sharing (TDMA): Allocate a fraction of the time to user 1 and to user 2. Each user's rate is limited by the capacity of its own channel scaled by its time fraction: and . This traces a straight line between and .
Treating interference as noise: Encode for user 2 at rate , then independently encode for user 1. User 2 treats user 1's signal as additional noise. User 1 treats user 2's signal as noise. This ignores the structure of the interference.
We will see that superposition coding achieves a rate region that is strictly larger than time-sharing whenever the users have different channel qualities. The secret is that the strong user can decode and subtract the weak user's message.
Superposition Coding
An encoding strategy for the broadcast channel where the weak user's message is encoded as a "cloud center" (coarse layer) and the strong user's message is encoded as a "satellite" (fine layer) around the cloud center. The weak user decodes only the cloud center; the strong user decodes both layers.
Cloud-Satellite Structure
The geometric interpretation of superposition coding. The codebook is organized into "clouds" (one per weak-user message) with "satellites" (one per strong-user message) within each cloud. Each cloud center represents a coarse codeword for the weak user; each satellite represents the fine detail for the strong user.
Related: Superposition Coding
Why This Matters: The Broadcast Channel as the Cellular Downlink
The broadcast channel directly models the cellular downlink: the base station (transmitter) sends independent data streams to multiple mobile users (receivers). In a typical cell, users are at different distances from the base station and therefore experience different path losses, making the channel approximately degraded.
Superposition coding β under the name NOMA (Non-Orthogonal Multiple Access) β has been studied extensively for 5G. The base station allocates more power to the cell-edge (weak) user and less power to the cell-center (strong) user. The strong user performs successive interference cancellation (SIC) to first decode and subtract the weak user's message, then decode its own.
While practical NOMA implementations face challenges (imperfect SIC, channel estimation errors, user pairing), the information-theoretic superiority of superposition coding over orthogonal access (OFDMA) is clear: superposition coding achieves the capacity region, while OFDMA achieves only a subset.
See full treatment in Chapter 16
Key Takeaway
The broadcast channel (one-to-many) and the MAC (many-to-one) are dual problems, but their information-theoretic analyses are fundamentally different. The MAC capacity region is known for general channels; the BC capacity region is known only for degraded (and some special) channels. The degraded BC capacity region is achieved by superposition coding β an encoding technique with no MAC analogue.
Broadcast Channel: Degraded vs. Non-Degraded
Visualize the broadcast channel model. In the degraded case, user 2's output is obtained by adding more noise to user 1's output. Compare the mutual information quantities for different noise levels.