The Broadcast Channel Model

From Uplink to Downlink

In Chapter 14, we studied the multiple access channel β€” the uplink scenario where multiple transmitters communicate with a single receiver. Now we turn to the dual problem: the broadcast channel (BC), modeling the downlink where a single transmitter sends independent messages to multiple receivers.

Intuitively, the BC should be the "easy" direction: the transmitter knows all messages and can design the transmitted signal jointly. Surprisingly, the BC is harder than the MAC from an information-theoretic standpoint. The capacity region for the general (non-degraded) BC remains unknown to this day! The key difficulty is that each receiver sees a different channel output, and the transmitter must simultaneously serve receivers with different channel qualities.

The breakthrough insight, due to Cover (1972) and Bergmans (1973), is that for degraded broadcast channels β€” where one receiver's output is a degraded version of the other's β€” the capacity region can be completely characterized. The achievability technique is superposition coding, one of the most elegant ideas in information theory.

Definition:

Discrete Memoryless Broadcast Channel (DM-BC)

A two-user discrete memoryless broadcast channel (DM-BC) consists of:

  • A finite input alphabet X\mathcal{X},
  • Two finite output alphabets Y1\mathcal{Y}_1 and Y2\mathcal{Y}_2,
  • A conditional PMF p(y1,y2∣x)p(y_1, y_2 | x) for each (x,y1,y2)∈XΓ—Y1Γ—Y2(x, y_1, y_2) \in \mathcal{X} \times \mathcal{Y}_1 \times \mathcal{Y}_2.

The channel is memoryless: when used nn times,

p(y1n,y2n∣xn)=∏i=1np(y1,i,y2,i∣xi).p(y_1^n, y_2^n | x^n) = \prod_{i=1}^{n} p(y_{1,i}, y_{2,i} | x_i).

The transmitter has two independent messages: W1W_1 uniform over {1,…,2nR1}\{1, \ldots, 2^{nR_{1}}\} for receiver 1, and W2W_2 uniform over {1,…,2nR2}\{1, \ldots, 2^{nR_{2}}\} for receiver 2. The encoder maps the pair (W1,W2)(W_1, W_2) to a codeword xn(W1,W2)∈Xnx^n(W_1, W_2) \in \mathcal{X}^n.

Receiver kk observes YknY_k^n and produces an estimate W^k\hat{W}_k. A rate pair (R1,R2)(R_{1}, R_{2}) is achievable if there exists a sequence of codes such that

max⁑{Pr⁑(W^1β‰ W1),Pr⁑(W^2β‰ W2)}β†’0\max\{\Pr(\hat{W}_1 \neq W_1), \Pr(\hat{W}_2 \neq W_2)\} \to 0

as nβ†’βˆžn \to \infty. The capacity region C\mathcal{C} is the closure of all achievable rate pairs.

Notice the asymmetry with the MAC: in the MAC, two independent encoders send to one decoder. In the BC, one encoder sends to two independent decoders. The encoder sees both messages, but each decoder sees only its own channel output.

Broadcast Channel (BC)

A multiuser channel model with one transmitter sending independent messages to two or more receivers. Each receiver observes a (potentially different) noisy version of the transmitted signal and must decode its own message.

Related: Multiple Access Channel (MAC), Degraded Broadcast Channel

Historical Note: Cover's 1972 Broadcast Channel Paper

1970s

Thomas Cover introduced the broadcast channel in his seminal 1972 paper "Broadcast channels," which appeared in the IEEE Transactions on Information Theory. Cover posed the problem of determining the capacity region and proposed superposition coding as a candidate achievability scheme. He showed that for degraded channels, superposition coding achieves the capacity region, and conjectured that this could be extended more broadly.

What makes Cover's contribution remarkable is the conceptual leap: rather than treating interference as noise (the naive approach), he proposed layering the information intended for different users at different "resolutions." This idea β€” encoding the weak user's message as a coarse description and the strong user's message as a refinement β€” became the foundation for all subsequent work on broadcast channels.

The full converse for the degraded Gaussian case required the entropy power inequality, established by Bergmans (1973). The general (non-degraded) BC capacity region remains one of the great open problems in information theory.

Definition:

Physically and Stochastically Degraded Broadcast Channel

A broadcast channel p(y1,y2∣x)p(y_1, y_2 | x) is physically degraded if

X→Y1→Y2X \to Y_1 \to Y_2

forms a Markov chain, i.e., p(y2∣x,y1)=p(y2∣y1)p(y_2 | x, y_1) = p(y_2 | y_1). This means receiver 2's output is obtained by passing receiver 1's output through an additional noisy channel. Receiver 1 is the strong user and receiver 2 is the weak user.

More generally, a broadcast channel is stochastically degraded if there exists a channel p(y~2∣y1)p(\tilde{y}_2 | y_1) such that

p(y~2∣x)=βˆ‘y1p(y1∣x)p(y~2∣y1)=p(y2∣x)p(\tilde{y}_2 | x) = \sum_{y_1} p(y_1 | x) p(\tilde{y}_2 | y_1) = p(y_2 | x)

for all xx, i.e., the marginal p(y2∣x)p(y_2 | x) can be factored through Y1Y_1 even if the physical channel does not have this structure.

The capacity region of a stochastically degraded BC equals that of the corresponding physically degraded BC with marginals p(y1∣x)p(y_1|x) and p(y2∣x)p(y_2|x).

The distinction between physical and stochastic degradedness is important: many channels that are not physically degraded are stochastically degraded. For the capacity region, only the marginals p(y1∣x)p(y_1|x) and p(y2∣x)p(y_2|x) matter, not the joint p(y1,y2∣x)p(y_1, y_2|x).

,

Degraded Broadcast Channel

A broadcast channel where X→Y1→Y2X \to Y_1 \to Y_2 forms a Markov chain, meaning receiver 2's observation is a noisier version of receiver 1's. The capacity region is completely known for degraded BCs via superposition coding.

Related: Broadcast Channel (BC), Superposition Coding

What Degradedness Means Operationally

The Markov chain X→Y1→Y2X \to Y_1 \to Y_2 has a powerful operational implication: anything that receiver 2 can compute from Y2nY_2^n, receiver 1 can also compute from Y1nY_1^n (at least in the information-theoretic sense).

Intuitively, if we think of Y1Y_1 as a high-resolution photograph and Y2Y_2 as a blurry copy obtained by further degrading Y1Y_1, then any feature visible in the blurry copy is certainly visible in the sharp original. This means the strong user can always decode the weak user's message β€” a property that superposition coding exploits.

Example: Degraded Binary Symmetric Broadcast Channel

Consider a broadcast channel where X∈{0,1}X \in \{0,1\}, Y1=XβŠ•Z1Y_1 = X \oplus Z_1, and Y2=Y1βŠ•Z3Y_2 = Y_1 \oplus Z_3, where Z1∼Bernoulli(p1)Z_1 \sim \text{Bernoulli}(p_1) and Z3∼Bernoulli(p3)Z_3 \sim \text{Bernoulli}(p_3) are independent. Show that this is a degraded BC and find the crossover probabilities for both users.

Example: The Gaussian Broadcast Channel is Stochastically Degraded

Consider the Gaussian BC: Y1=X+Z1Y_1 = X + Z_{1} and Y2=X+Z2Y_2 = X + Z_{2}, where Z1∼N(0,N1)Z_{1} \sim \mathcal{N}(0, N_1) and Z2∼N(0,N2)Z_{2} \sim \mathcal{N}(0, N_2) are independent with N1<N2N_1 < N_2. Show that this channel is stochastically degraded.

Quick Check

Consider a broadcast channel where Y1=X+Z1Y_1 = X + Z_1 and Y2=aX+Z2Y_2 = aX + Z_2 with a>0a > 0, Z1∼N(0,N1)Z_1 \sim \mathcal{N}(0, N_1), and Z2∼N(0,N2)Z_2 \sim \mathcal{N}(0, N_2), all independent. Under what condition is this channel stochastically degraded with user 1 stronger?

N1<N2N_1 < N_2 always

P/N1>a2P/N2P/N_1 > a^2 P / N_2, i.e., N2/N1>a2N_2/N_1 > a^2

a<1a < 1

The channel is always degraded regardless of parameters

Common Mistake: Not All Broadcast Channels Are Degraded

Mistake:

Assuming that because one receiver has higher noise variance, the broadcast channel must be degraded.

Correction:

Degradedness requires the Markov chain X→Y1→Y2X \to Y_1 \to Y_2, which means that user 2's observation can be simulated from user 1's. This holds for the Gaussian BC (stochastically) and many other channels, but not in general. For example, consider a channel where Y1=X+Z1Y_1 = X + Z_1 and Y2=f(X)+Z2Y_2 = f(X) + Z_2 for a non-invertible function ff. Even if Z2Z_2 has larger variance, user 2 might observe a different "projection" of XX that cannot be obtained from Y1Y_1.

The general (non-degraded) BC capacity region is unknown and requires more sophisticated techniques such as Marton coding.

Definition:

Broadcast Channel Code

An (2nR1,2nR2,n)(2^{nR_{1}}, 2^{nR_{2}}, n) broadcast code for the DM-BC consists of:

  1. Encoder: A mapping f:{1,…,2nR1}Γ—{1,…,2nR2}β†’Xnf: \{1,\ldots,2^{nR_{1}}\} \times \{1,\ldots,2^{nR_{2}}\} \to \mathcal{X}^n,

  2. Decoder 1: A mapping g1:Y1nβ†’{1,…,2nR1}g_1: \mathcal{Y}_1^n \to \{1,\ldots,2^{nR_{1}}\},

  3. Decoder 2: A mapping g2:Y2nβ†’{1,…,2nR2}g_2: \mathcal{Y}_2^n \to \{1,\ldots,2^{nR_{2}}\}.

The average probability of error is

Pe(n)=Pr⁑(g1(Y1n)β‰ W1Β orΒ g2(Y2n)β‰ W2),P_e^{(n)} = \Pr(g_1(Y_1^n) \neq W_1 \text{ or } g_2(Y_2^n) \neq W_2),

where (W1,W2)(W_1, W_2) is uniform over the message set.

Naive Approaches and Their Limitations

Before developing superposition coding, let us consider two naive strategies and see why they are suboptimal:

Time-sharing (TDMA): Allocate a fraction ΞΈ\theta of the time to user 1 and (1βˆ’ΞΈ)(1-\theta) to user 2. Each user's rate is limited by the capacity of its own channel scaled by its time fraction: R1≀θC1R_{1} \leq \theta C_{1} and R2≀(1βˆ’ΞΈ)C2R_{2} \leq (1-\theta) C_{2}. This traces a straight line between (C1,0)(C_{1}, 0) and (0,C2)(0, C_{2}).

Treating interference as noise: Encode for user 2 at rate R2R_{2}, then independently encode for user 1. User 2 treats user 1's signal as additional noise. User 1 treats user 2's signal as noise. This ignores the structure of the interference.

We will see that superposition coding achieves a rate region that is strictly larger than time-sharing whenever the users have different channel qualities. The secret is that the strong user can decode and subtract the weak user's message.

Superposition Coding

An encoding strategy for the broadcast channel where the weak user's message is encoded as a "cloud center" (coarse layer) and the strong user's message is encoded as a "satellite" (fine layer) around the cloud center. The weak user decodes only the cloud center; the strong user decodes both layers.

Related: Broadcast Channel (BC), Cloud-Satellite Structure

Cloud-Satellite Structure

The geometric interpretation of superposition coding. The codebook is organized into "clouds" (one per weak-user message) with "satellites" (one per strong-user message) within each cloud. Each cloud center represents a coarse codeword for the weak user; each satellite represents the fine detail for the strong user.

Related: Superposition Coding

Key Takeaway

The broadcast channel (one-to-many) and the MAC (many-to-one) are dual problems, but their information-theoretic analyses are fundamentally different. The MAC capacity region is known for general channels; the BC capacity region is known only for degraded (and some special) channels. The degraded BC capacity region is achieved by superposition coding β€” an encoding technique with no MAC analogue.

Broadcast Channel: Degraded vs. Non-Degraded

Visualize the broadcast channel model. In the degraded case, user 2's output is obtained by adding more noise to user 1's output. Compare the mutual information quantities for different noise levels.

Parameters
1
3
10

The Degraded Broadcast Channel Model

Visualization of the degraded broadcast channel: one transmitter, two receivers with different noise levels. The Markov chain X→Y1→Y2X \to Y_1 \to Y_2 means user 2 sees a noisier version of user 1's output — everything user 2 can decode, user 1 can too.