Ferkans — Interactive Telecom Tutor

From Uplink to Downlink

In Chapter 14, we studied the multiple access channel — the uplink scenario where multiple transmitters communicate with a single receiver. Now we turn to the dual problem: the broadcast channel (BC), modeling the downlink where a single transmitter sends independent messages to multiple receivers.

Intuitively, the BC should be the "easy" direction: the transmitter knows all messages and can design the transmitted signal jointly. Surprisingly, the BC is harder than the MAC from an information-theoretic standpoint. The capacity region for the general (non-degraded) BC remains unknown to this day! The key difficulty is that each receiver sees a different channel output, and the transmitter must simultaneously serve receivers with different channel qualities.

The breakthrough insight, due to Cover (1972) and Bergmans (1973), is that for degraded broadcast channels — where one receiver's output is a degraded version of the other's — the capacity region can be completely characterized. The achievability technique is superposition coding, one of the most elegant ideas in information theory.

Definition:
Discrete Memoryless Broadcast Channel (DM-BC)

A two-user discrete memoryless broadcast channel (DM-BC) consists of:

A finite input alphabet $\mathcal{X}$ ,
Two finite output alphabets $\mathcal{Y}_1$ and $\mathcal{Y}_2$ ,
A conditional PMF $p(y_1, y_2 | x)$ for each $(x, y_1, y_2) \in \mathcal{X} \times \mathcal{Y}_1 \times \mathcal{Y}_2$ .

The channel is memoryless: when used $n$ times,

$p(y_1^n, y_2^n | x^n) = \prod_{i=1}^{n} p(y_{1,i}, y_{2,i} | x_i).$

The transmitter has two independent messages: $W_1$ uniform over $\{1, \ldots, 2^{nR_{1}}\}$ for receiver 1, and $W_2$ uniform over $\{1, \ldots, 2^{nR_{2}}\}$ for receiver 2. The encoder maps the pair $(W_1, W_2)$ to a codeword $x^n(W_1, W_2) \in \mathcal{X}^n$ .

Receiver $k$ observes $Y_k^n$ and produces an estimate $\hat{W}_k$ . A rate pair $(R_{1}, R_{2})$ is achievable if there exists a sequence of codes such that

$\max\{\Pr(\hat{W}_1 \neq W_1), \Pr(\hat{W}_2 \neq W_2)\} \to 0$

as $n \to \infty$ . The capacity region $\mathcal{C}$ is the closure of all achievable rate pairs.

Notice the asymmetry with the MAC: in the MAC, two independent encoders send to one decoder. In the BC, one encoder sends to two independent decoders. The encoder sees both messages, but each decoder sees only its own channel output.

Broadcast Channel (BC)

A multiuser channel model with one transmitter sending independent messages to two or more receivers. Each receiver observes a (potentially different) noisy version of the transmitted signal and must decode its own message.

Historical Note: Cover's 1972 Broadcast Channel Paper

1970s

Thomas Cover introduced the broadcast channel in his seminal 1972 paper "Broadcast channels," which appeared in the IEEE Transactions on Information Theory. Cover posed the problem of determining the capacity region and proposed superposition coding as a candidate achievability scheme. He showed that for degraded channels, superposition coding achieves the capacity region, and conjectured that this could be extended more broadly.

What makes Cover's contribution remarkable is the conceptual leap: rather than treating interference as noise (the naive approach), he proposed layering the information intended for different users at different "resolutions." This idea — encoding the weak user's message as a coarse description and the strong user's message as a refinement — became the foundation for all subsequent work on broadcast channels.

The converse for the degraded Gaussian case, established by Bergmans (1974), relies on the entropy power inequality. The general (non-degraded) BC capacity region remains one of the great open problems in information theory.

Definition:
Physically and Stochastically Degraded Broadcast Channel

A broadcast channel $p(y_1, y_2 | x)$ is physically degraded if

$X \to Y_1 \to Y_2$

forms a Markov chain, i.e., $p(y_2 | x, y_1) = p(y_2 | y_1)$ . This means receiver 2's output is obtained by passing receiver 1's output through an additional noisy channel. Receiver 1 is the strong user and receiver 2 is the weak user.

More generally, a broadcast channel is stochastically degraded if there exists a channel $p(\tilde{y}_2 | y_1)$ such that

$p(\tilde{y}_2 | x) = \sum_{y_1} p(y_1 | x) p(\tilde{y}_2 | y_1) = p(y_2 | x)$

for all $x$ , i.e., the marginal $p(y_2 | x)$ can be factored through $Y_1$ even if the physical channel does not have this structure.

The capacity region of a stochastically degraded BC equals that of the corresponding physically degraded BC with marginals $p(y_1|x)$ and $p(y_2|x)$ .

The distinction between physical and stochastic degradedness is important: many channels that are not physically degraded are stochastically degraded. For the capacity region, only the marginals $p(y_1|x)$ and $p(y_2|x)$ matter, not the joint $p(y_1, y_2|x)$ .
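The condition in the definition can be checked numerically once a candidate degrading channel is proposed. The following is a minimal sketch, not a general test: it only verifies a given candidate $p(\tilde{y}_2 | y_1)$ and does not search for one. The function names and the binary erasure example (erasure probabilities `e1`, `e3`) are illustrative assumptions, not part of the text above.

```python
def compose(p_y1_given_x, q_y2_given_y1):
    """Cascade two channels given as row-stochastic matrices (lists of rows)."""
    n_y2 = len(q_y2_given_y1[0])
    return [
        [sum(row[y1] * q_y2_given_y1[y1][y2] for y1 in range(len(row)))
         for y2 in range(n_y2)]
        for row in p_y1_given_x
    ]


def is_degrading_channel(p_y1_given_x, p_y2_given_x, q_y2_given_y1, tol=1e-12):
    """Check that sum_{y1} p(y1|x) q(y2|y1) equals p(y2|x) for all x and y2."""
    cascade = compose(p_y1_given_x, q_y2_given_y1)
    return all(
        abs(c - t) <= tol
        for c_row, t_row in zip(cascade, p_y2_given_x)
        for c, t in zip(c_row, t_row)
    )


def bec(eps):
    """Binary erasure channel; output index 2 stands for 'erased'."""
    return [[1 - eps, 0.0, eps], [0.0, 1 - eps, eps]]


# Hypothetical example: user 1 sees BEC(e1), user 2 sees BEC(e2).
e1, e3 = 0.2, 0.25
e2 = e1 + (1 - e1) * e3  # erasure probability after one more erasure stage

# Candidate degrading channel: erase each surviving symbol with probability e3.
extra_erasure = [[1 - e3, 0.0, e3], [0.0, 1 - e3, e3], [0.0, 0.0, 1.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

degraded = is_degrading_channel(bec(e1), bec(e2), extra_erasure)
identity_works = is_degrading_channel(bec(e1), bec(e2), identity)
print(degraded, identity_works)
```

Only the two marginals enter the check, which mirrors the remark that the joint $p(y_1, y_2|x)$ plays no role.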

Degraded Broadcast Channel

A broadcast channel where $X \to Y_1 \to Y_2$ forms a Markov chain, meaning receiver 2's observation is a noisier version of receiver 1's. The capacity region is completely known for degraded BCs via superposition coding.

What Degradedness Means Operationally

The Markov chain $X \to Y_1 \to Y_2$ has a powerful operational implication: anything that receiver 2 can compute from $Y_2^n$ , receiver 1 can also compute from $Y_1^n$ (at least in the information-theoretic sense).

Intuitively, if we think of $Y_1$ as a high-resolution photograph and $Y_2$ as a blurry copy obtained by further degrading $Y_1$ , then any feature visible in the blurry copy is certainly visible in the sharp original. This means the strong user can always decode the weak user's message — a property that superposition coding exploits.

Example: Degraded Binary Symmetric Broadcast Channel

Consider a broadcast channel where $X \in \{0,1\}$ , $Y_1 = X \oplus Z_1$ , and $Y_2 = Y_1 \oplus Z_3$ , where $Z_1 \sim \text{Bernoulli}(p_1)$ and $Z_3 \sim \text{Bernoulli}(p_3)$ are independent. Show that this is a degraded BC and find the crossover probabilities for both users.

Solution

Verify the Markov chain

By construction, $Y_2 = Y_1 \oplus Z_3$ depends on $X$ only through $Y_1$ , so $X \to Y_1 \to Y_2$ is a Markov chain. This is a physically degraded broadcast channel.

Crossover probability for user 1

User 1 sees a BSC with crossover probability $p_1$ :

$\Pr(Y_1 \neq X) = \Pr(Z_1 = 1) = p_1.$

Crossover probability for user 2

User 2 sees $Y_2 = X \oplus Z_1 \oplus Z_3$ . Since $Z_1$ and $Z_3$ are independent Bernoulli, $Z_1 \oplus Z_3$ is Bernoulli with parameter

$p_2 = p_1(1-p_3) + (1-p_1)p_3 = p_1 * p_3,$

where $*$ denotes binary convolution (not multiplication). Assuming $0 < p_1, p_3 < 1/2$, we have $p_2 - p_1 = p_3(1 - 2p_1) > 0$, so $p_2 > p_1$, confirming that user 2 has a worse channel.

Interpretation

User 1's channel is a $\text{BSC}(p_1)$ ; user 2's channel is a $\text{BSC}(p_1 * p_3)$ . Since $p_1 * p_3 > p_1$ (for $p_1, p_3 \in (0, 1/2)$ ), user 2 is indeed the weaker user. The capacity of user 1's channel exceeds that of user 2's: $1 - h_b(p_1) > 1 - h_b(p_1 * p_3)$ .
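As a numerical check of this example, the sketch below evaluates the binary convolution and the two single-user BSC capacities. The values $p_1 = p_3 = 0.1$ and the function names are illustrative assumptions.

```python
import math


def binary_convolution(a, b):
    """Crossover probability of two cascaded BSCs: a(1-b) + (1-a)b."""
    return a * (1 - b) + (1 - a) * b


def binary_entropy(p):
    """Binary entropy h_b(p) in bits, with h_b(0) = h_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(p):
    """Capacity of a BSC with crossover probability p, in bits per use."""
    return 1.0 - binary_entropy(p)


p1, p3 = 0.1, 0.1                   # illustrative crossover probabilities
p2 = binary_convolution(p1, p3)     # effective crossover seen by user 2
c1, c2 = bsc_capacity(p1), bsc_capacity(p2)
print(f"p2 = {p2:.3f}, C1 = {c1:.3f}, C2 = {c2:.3f}")
```

With these values $p_2 = 0.18$, and the capacities come out to roughly 0.53 and 0.32 bits per channel use, so user 1 is the stronger user as claimed.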

Example: The Gaussian Broadcast Channel is Stochastically Degraded

Consider the Gaussian BC: $Y_1 = X + Z_{1}$ and $Y_2 = X + Z_{2}$ , where $Z_{1} \sim \mathcal{N}(0, N_1)$ and $Z_{2} \sim \mathcal{N}(0, N_2)$ are independent with $N_1 < N_2$ . Show that this channel is stochastically degraded.

Solution

Construct the degrading channel

We need to find a channel $p(\tilde{y}_2 | y_1)$ such that $\tilde{Y}_2$ has the same marginal distribution as $Y_2$ given $X$ .

Let $\tilde{Y}_2 = Y_1 + Z_{3}$ where $Z_{3} \sim \mathcal{N}(0, N_2 - N_1)$ , independent of everything else. Then:

$\tilde{Y}_2 = X + Z_{1} + Z_{3}.$

Verify the marginal

Since $Z_{1}$ and $Z_{3}$ are independent Gaussians, $Z_{1} + Z_{3} \sim \mathcal{N}(0, N_1 + (N_2 - N_1)) = \mathcal{N}(0, N_2)$ .

So $\tilde{Y}_2 | X = x \sim \mathcal{N}(x, N_2)$ , which matches the distribution of $Y_2 | X = x$ . The channel is stochastically degraded.

Note on physical degradation

The original channel is not physically degraded because $Z_{1}$ and $Z_{2}$ are independent — $Y_2$ is not obtained by adding noise to $Y_1$ . However, the stochastically degraded model $X \to Y_1 \to \tilde{Y}_2$ has the same marginals, so the capacity regions are identical.

This is a general and important principle: for capacity analysis, we can always replace a stochastically degraded BC with an equivalent physically degraded one.
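A quick simulation can make the Gaussian construction concrete. The sketch below draws the noise seen by user 2 in the original channel ($Z_2$) and in the cascaded model ($Z_1 + Z_3$) and compares their empirical variances. It only checks the variance; the match in distribution follows from the fact that a sum of independent Gaussians is Gaussian. The values $N_1 = 1$, $N_2 = 3$, the sample count, and the function names are illustrative assumptions.

```python
import random


def sample_variance(values):
    """Empirical variance (population form) of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def simulate_noise_variances(n1, n2, num_samples=50_000, seed=0):
    """Return empirical variances of Z2 and of Z1 + Z3, where Var(Z3) = n2 - n1."""
    rng = random.Random(seed)
    direct, cascaded = [], []
    for _ in range(num_samples):
        z1 = rng.gauss(0.0, n1 ** 0.5)
        z2 = rng.gauss(0.0, n2 ** 0.5)
        z3 = rng.gauss(0.0, (n2 - n1) ** 0.5)
        direct.append(z2)           # noise in Y2 = X + Z2
        cascaded.append(z1 + z3)    # noise in Y2-tilde = Y1 + Z3
    return sample_variance(direct), sample_variance(cascaded)


var_direct, var_cascaded = simulate_noise_variances(1.0, 3.0)
print(var_direct, var_cascaded)
```

Both estimates should be close to $N_2 = 3$, up to sampling error.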

Quick Check

Consider a broadcast channel where $Y_1 = X + Z_1$ and $Y_2 = aX + Z_2$ with $a > 0$ , $Z_1 \sim \mathcal{N}(0, N_1)$ , and $Z_2 \sim \mathcal{N}(0, N_2)$ , all independent. Under what condition is this channel stochastically degraded with user 1 stronger?

$N_1 < N_2$ always

$P/N_1 > a^2 P / N_2$ , i.e., $N_2/N_1 > a^2$

$a < 1$

The channel is always degraded regardless of parameters

Correct answer: $P/N_1 > a^2 P / N_2$, i.e., $N_2/N_1 > a^2$

User 1 is stronger when $\text{SNR}_1 > \text{SNR}_2$, i.e., $P/N_1 > a^2 P/N_2$, where $P$ is the transmit power; this gives $N_2 > a^2 N_1$. The channel can be reduced to the standard form by dividing user 2's output by $a$, which leaves an effective noise variance of $N_2/a^2$.
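The normalization step can be sketched in a few lines. Dividing $Y_2 = aX + Z_2$ by $a$ gives $X + Z_2/a$, so the comparison reduces to $N_1$ against $N_2/a^2$. The function names and numeric values below are illustrative assumptions.

```python
def effective_noise_user2(a, n2):
    """Noise variance of Y2 / a = X + Z2 / a, i.e. N2 / a^2."""
    return n2 / (a * a)


def user1_is_stronger(a, n1, n2):
    """True when user 1 has the smaller effective noise variance."""
    return n1 < effective_noise_user2(a, n2)


# N1 < N2 in both cases, yet the ordering of the users depends on the gain a.
weak_gain = user1_is_stronger(a=0.5, n1=1.0, n2=3.0)    # effective noise 12
strong_gain = user1_is_stronger(a=2.0, n1=1.0, n2=3.0)  # effective noise 0.75
print(weak_gain, strong_gain)
```

The second case shows why "$N_1 < N_2$ always" is not the right condition: a large enough gain $a$ makes user 2 the stronger user even with the larger noise variance.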

Common Mistake: Not All Broadcast Channels Are Degraded

Mistake:

Assuming that because one receiver has higher noise variance, the broadcast channel must be degraded.

Correction:

Degradedness requires the Markov chain $X \to Y_1 \to Y_2$ , which means that user 2's observation can be simulated from user 1's. This holds for the Gaussian BC (stochastically) and many other channels, but not in general. For example, consider a channel where $Y_1 = X + Z_1$ and $Y_2 = f(X) + Z_2$ for a non-invertible function $f$ . Even if $Z_2$ has larger variance, user 2 might observe a different "projection" of $X$ that cannot be obtained from $Y_1$ .

The general (non-degraded) BC capacity region is unknown and requires more sophisticated techniques such as Marton coding.

Definition:
Broadcast Channel Code

A $(2^{nR_{1}}, 2^{nR_{2}}, n)$ broadcast code for the DM-BC consists of:

Encoder: A mapping $f: \{1,\ldots,2^{nR_{1}}\} \times \{1,\ldots,2^{nR_{2}}\} \to \mathcal{X}^n$ ,
Decoder 1: A mapping $g_1: \mathcal{Y}_1^n \to \{1,\ldots,2^{nR_{1}}\}$ ,
Decoder 2: A mapping $g_2: \mathcal{Y}_2^n \to \{1,\ldots,2^{nR_{2}}\}$ .

The average probability of error is

$P_e^{(n)} = \Pr(g_1(Y_1^n) \neq W_1 \text{ or } g_2(Y_2^n) \neq W_2),$

where $(W_1, W_2)$ is uniform over the message set.
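To make the roles of $f$, $g_1$, $g_2$ and $P_e^{(n)}$ concrete, here is a toy sketch of a broadcast code with $n = 6$ and one bit per user ($R_1 = R_2 = 1/6$). It is a deliberately simple repetition scheme, not a good code, run over a degraded binary symmetric BC of the form $Y_1 = X \oplus Z_1$, $Y_2 = Y_1 \oplus Z_3$. Messages are labeled 0 and 1 rather than 1 and 2 for convenience; all names and parameter values are illustrative assumptions.

```python
import random


def encode(w1, w2):
    """Encoder f: repeat W1 in the first three uses and W2 in the last three."""
    return [w1] * 3 + [w2] * 3


def majority(bits):
    """Majority vote over three bits."""
    return 1 if sum(bits) >= 2 else 0


def decode1(y1):
    """Decoder g1: uses only Y1^n (its first three symbols)."""
    return majority(y1[:3])


def decode2(y2):
    """Decoder g2: uses only Y2^n (its last three symbols)."""
    return majority(y2[3:])


def channel(x, p1, p3, rng):
    """Memoryless degraded BC: Y1 = X xor Z1, then Y2 = Y1 xor Z3."""
    y1 = [xi ^ int(rng.random() < p1) for xi in x]
    y2 = [yi ^ int(rng.random() < p3) for yi in y1]
    return y1, y2


def estimate_error(p1, p3, trials=20_000, seed=1):
    """Monte Carlo estimate of Pr(g1(Y1^n) != W1 or g2(Y2^n) != W2)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        w1, w2 = rng.randrange(2), rng.randrange(2)  # uniform message pair
        y1, y2 = channel(encode(w1, w2), p1, p3, rng)
        if decode1(y1) != w1 or decode2(y2) != w2:
            errors += 1
    return errors / trials


pe_hat = estimate_error(p1=0.1, p3=0.1)
print(pe_hat)
```

For $p_1 = p_3 = 0.1$ the majority-vote error probabilities are about 0.028 for user 1 and 0.086 for user 2, so the joint error probability is about 0.11; the estimate should land near that value.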

Naive Approaches and Their Limitations

Before developing superposition coding, let us consider two naive strategies and see why they are suboptimal:

Time-sharing (TDMA): Allocate a fraction $\theta$ of the time to user 1 and $(1-\theta)$ to user 2. Each user's rate is limited by the capacity of its own channel scaled by its time fraction: $R_{1} \leq \theta C_{1}$ and $R_{2} \leq (1-\theta) C_{2}$ . This traces a straight line between $(C_{1}, 0)$ and $(0, C_{2})$ .

Treating interference as noise: Encode for user 2 at rate $R_{2}$, then independently encode for user 1. User 2 treats user 1's signal as additional noise, and user 1 treats user 2's signal as noise. This ignores the structure of the interference: user 2's signal is a codeword, not random noise, so the strong user could decode it instead of suffering from it.

We will see that superposition coding achieves a rate region that is strictly larger than time-sharing whenever the users have different channel qualities. The secret is that the strong user can decode and subtract the weak user's message.
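To get a feel for the size of this gap, the sketch below compares time-sharing with superposition coding on a Gaussian BC. It assumes the standard Gaussian superposition rates $R_1 \leq \frac{1}{2}\log_2(1 + \alpha P/N_1)$ and $R_2 \leq \frac{1}{2}\log_2\left(1 + \frac{(1-\alpha)P}{\alpha P + N_2}\right)$, which are not derived in this section. Time-sharing is taken to use full power $P$ in each slot. The power split $\alpha$, the function names, and the values $P = 10$, $N_1 = 1$, $N_2 = 3$ are illustrative assumptions.

```python
import math


def awgn_capacity(snr):
    """Capacity of a real AWGN channel in bits per use."""
    return 0.5 * math.log2(1.0 + snr)


def time_sharing_r2(r1, p, n1, n2):
    """Best R2 under time-sharing when user 1 must get rate r1."""
    c1 = awgn_capacity(p / n1)
    c2 = awgn_capacity(p / n2)
    theta = r1 / c1  # fraction of time needed by user 1
    return (1.0 - theta) * c2


def superposition_rates(alpha, p, n1, n2):
    """Assumed Gaussian superposition rates with power alpha*p for user 1."""
    r1 = awgn_capacity(alpha * p / n1)                        # after cancelling user 2
    r2 = awgn_capacity((1.0 - alpha) * p / (alpha * p + n2))  # user 1's signal as noise
    return r1, r2


P, N1, N2 = 10.0, 1.0, 3.0
comparison = []
for alpha in (0.1, 0.3, 0.5, 0.7, 0.9):
    r1, r2_sc = superposition_rates(alpha, P, N1, N2)
    r2_ts = time_sharing_r2(r1, P, N1, N2)
    comparison.append((alpha, r1, r2_sc, r2_ts))
    print(f"alpha={alpha:.1f}  R1={r1:.3f}  R2(superposition)={r2_sc:.3f}  R2(time-sharing)={r2_ts:.3f}")
```

For every power split tried here, superposition coding gives user 2 a higher rate than time-sharing at the same $R_1$.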

Superposition Coding

An encoding strategy for the broadcast channel where the weak user's message is encoded as a "cloud center" (coarse layer) and the strong user's message is encoded as a "satellite" (fine layer) around the cloud center. The weak user decodes only the cloud center; the strong user decodes both layers.

Cloud-Satellite Structure

The geometric interpretation of superposition coding. The codebook is organized into "clouds" (one per weak-user message) with "satellites" (one per strong-user message) within each cloud. Each cloud center represents a coarse codeword for the weak user; each satellite represents the fine detail for the strong user.

Related: Superposition Coding

Why This Matters: The Broadcast Channel as the Cellular Downlink

The broadcast channel directly models the cellular downlink: the base station (transmitter) sends independent data streams to multiple mobile users (receivers). In a typical cell, users are at different distances from the base station and therefore experience different path losses, making the channel approximately degraded.

Superposition coding — under the name NOMA (Non-Orthogonal Multiple Access) — has been studied extensively for 5G. The base station allocates more power to the cell-edge (weak) user and less power to the cell-center (strong) user. The strong user performs successive interference cancellation (SIC) to first decode and subtract the weak user's message, then decode its own.

While practical NOMA implementations face challenges (imperfect SIC, channel estimation errors, user pairing), the information-theoretic superiority of superposition coding over orthogonal access (OFDMA) is clear: superposition coding achieves the capacity region, while OFDMA achieves only a subset.

See full treatment in Chapter 16

Key Takeaway

The broadcast channel (one-to-many) and the MAC (many-to-one) are dual problems, but their information-theoretic analyses are fundamentally different. The MAC capacity region is known for general channels; the BC capacity region is known only for degraded (and some special) channels. The degraded BC capacity region is achieved by superposition coding — an encoding technique with no MAC analogue.

Broadcast Channel: Degraded vs. Non-Degraded

Visualize the broadcast channel model. In the degraded case, user 2's output is obtained by adding more noise to user 1's output. Compare the mutual information quantities for different noise levels.

Parameters: $N_1$ (strong user noise) = 1, $N_2$ (weak user noise) = 3, $P$ (transmit power) = 10

The Degraded Broadcast Channel Model

Visualization of the degraded broadcast channel: one transmitter, two receivers with different noise levels. The Markov chain $X \to Y_1 \to Y_2$ means user 2 sees a noisier version of user 1's output, so everything user 2 can decode, user 1 can too.

The Broadcast Channel Model