Ferkans — Interactive Telecom Tutor

The Broadcast Channel: One Transmitter, Many Receivers

The broadcast channel (BC) is the dual of the MAC: a single transmitter sends independent messages to $K$ receivers. This models the cellular downlink, Wi-Fi AP-to-STA transmission, and satellite broadcasting. Unlike the MAC, the capacity region of the general BC is still unknown; it has been characterised only for special classes. The key breakthroughs were superposition coding for the degraded BC (Cover, 1972) and the proof that dirty-paper coding (DPC) achieves the capacity region of the Gaussian MIMO BC (Weingarten, Steinberg, and Shamai, 2006). The MIMO BC capacity region also exhibits a duality with the MAC: it can be computed via the MAC-BC duality, which transforms a non-convex problem into a convex one.

Definition: Degraded Broadcast Channel

A two-user broadcast channel is physically degraded if the channel outputs form a Markov chain:

$X \to Y_1 \to Y_2$

meaning user 2's observation is a degraded version of user 1's.

Gaussian degraded BC: The transmitter sends $X$ with power constraint $E[X^2] \leq P$ . User $k$ observes:

$Y_k = X + Z_k, \quad Z_k \sim \mathcal{N}(0, N_k)$

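with $N_1 < N_2$ (user 1 has a better channel). This is a degraded BC because, given $X$ , $Y_2$ has the same conditional distribution as $Y_1 + Z'$ , where $Z' \sim \mathcal{N}(0, N_2 - N_1)$ is independent of $X$ and $Z_1$ . Writing $Y_2 = Y_1 + (Z_2 - Z_1)$ is not enough on its own, because $Z_2 - Z_1$ is in general correlated with $Z_1$ .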

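The BC is stochastically degraded if there exists a conditional distribution $p'(y_2 | y_1)$ such that $p(y_2 | x) = \sum_{y_1} p'(y_2 | y_1) p(y_1 | x)$ , even if the physical channel does not form a Markov chain. Because the capacity region of a BC depends only on the marginals $p(y_k | x)$ , a stochastically degraded BC has the same capacity region as the corresponding physically degraded one.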

The degraded BC model applies directly to the scalar Gaussian downlink where users experience different path losses. The MIMO BC is generally not degraded, requiring the more powerful DPC approach.


Theorem: Capacity Region of the Degraded Gaussian BC

The capacity region of the two-user degraded Gaussian BC ( $N_1 < N_2$ , power $P$ ) is the union over $0 \leq \alpha \leq 1$ of:

$R_1 \leq \frac{1}{2}\log_2\!\left(1 + \frac{(1-\alpha)P}{N_1}\right)$

$R_2 \leq \frac{1}{2}\log_2\!\left(1 + \frac{\alpha P}{(1-\alpha)P + N_2}\right) = \frac{1}{2}\log_2\!\left(\frac{N_2 + P}{N_2 + (1-\alpha)P}\right)$

where $\alpha P$ is the power allocated to user 2's message and $(1-\alpha)P$ to user 1's message in the superposition code.

The boundary is a concave curve parameterised by $\alpha \in [0,1]$ :

$\alpha = 0$ : all power to user 1, $R_1 = C_1 = \frac{1}{2}\log_2(1 + P/N_1)$ , $R_2 = 0$ .
$\alpha = 1$ : all power to user 2, $R_1 = 0$ , $R_2 = \frac{1}{2}\log_2(1 + P/N_2)$ .

Superposition coding encodes two messages at different "layers." User 2 (weaker) decodes only its own message, treating user 1's signal as noise. User 1 (stronger) first decodes user 2's message (which it can do because it has a better channel), subtracts it, and then decodes its own message interference-free. The power split $\alpha$ controls the rate trade-off.
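The decoding rule just described can be turned into a small boundary calculator. This is a minimal sketch, not a reference implementation: the function name `degraded_bc_rates` and the parameter values are assumptions chosen for illustration, and `alpha` follows the convention above (fraction of power given to the weaker user 2).

```python
import math

def degraded_bc_rates(P, N1, N2, alpha):
    """Boundary rate pair (R1, R2) in bits per channel use.

    alpha is the fraction of power given to user 2 (the weaker user, N2 > N1).
    """
    p_weak = alpha * P          # cloud-centre layer (user 2)
    p_strong = (1 - alpha) * P  # satellite layer (user 1)
    # User 1 removes user 2's layer first, so it sees only its own noise.
    r1 = 0.5 * math.log2(1 + p_strong / N1)
    # User 2 cannot remove user 1's layer and treats it as extra noise.
    r2 = 0.5 * math.log2(1 + p_weak / (p_strong + N2))
    return r1, r2

# Illustrative values (assumed, not prescribed by the theorem).
P, N1, N2 = 10.0, 1.0, 3.0
boundary = [(a / 10, *degraded_bc_rates(P, N1, N2, a / 10)) for a in range(11)]
for alpha, r1, r2 in boundary:
    print(f"alpha={alpha:.1f}  R1={r1:.3f}  R2={r2:.3f}  sum={r1 + r2:.3f}")
```

Sweeping `alpha` from 0 to 1 traces the boundary from the point where user 1 gets everything to the point where user 2 gets everything.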

Proof

Achievability: Superposition coding

Codebook generation: Generate $2^{nR_2}$ cloud centre codewords $\mathbf{u}(w_2) \sim \mathcal{N}(0, \alpha P)$ i.i.d.

For each cloud centre $\mathbf{u}(w_2)$ , generate $2^{nR_1}$ satellite codewords: $\mathbf{x}(w_1, w_2) = \mathbf{u}(w_2) + \mathbf{v}(w_1)$ where $\mathbf{v}(w_1) \sim \mathcal{N}(0, (1-\alpha)P)$ i.i.d., independent of $\mathbf{u}$ .

Decoding at user 2 (weaker): Decode $w_2$ by finding the cloud centre $\mathbf{u}(\hat{w}_2)$ jointly typical with $\mathbf{y}_2$ . The satellite $\mathbf{v}$ acts as additional Gaussian noise of power $(1-\alpha)P$ , so decoding is reliable if $R_2 < \frac{1}{2}\log_2(1 + \alpha P/((1-\alpha)P + N_2))$ .

This interference term cannot be dropped: user 2 never decodes $\mathbf{v}$ , so the interference-free rate $\frac{1}{2}\log_2(1 + \alpha P/N_2)$ is attained only at $\alpha = 1$ .

Decoding at user 1 (stronger):

Decode $w_2$ first (possible since user 1 has a better channel): reliable if $R_2 < \frac{1}{2}\log_2(1 + \alpha P/((1-\alpha)P + N_1))$ .
Subtract $\mathbf{u}(\hat{w}_2)$ from $\mathbf{y}_1$ .
Decode $w_1$ from $\mathbf{y}_1 - \mathbf{u}(\hat{w}_2) = \mathbf{v}(w_1) + \mathbf{z}_1$ : reliable if $R_1 < \frac{1}{2}\log_2(1 + (1-\alpha)P/N_1)$ .

Since $N_1 < N_2$ , user 1's constraint on $R_2$ is automatically satisfied whenever user 2's is.
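The claim that user 1's constraint on $R_2$ is never the binding one can be checked numerically. The following is a small sketch under assumed parameter values; the helper name `r2_limit` is hypothetical.

```python
import math

def r2_limit(alpha, P, noise):
    """Largest R2 (bits per channel use) at which a receiver with the given
    noise variance can decode the cloud centre while treating the satellite
    layer, of power (1 - alpha) * P, as noise."""
    return 0.5 * math.log2(1 + alpha * P / ((1 - alpha) * P + noise))

P, N1, N2 = 10.0, 1.0, 3.0  # assumed illustrative values with N1 < N2
for step in range(1, 11):
    alpha = step / 10
    at_user1 = r2_limit(alpha, P, N1)  # constraint for the SIC step at user 1
    at_user2 = r2_limit(alpha, P, N2)  # constraint at user 2 (the binding one)
    print(f"alpha={alpha:.1f}  user 1 limit={at_user1:.3f}  user 2 limit={at_user2:.3f}")
```

For every `alpha` the limit at user 1 exceeds the limit at user 2, so any $R_2$ that user 2 can decode is also decodable at user 1.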

Converse: Entropy power inequality

The converse uses the entropy power inequality (EPI). For any code with $P_e^{(n)} \to 0$ :

$nR_2 \leq I(W_2; Y_2^n) + n\epsilon_n$

Since $X \to Y_1 \to Y_2$ (degraded), data processing gives $I(W_2; Y_2^n) \leq I(W_2; Y_1^n)$ .

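Using the EPI-based technique of Bergmans (1974): since $h(Z_2^n) \leq h(Y_2^n | W_2) \leq h(Y_2^n) \leq \frac{n}{2}\log_2(2\pi e(P + N_2))$ , there exists an $\alpha \in [0,1]$ such that $h(Y_2^n | W_2) = \frac{n}{2}\log_2(2\pi e((1-\alpha)P + N_2))$ . Hence

$R_2 \leq \frac{1}{n}\left[h(Y_2^n) - h(Y_2^n | W_2)\right] + \epsilon_n \leq \frac{1}{2}\log_2\!\left(1 + \frac{\alpha P}{(1-\alpha)P + N_2}\right) + \epsilon_n$

The key step is bounding $R_1$ with the same $\alpha$ . Writing $Y_2^n = Y_1^n + \tilde{Z}^n$ with $\tilde{Z}^n$ i.i.d. $\mathcal{N}(0, N_2 - N_1)$ and independent of $Y_1^n$ , the conditional EPI gives

$2^{2h(Y_2^n | W_2)/n} \geq 2^{2h(Y_1^n | W_2)/n} + 2\pi e (N_2 - N_1)$

so $h(Y_1^n | W_2) \leq \frac{n}{2}\log_2(2\pi e((1-\alpha)P + N_1))$ . Combined with Fano's inequality, $nR_1 \leq I(W_1; Y_1^n | W_2) + n\epsilon_n = h(Y_1^n | W_2) - h(Z_1^n) + n\epsilon_n$ , this yields

$R_1 \leq \frac{1}{2}\log_2\!\left(1 + \frac{(1-\alpha)P}{N_1}\right) + \epsilon_n$ . $\blacksquare$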


Definition: Dirty-Paper Coding (DPC)

Dirty-paper coding (Costa, 1983) addresses the channel:

$Y = X + S + Z$

where $S$ is interference known non-causally at the transmitter (but not at the receiver), $X$ has power constraint $P$ , and $Z \sim \mathcal{N}(0, N)$ .

Costa's theorem: The capacity is:

$C = \frac{1}{2}\log_2\!\left(1 + \frac{P}{N}\right)$

— the same as if $S$ were absent. The transmitter can completely pre-cancel the known interference without additional power, by encoding against it using a structured lattice or random binning code.

The optimal auxiliary random variable is $U = X + \alpha S$ where $\alpha = P/(P + N)$ , and the encoding is based on random binning in the joint typicality framework.

DPC is named after the analogy: writing on dirty paper (known stains $S$ ) does not reduce the amount of information that can be conveyed, as long as the writer knows the stain pattern. DPC is the information-theoretic foundation of precoding at the base station in MIMO downlink (Chapter 17).
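The pre-cancellation idea can be illustrated with a one-dimensional modulo scheme (Tomlinson-Harashima-style precoding with the MMSE scaling $\alpha = P/(P+N)$ and a shared dither). This is a swapped-in scalar stand-in for the lattice construction, not Costa's random-binning proof, and it does not achieve capacity because of its shaping loss; the interval width, noise variance, and interference power are assumed values.

```python
import numpy as np

def mod_centered(x, delta):
    """Reduce x into the interval [-delta/2, delta/2)."""
    return (x + delta / 2) % delta - delta / 2

rng = np.random.default_rng(0)
n = 200_000
delta = 4.0                 # modulo interval width (assumed)
P = delta**2 / 12           # power of a uniform variable on that interval
N = 0.01                    # noise variance (assumed)
alpha = P / (P + N)         # MMSE scaling, as in U = X + alpha * S

v = rng.uniform(-delta / 2, delta / 2, n)   # stand-in for the coded symbol
d = rng.uniform(-delta / 2, delta / 2, n)   # dither known to both ends
s = rng.normal(0.0, 10.0, n)                # interference, power 100 >> P
z = rng.normal(0.0, np.sqrt(N), n)          # channel noise

x = mod_centered(v - alpha * s - d, delta)  # transmitter pre-subtracts alpha * s
y = x + s + z                               # channel Y = X + S + Z
v_hat = mod_centered(alpha * y + d, delta)  # receiver never sees s
err = mod_centered(v_hat - v, delta)

tx_power = np.mean(x**2)     # stays near P however strong s is
eff_noise = np.mean(err**2)  # close to P * N / (P + N), independent of s
print(f"transmit power {tx_power:.3f} (target {P:.3f})")
print(f"effective noise {eff_noise:.5f} (P*N/(P+N) = {P * N / (P + N):.5f})")
```

The transmit power is set by the modulo interval alone, and the residual error at the receiver contains no trace of $S$ , which is the behaviour the dirty-paper analogy describes.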

Theorem: MAC-BC Duality and MIMO BC Capacity

Consider the MIMO BC with $n_t$ transmit antennas, $K$ single-antenna users, and channels $\mathbf{h}_k^H$ (row vectors). The capacity region of the MIMO BC under sum power constraint $P$ is:

$\mathcal{C}_{\mathrm{BC}} = \bigcup_{\substack{\mathbf{K}_1, \ldots, \mathbf{K}_K \succeq 0 \\ \mathrm{tr}(\sum_k \mathbf{K}_k) \leq P}} \left\{ (R_1, \ldots, R_K) : R_k \leq \log_2\!\left(\frac{|\mathbf{I} + \sum_{j \leq k} \mathbf{h}_k^H \mathbf{K}_j \mathbf{h}_k|}{|\mathbf{I} + \sum_{j < k} \mathbf{h}_k^H \mathbf{K}_j \mathbf{h}_k|}\right) \right\}$

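achieved by DPC with encoding order $K, K-1, \ldots, 1$ . The full capacity region is the convex hull of the union of these regions over all $K!$ encoding orders.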

MAC-BC duality (Vishwanath, Jindal, Goldsmith 2003): The capacity region of the MIMO BC with sum power $P$ equals the capacity region of the dual MIMO MAC (same channels, reversed direction) with sum power $P$ :

$\mathcal{C}_{\mathrm{BC}}(P) = \bigcup_{\substack{p_1, \ldots, p_K \geq 0 \\ \sum_k p_k \leq P}} \mathcal{C}_{\mathrm{MAC}}(p_1, \ldots, p_K)$

This duality transforms the non-convex BC optimisation into the convex MAC problem, enabling efficient computation.

The MAC-BC duality states that any rate tuple achievable on the BC with DPC is also achievable on the dual MAC with SIC, and vice versa, under the same sum power constraint. This is remarkable because the BC uses DPC (non-linear precoding) while the MAC uses SIC (non-linear decoding), yet they achieve the same rate region.
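One practical consequence of the duality is that the BC sum capacity can be found by maximising the dual-MAC sum rate $\log_2 \left|\mathbf{I} + \sum_k p_k \mathbf{h}_k \mathbf{h}_k^H\right|$ over the power split. The sketch below does this by brute-force grid search for two users; the random channel, the grid resolution, and the function name `mac_sum_rate` are assumptions, and a real solver would use a convex-optimisation routine instead of a grid.

```python
import numpy as np

def mac_sum_rate(H, p):
    """Dual-MAC sum rate in bits: log2 det(I + sum_k p_k h_k h_k^H).

    H has one row per user; row k holds h_k^H. Noise variance is 1.
    """
    n_t = H.shape[1]
    # H^H diag(p) H equals sum_k p_k h_k h_k^H.
    G = np.eye(n_t) + H.conj().T @ np.diag(p) @ H
    _, logdet = np.linalg.slogdet(G)
    return float(logdet / np.log(2))

rng = np.random.default_rng(1)
n_t, K, P = 4, 2, 10.0  # assumed example dimensions and sum power
H = (rng.standard_normal((K, n_t)) + 1j * rng.standard_normal((K, n_t))) / np.sqrt(2)

# Grid search over the power split (p1, P - p1); the sum rate is concave in p.
grid = np.linspace(0.0, P, 201)
rates = [mac_sum_rate(H, np.array([p1, P - p1])) for p1 in grid]
best = int(np.argmax(rates))
bc_sum_capacity = rates[best]
single_user = [float(np.log2(1 + P * np.linalg.norm(H[k])**2)) for k in range(K)]

print(f"best split p1={grid[best]:.2f}, p2={P - grid[best]:.2f}")
print(f"sum capacity ~ {bc_sum_capacity:.3f} bits, single-user rates {single_user}")
```

The end points of the grid reproduce the two single-user rates, so the maximum can never fall below serving the better user alone.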

Proof

BC achievability via DPC

The DPC scheme encodes users in order $K, K-1, \ldots, 1$ . User $K$ is encoded first (as a standard Gaussian code). User $K-1$ uses DPC to pre-cancel user $K$ 's interference (known at the transmitter). User $k$ uses DPC to pre-cancel the interference from users $k+1, \ldots, K$ .

User $k$ 's rate is determined by the effective SINR after DPC pre-cancellation: $R_k = \log_2\!\left(1 + \frac{\mathbf{h}_k^H \mathbf{K}_k \mathbf{h}_k}{1 + \sum_{j < k} \mathbf{h}_k^H \mathbf{K}_j \mathbf{h}_k}\right)$
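The rate expression above can be evaluated directly for given covariance matrices. This is a minimal sketch with unit noise variance; the function name `dpc_rates` and the toy two-user channel are hypothetical and chosen so the result is easy to verify by hand.

```python
import numpy as np

def dpc_rates(H, covs):
    """Per-user DPC rates in bits per channel use (unit noise variance).

    H    : K x n_t array whose row k is the channel h_k^H of user k.
    covs : list of K positive semidefinite n_t x n_t covariance matrices K_k.
    Encoding order K, K-1, ..., 1: user k is pre-coded against users j > k,
    so only users j < k remain as interference.
    """
    rates = []
    for k, row in enumerate(H):
        # Received power at user k from each user's signal: h_k^H K_j h_k.
        power = [float(np.real(row @ Kj @ row.conj())) for Kj in covs]
        interference = sum(power[:k])
        rates.append(float(np.log2(1 + power[k] / (1 + interference))))
    return rates

# Hypothetical 2-user, 2-antenna example with real channels.
H = np.array([[1.0, 0.0],
              [1.0, 1.0]])
covs = [np.diag([2.0, 0.0]), np.diag([0.0, 3.0])]
print(dpc_rates(H, covs))
```

In this toy case user 1 sees no interference (SINR 2), while user 2 receives signal power 3 against interference 2 plus unit noise (SINR 1).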

Duality transformation

The key insight is the SINR duality: for any set of BC covariance matrices $\{\mathbf{K}_k\}$ with $\sum_k \mathrm{tr}(\mathbf{K}_k) \leq P$ , there exist MAC powers $\{p_k\}$ with $\sum_k p_k \leq P$ that achieve the same SINRs (and hence rates) for all users.

The transformation uses the uplink-downlink duality of beamforming: the optimal BC beamformers are the same as the optimal MAC receive filters (up to scaling), and the power allocation can be computed by solving a system of linear equations. $\blacksquare$


Superposition Coding for the Broadcast Channel

Visualise the two-layer superposition coding strategy: the transmitter encodes messages at different power levels (cloud centres for the weak user, satellites for the strong user). The strong user decodes the weak user's message first, subtracts it, then decodes its own message interference-free.

Superposition coding: user 2 (weak) gets power $\alpha P$ in cloud centres; user 1 (strong) gets $(1-\alpha)P$ as satellites.

Degraded Gaussian BC Capacity Region

Visualise the capacity region of the two-user degraded Gaussian broadcast channel. The curve shows the achievable rate pairs $(R_1, R_2)$ parameterised by the power split $\alpha$ . User 1 has noise variance $N_1$ (stronger user) and user 2 has $N_2$ (weaker user). Adjust the total power $P$ and noise variances to see how the region changes. The slider $\alpha$ highlights the current operating point on the boundary.

Parameters: total power $P = 10$ ; user 1 noise variance $N_1 = 1$ (strong user); user 2 noise variance $N_2 = 3$ (weak user); power split $\alpha = 0.3$ (fraction to user 2).

MIMO BC Sum Capacity

Compute and visualise the sum capacity of the MIMO broadcast channel as a function of SNR. The plot compares DPC (optimal), zero-forcing beamforming, and time-division among the $K$ users. Adjust the number of transmit antennas $n_t$ , number of users $K$ , and SNR to observe the multiplexing gain. At high SNR, the ZF sum rate grows with the same slope as DPC ( $\min(n_t, K)$ times $\log_2(\text{SNR})$ ), while the TDMA sum rate grows only as $\log_2(\text{SNR})$ regardless of $K$ .

Parameters: transmit antennas $n_t = 4$ ; number of users $K = 2$ ; SNR $= 15$ dB.
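The high-SNR slopes of the two simpler schemes can be reproduced in a few lines. This sketch covers only zero-forcing with equal power per user and best-user TDMA; the DPC curve needs an optimisation over covariances and is left out. The random channel, unit noise variance, equal power split, and function names are all assumptions.

```python
import numpy as np

def zf_sum_rate(H, P):
    """Zero-forcing sum rate (bits) with equal power P/K per user, K <= n_t."""
    K = H.shape[0]
    # Right pseudo-inverse: H @ W = I, so each beam nulls the other users.
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    # Effective gain |h_k^H b_k|^2 for the unit-norm beam b_k = w_k / ||w_k||.
    gains = 1.0 / np.sum(np.abs(W)**2, axis=0)
    return float(np.sum(np.log2(1 + (P / K) * gains)))

def tdma_sum_rate(H, P):
    """Best-user TDMA: full power and matched beamforming to the strongest user."""
    return float(np.max(np.log2(1 + P * np.sum(np.abs(H)**2, axis=1))))

rng = np.random.default_rng(2)
n_t, K = 4, 2  # assumed dimensions
H = (rng.standard_normal((K, n_t)) + 1j * rng.standard_normal((K, n_t))) / np.sqrt(2)

for snr_db in (0, 10, 20, 30, 40):
    P = 10 ** (snr_db / 10)
    print(f"{snr_db:2d} dB  ZF={zf_sum_rate(H, P):6.2f}  TDMA={tdma_sum_rate(H, P):6.2f}")
```

Each additional 10 dB adds roughly $K \log_2 10$ bits to the ZF sum rate but only $\log_2 10$ bits to TDMA, which is the multiplexing-gain difference the plot is meant to show.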

Example: Superposition Coding Rate Computation

A base station with power $P = 20$ serves two users with noise variances $N_1 = 1$ (near user) and $N_2 = 10$ (far user).

(a) Compute the capacity region boundary for $\alpha \in \{0, 0.3, 0.5, 0.7, 1\}$ . (b) What power split maximises the sum rate? (c) Compare with TDMA (time-sharing between two single-user transmissions).

Solution

Boundary points

(a) For each $\alpha$ :

$\alpha$	$R_1$	$R_2$	$R_1 + R_2$
0	2.20	0	2.20
0.3	1.95	0.16	2.11
0.5	1.73	0.29	2.02
0.7	1.40	0.45	1.86
1	0	0.79	0.79

$R_1 = \frac{1}{2}\log_2(1 + (1-\alpha) \cdot 20/1)$ , $R_2 = \frac{1}{2}\log_2(1 + \alpha \cdot 20/((1-\alpha) \cdot 20 + 10))$ , both in bits per channel use.

Maximum sum rate

(b) The sum rate is maximised at $\alpha = 0$ (all power to the strong user): $R_{\mathrm{sum}} = \frac{1}{2}\log_2(21) \approx 2.20$ bits per channel use.

This is a general property: sum-rate maximisation in the degraded BC allocates all power to the strongest user. Fairness considerations require $\alpha > 0$ .

TDMA comparison

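(c) TDMA: fraction $\tau$ of the time to user 1, $(1-\tau)$ to user 2, with the full power $P$ used in each slot. $R_1^{\mathrm{TDMA}} = \frac{\tau}{2}\log_2(1 + P/N_1)$ , $R_2^{\mathrm{TDMA}} = \frac{1-\tau}{2}\log_2(1 + P/N_2)$ .

At $\tau = 0.5$ : $R_1 = 1.10$ , $R_2 = 0.40$ , sum $= 1.49$ . For the same $R_2 = 0.40$ , superposition coding (with $\alpha \approx 0.63$ ) gives $R_1 \approx 1.53$ .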

Superposition coding achieves a strictly larger region than TDMA for all non-trivial rate pairs (the BC capacity region strictly contains the TDMA region). $\blacksquare$
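The comparison can be checked numerically by fixing the TDMA rate of the far user and asking what the near user gets under each scheme. This sketch assumes fixed-power time-sharing ( $R_k = \tau_k C_k$ ) and the superposition rule in which the far user treats the near user's layer as noise; the helper name `cap` is hypothetical.

```python
import math

def cap(snr):
    """Real Gaussian channel capacity in bits per channel use."""
    return 0.5 * math.log2(1 + snr)

P, N1, N2 = 20.0, 1.0, 10.0  # values from the example
tau = 0.5                    # TDMA time fraction for user 1

# TDMA with full power P in each slot (assumption: no power boosting per slot).
r1_tdma = tau * cap(P / N1)
r2_tdma = (1 - tau) * cap(P / N2)

# Power split that gives user 2 the same rate under superposition coding,
# from R2 = 0.5 * log2((P + N2) / ((1 - alpha) * P + N2)).
alpha = 1 - ((P + N2) / 4**r2_tdma - N2) / P
r1_sc = cap((1 - alpha) * P / N1)
r2_sc = cap(alpha * P / ((1 - alpha) * P + N2))

print(f"TDMA:          R1={r1_tdma:.3f}  R2={r2_tdma:.3f}")
print(f"Superposition: R1={r1_sc:.3f}  R2={r2_sc:.3f}  (alpha={alpha:.3f})")
```

With user 2's rate held equal, the near user's rate under superposition coding is visibly higher than under time-sharing, which is the strict containment stated above.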

Quick Check

In dirty-paper coding for the channel $Y = X + S + Z$ where $S$ is known at the transmitter, what is the capacity?

$\frac{1}{2}\log_2(1 + P/(N + \sigma_S^2))$ because $S$ acts as additional noise

$\frac{1}{2}\log_2(1 + (P + \sigma_S^2)/N)$ because the transmitter can use $S$ as extra power

$\frac{1}{2}\log_2(1 + P/N)$ because the known interference can be completely pre-cancelled

$\frac{1}{2}\log_2(1 + P/N) - \frac{1}{2}\log_2(1 + \sigma_S^2/N)$ due to partial pre-cancellation

Correct answer: $\frac{1}{2}\log_2(1 + P/N)$ because the known interference can be completely pre-cancelled

Costa's dirty-paper coding theorem (1983) shows that the capacity is $\frac{1}{2}\log_2(1 + P/N)$ , exactly as if the interference $S$ were not present. The transmitter can completely pre-cancel the known interference without using any additional power, by encoding against it using structured codes (lattice codes or random binning). This result is the foundation of precoding in MIMO broadcast channels.

Common Mistake: Wrong Superposition Decoding Order

Mistake:

Having the weak user decode the strong user's message first and subtract it, then decode its own — i.e., reversing the SIC order in the BC.

Correction:

In the degraded BC, the strong user (lower noise) decodes the weak user's cloud-centre message first (which it can do because its channel is better), subtracts it, and decodes its own. The weak user decodes only its own message, treating the strong user's signal as noise. Reversing this order fails because the weak user cannot reliably decode the strong user's message (it sees it through a noisier channel). This asymmetry is fundamental to superposition coding.

Historical Note: Costa's Dirty-Paper Coding Theorem

1983

Max Costa's 1983 paper "Writing on Dirty Paper" proved one of the most surprising results in information theory: known interference at the transmitter can be completely pre-cancelled at no cost in rate. The result was initially met with scepticism — it seemed too good to be true that arbitrarily strong interference could be pre-subtracted without using any additional power. The proof uses random binning and Gel'fand-Pinsker coding. Costa's theorem lay dormant for nearly two decades until Caire and Shamai (2003) and Weingarten, Steinberg, and Shamai (2006) showed that DPC achieves the capacity region of the MIMO broadcast channel, transforming it from a theoretical curiosity into the foundation of modern multi-user MIMO precoding.

Why This Matters: From DPC Theory to Practical MIMO Precoding

The MIMO BC capacity region (achieved by DPC) provides the theoretical benchmark for all practical downlink precoding schemes. The MIMO book develops the practical side: zero-forcing and regularised ZF precoding that approach DPC performance in the massive MIMO regime ( $n_t \gg K$ ), JSDM (Adhikary/Nam/Ahn/Caire) for FDD massive MIMO that exploits channel covariance structure, and cell-free massive MIMO where distributed precoding must account for fronthaul constraints. The gap between DPC and linear precoding shrinks with more antennas, a key insight that makes massive MIMO practical.

Superposition Coding

A coding strategy for the broadcast channel where the transmitter encodes multiple messages at different power levels (layers). Stronger receivers decode and strip weaker users' messages first, then decode their own. Achieves the capacity of the degraded BC.

Related: Dirty-Paper Coding (DPC)

Dirty-Paper Coding (DPC)

A coding technique for channels with non-causally known interference at the transmitter. Costa's theorem shows that known interference can be pre-cancelled without rate loss. DPC achieves the capacity region of the MIMO broadcast channel.

MAC-BC Duality

The capacity region of the MIMO BC with sum power $P$ equals the union of MAC capacity regions over all power allocations summing to $P$ . This transforms the non-convex BC optimisation into a convex problem.

Related: Dirty-Paper Coding (DPC)
