Channels with State Known at the Decoder

Side Information at the Decoder

What if the state is known at the decoder instead of (or in addition to) the encoder? This scenario arises naturally in fading channels where the receiver estimates the channel (CSIR) through pilot symbols.

The answer turns out to be much simpler than the encoder-side case: the decoder simply uses the state as additional observations, and the capacity is $\max_{P_X} I(X; Y, S)$. There is no need for binning or auxiliary variables; the state directly improves the decoder's ability to distinguish codewords.

The asymmetry between encoder-side and decoder-side state information is one of the recurring themes of multiuser information theory.

Theorem: Capacity with State Known at the Decoder

The capacity of a DMC with state $(P_{Y|X,S}, P_S)$ where the state sequence $S^n$ is known at the decoder (but not at the encoder) is

$$C = \max_{P_X} I(X; Y, S) = \max_{P_X} I(X; Y \mid S),$$

where the maximum is over $P_X$ with $X \perp S$ (the encoder cannot correlate its input with a state it does not know). The second equality follows from the chain rule: $I(X; Y, S) = I(X; S) + I(X; Y \mid S)$, and $I(X; S) = 0$ by independence.

The decoder knows $(Y^n, S^n)$ and uses both for decoding. Since the encoder does not know $S$, the best it can do is choose $P_X$ to maximize the mutual information as if $S$ were part of the channel output. The equality $I(X; Y, S) = I(X; Y \mid S)$ holds because $X \perp S$.
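As a numerical sanity check, the identity $I(X; Y, S) = I(X; Y \mid S)$ and the capacity maximization can be verified on a small example. The channel below (a BSC whose crossover probability is selected by the state) and all parameter values are illustrative assumptions, not from the text:

```python
import numpy as np

def mutual_info(pab):
    """I(A;B) in bits for a joint pmf matrix over (A, B)."""
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    m = pab > 0
    return float((pab[m] * np.log2(pab[m] / (pa @ pb)[m])).sum())

# Illustrative channel: state selects the BSC crossover probability.
EPS = (0.1, 0.4)   # BSC(0.1) when S = 0, BSC(0.4) when S = 1 (assumed)
Q1 = 0.3           # P(S = 1); S is independent of X

def joint_x_ys(p):
    """Joint pmf over X and the decoder's observation (Y, S), P(X=1) = p."""
    j = np.zeros((2, 4))
    for x, s, y in np.ndindex(2, 2, 2):
        flip = EPS[s]
        p_y = 1 - flip if y == x else flip
        j[x, 2 * s + y] = (p if x else 1 - p) * (Q1 if s else 1 - Q1) * p_y
    return j

# Capacity: grid search over P(X=1) of I(X; Y, S).
grid = np.linspace(0.01, 0.99, 99)
rates = [mutual_info(joint_x_ys(p)) for p in grid]
best = int(np.argmax(rates))
print(f"capacity ~ {rates[best]:.4f} bits at P(X=1) ~ {grid[best]:.2f}")

# I(X; Y, S) equals I(X; Y | S) = sum_s P(s) I(X; Y | S=s) since X ⟂ S.
j = joint_x_ys(grid[best])
cond = sum(j[:, 2*s:2*s+2].sum()
           * mutual_info(j[:, 2*s:2*s+2] / j[:, 2*s:2*s+2].sum())
           for s in (0, 1))
assert abs(cond - rates[best]) < 1e-9
```

By the symmetry of the two BSCs, the maximizing input is uniform, and the capacity is the state-average $\sum_s P(s)\,(1 - h(\epsilon_s))$ of the two BSC capacities.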

Example: Gaussian Channel with State at the Decoder

For $Y = X + S + Z$ with $S \sim \mathcal{N}(0, Q)$, $Z \sim \mathcal{N}(0, N)$, power constraint $P$ on $X$, and state known at the decoder only, compute the capacity.
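A worked sketch of the computation: since the decoder observes $S^n$, it can subtract it from $Y^n$ before decoding, reducing the channel to pure AWGN:

$$C = \max_{P_X} I(X; Y \mid S) = \max_{P_X} I(X;\, X + Z) = \frac{1}{2}\log\left(1 + \frac{P}{N}\right).$$

The interference $S$ is removed entirely; its power $Q$ does not appear in the answer.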

Capacity with Different State Information Configurations

| Configuration | Capacity (Gaussian) | Comment |
| --- | --- | --- |
| No state info | $\frac{1}{2}\log\left(1 + \frac{P}{Q+N}\right)$ | Interference treated as noise |
| Causal at encoder | $\max_\alpha \frac{1}{2}\log\left(1 + \frac{P - \alpha^2 Q}{(1-\alpha)^2 Q + N}\right)$ | Partial cancellation (achievable rate, not the exact causal capacity) |
| Non-causal at encoder (DPC) | $\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$ | Costa: interference eliminated |
| At decoder only (CSIR) | $\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$ | State subtracted at decoder |
| At both (full CSI) | $\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$ | No gain beyond CSIR: once the decoder can subtract $S$, encoder knowledge adds nothing |
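The table can be reproduced numerically. A sketch with illustrative values $P = 10$, $Q = 10$, $N = 1$ (assumed, not from the text), computing the distinct rows:

```python
import numpy as np

# Illustrative values (assumed, not from the text).
P, Q, N = 10.0, 10.0, 1.0

def c(snr):
    """AWGN capacity in bits: (1/2) log2(1 + SNR)."""
    return 0.5 * np.log2(1 + snr)

no_csi = c(P / (Q + N))                        # interference treated as noise
# Causal partial cancellation: X = U - alpha*S, keeping P - alpha^2 Q >= 0.
alphas = np.linspace(0, np.sqrt(P / Q), 1000)
causal = max(c((P - a**2 * Q) / ((1 - a)**2 * Q + N)) for a in alphas)
dpc = c(P / N)                                  # Costa, non-causal at encoder
csir = c(P / N)                                 # decoder subtracts S exactly

print(f"no CSI : {no_csi:.3f} bits")
print(f"causal : {causal:.3f} bits (partial-cancellation lower bound)")
print(f"DPC    : {dpc:.3f} bits")
print(f"CSIR   : {csir:.3f} bits")
assert no_csi <= causal <= dpc and dpc == csir
```

Note that $\alpha = 0$ recovers the no-CSI rate, so the causal scheme can never do worse than treating the interference as noise.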

The Asymmetry of Side Information

The comparison table reveals a subtle asymmetry:

  - Decoder state is always helpful (it can only improve decoding, since the decoder is free to ignore it).
  - Encoder state (causal) helps partially but not completely.
  - Encoder state (non-causal) achieves the same as decoder state for the Gaussian channel, but via a completely different mechanism (binning vs. direct observation).
  - Full CSI (state at both) gives no further gain here: for the additive Gaussian channel, once the decoder can subtract $S$, knowledge of $S$ at the encoder adds nothing.

The Gaussian case is special because DPC and CSIR give the same result. For discrete channels, the Gel'fand-Pinsker capacity can exceed the CSIR capacity: the encoder can exploit the state in ways that simply observing it at the decoder cannot.

Capacity Under Different State Information Configurations

Compare the channel capacity for the Gaussian state channel $Y = X + S + Z$ under different CSI configurations. Adjust the interference power $Q$ to see how the gap between configurations changes.


Common Mistake: Expecting a Coherent Gain from Full CSI

Mistake:

Assuming that when the state is known at both encoder and decoder, the encoder can "coherently add" its signal to $S$ and achieve $\frac{1}{2}\log(1 + (\sqrt{P} + \sqrt{Q})^2/N)$.

Correction:

When the decoder knows $S$, it subtracts it exactly, reducing the channel to $Y - S = X + Z$. The capacity is therefore $\frac{1}{2}\log(1 + P/N)$, the same as with CSIR alone; encoder knowledge of $S$ adds nothing for this additive model. Any transmit power spent aligning $X$ with $S$ is wasted, because the component of $X$ correlated with $S$ tells the decoder nothing it does not already know. Coherent-combining expressions like $(\sqrt{P} + \sqrt{Q})^2$ do arise, but in different problems (e.g., state amplification, where the receiver's goal is to estimate $S$ itself rather than decode a message).
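A quick numerical sketch of why alignment cannot help when the decoder knows $S$. The split $X = U + \beta S$ and the parameter values are assumptions made for this demonstration:

```python
import numpy as np

# Illustrative values (assumed): signal, interference, and noise powers.
P, Q, N = 5.0, 100.0, 2.0

def rate_with_alignment(beta):
    """Rate when the encoder sends X = U + beta*S and the decoder knows S.

    Power beta^2 * Q goes into the component aligned with S; the decoder,
    knowing S, subtracts (1 + beta)*S from Y and is left with U + Z, so
    only the residual power P - beta^2 * Q carries information.
    """
    pu = P - beta**2 * Q
    return 0.5 * np.log2(1 + pu / N) if pu >= 0 else 0.0

betas = np.linspace(0, np.sqrt(P / Q), 200)
rates = [rate_with_alignment(b) for b in betas]
assert int(np.argmax(rates)) == 0   # beta = 0 (no alignment) is optimal
print(f"best rate {max(rates):.3f} bits at beta = {betas[int(np.argmax(rates))]:.2f}")
```

The rate is strictly decreasing in $|\beta|$: every unit of power diverted into the $S$-direction is simply lost.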

Quick Check

For $Y = X + S + Z$ with $S$ known only at the decoder, $P = 5$, $Q = 100$, $N = 2$, the capacity is:

  - $\frac{1}{2}\log(1 + 5/102) \approx 0.034$ bits
  - $\frac{1}{2}\log(1 + 5/2) = \frac{1}{2}\log(3.5) \approx 0.90$ bits
  - $\frac{1}{2}\log(1 + 105/2) \approx 2.87$ bits
  - Depends on $Q$

(Answer: the second option. The decoder subtracts $S$, leaving an AWGN channel with SNR $P/N = 5/2$, so $Q$ is irrelevant.)
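The three numerical options can be checked directly with the quiz's values:

```python
import math

P, Q, N = 5.0, 100.0, 2.0   # values from the quick check

treat_as_noise = 0.5 * math.log2(1 + P / (Q + N))  # ignores the decoder's CSIR
csir = 0.5 * math.log2(1 + P / N)                  # decoder subtracts S (correct)
bogus = 0.5 * math.log2(1 + (P + Q) / N)           # wrongly counts Q as signal

print(f"{treat_as_noise:.3f}  {csir:.3f}  {bogus:.3f} bits")
```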