Channels with Causal State Information at the Encoder
Why Channels with State?
In many communication scenarios, the channel is influenced by a random parameter, the state, that is partially or fully known to one or both parties:
- Fading: the channel gain is the state, known at the receiver (from pilot estimation) and sometimes at the transmitter (via feedback).
- Interference: in a multiuser system, the signal intended for other users acts as state known at the transmitter (who generated it).
- Memory and storage: writing to a storage medium with defects, where the locations of the defects are known to the writer (the encoder).
The central question is: how does knowing the state help, and how much does it help? The answer depends critically on whether the state is known causally (up to the current time) or non-causally (the entire state sequence is known in advance), and whether it is known at the encoder, decoder, or both.
Definition: Channel with State
A discrete memoryless channel with state is defined by:
- Input alphabet $\mathcal{X}$, output alphabet $\mathcal{Y}$, state alphabet $\mathcal{S}$,
- State distribution $P_S$ (i.i.d. across time),
- Transition law $P_{Y|X,S}$.
The channel is memoryless: $P(y^n \mid x^n, s^n) = \prod_{i=1}^{n} P_{Y|X,S}(y_i \mid x_i, s_i)$.
The state sequence $S^n = (S_1, \ldots, S_n)$ is drawn i.i.d. $\sim P_S$, independent of the message. The encoder may have access to the state sequence in one of three modes:
- No state information: encoder sees only the message.
- Causal: at time $i$, the encoder knows $S^i = (S_1, \ldots, S_i)$.
- Non-causal (acausal): the encoder knows the entire $S^n$ before transmission begins.
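The definition above can be made concrete with a small simulation. The following is a minimal sketch, not part of the original text: the function name `simulate_state_channel`, the table layout `p_y_given_xs[x, s, y]`, and the toy channel $Y = X \oplus S$ are all illustrative assumptions. The encoder here uses only the current state, which is the simplest form of causal operation.

```python
import numpy as np

def simulate_state_channel(x_of_s, p_s, p_y_given_xs, n, rng):
    """Simulate n uses of a discrete memoryless channel with i.i.d. state.

    x_of_s        : array of length |S|; input sent when the current state is s
    p_s           : array of shape (|S|,); state distribution P_S
    p_y_given_xs  : array of shape (|X|, |S|, |Y|); transition law P(y | x, s)
    """
    num_y = p_y_given_xs.shape[2]
    s = rng.choice(len(p_s), size=n, p=p_s)   # i.i.d. state sequence
    x = x_of_s[s]                             # x_i depends on s_i only (causal)
    # Memoryless: each output is drawn from P(. | x_i, s_i) independently.
    y = np.array([rng.choice(num_y, p=p_y_given_xs[xi, si])
                  for xi, si in zip(x, s)])
    return s, x, y

# Hypothetical toy channel: binary, noiseless, Y = X XOR S, S ~ Bernoulli(1/2).
p_s = np.array([0.5, 0.5])
p_y_given_xs = np.zeros((2, 2, 2))
for x_val in range(2):
    for s_val in range(2):
        p_y_given_xs[x_val, s_val, x_val ^ s_val] = 1.0

rng = np.random.default_rng(0)
# Strategy "send x = s": the state is cancelled and the output is always 0.
s, x, y = simulate_state_channel(np.array([0, 1]), p_s, p_y_given_xs, 50, rng)
print(y[:10])
```

With no state information the encoder would have to use the same input for every state, and on this toy channel the output would then be a fair coin regardless of the input.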
Channel state information (CSI)
Knowledge of the random state that affects the channel transition law. Causal CSI means knowing $S^i = (S_1, \ldots, S_i)$ at time $i$; non-causal CSI means knowing the entire sequence $S^n$ before transmission.
Related: Shannon strategy, Dirty paper coding (DPC)
Theorem: Capacity with Causal State Information at the Encoder
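The capacity of a DMC with state, where the state is known causally at the encoder and not at the decoder, is
$$C = \max_{P_T} I(T; Y),$$
where the maximization is over all distributions $P_T$ on Shannon strategies $t: \mathcal{S} \to \mathcal{X}$, and the mutual information is computed with $T \sim P_T$ independent of $S$ and $Y$ generated by $X = T(S)$.
Equivalently, defining an auxiliary variable $U$ with $X = f(U, S)$:
$$C = \max_{P_U,\, f} I(U; Y),$$
where $U$ is independent of $S$ and $f: \mathcal{U} \times \mathcal{S} \to \mathcal{X}$ is a deterministic function.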
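For small alphabets the theorem can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the text's own method: it enumerates every Shannon strategy $t: \mathcal{S} \to \mathcal{X}$, builds the equivalent state-free channel $P(y \mid t) = \sum_s P_S(s) P_{Y|X,S}(y \mid t(s), s)$, and maximizes $I(T;Y)$ with the Blahut-Arimoto iteration. The function names, the iteration count, and the toy channel $Y = X \oplus S$ are hypothetical choices.

```python
import itertools
import numpy as np

def strategy_channel(p_s, p_y_given_xs):
    """Equivalent DMC P(y | t) whose inputs are Shannon strategies t: S -> X."""
    num_x, num_s, num_y = p_y_given_xs.shape
    strategies = list(itertools.product(range(num_x), repeat=num_s))
    w = np.zeros((len(strategies), num_y))
    for k, t in enumerate(strategies):
        for s in range(num_s):
            w[k] += p_s[s] * p_y_given_xs[t[s], s]  # average over the state
    return strategies, w

def _divergences(w, q):
    """D(w[t] || q) in bits for each row t, using the convention 0 log 0 = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(w > 0, w * np.log2(w / q), 0.0)
    return terms.sum(axis=1)

def blahut_arimoto(w, iters=500):
    """Capacity in bits of the DMC with rows w[t] = P(. | t)."""
    p = np.full(w.shape[0], 1.0 / w.shape[0])
    for _ in range(iters):
        d = _divergences(w, p @ w)
        p = p * np.exp2(d)       # reweight inputs by their divergence
        p /= p.sum()
    return float(p @ _divergences(w, p @ w)), p

# Hypothetical toy channel: Y = X XOR S, noiseless, S ~ Bernoulli(1/2).
p_s = np.array([0.5, 0.5])
p_y_given_xs = np.zeros((2, 2, 2))
for x_val in range(2):
    for s_val in range(2):
        p_y_given_xs[x_val, s_val, x_val ^ s_val] = 1.0

strategies, w = strategy_channel(p_s, p_y_given_xs)
capacity_csi, p_t = blahut_arimoto(w)

# Without state information only constant strategies t(s) = x are available.
constant = [k for k, t in enumerate(strategies) if len(set(t)) == 1]
capacity_no_csi, _ = blahut_arimoto(w[constant])

print(capacity_csi, capacity_no_csi)
```

On this toy channel the two strategies "send $s$" and "send $1 \oplus s$" make the output deterministic, so causal state knowledge lifts the capacity from 0 to 1 bit per use. The enumeration grows as $|\mathcal{X}|^{|\mathcal{S}|}$, so this brute-force approach is only practical for small state alphabets.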
The encoder uses a Shannon strategy: it chooses its input as a function of the message and the state observed so far. The key insight is that causal state information allows the encoder to adapt its strategy to the state, but it cannot "pre-cancel" future interference because it does not know future states. The auxiliary variable $U$ captures the encoder's "intention" (the action it would take before seeing the state), while $X$ is the actual transmission adapted to the current state.
Achievability
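Generate a codebook of $2^{nR}$ codewords $u^n(m)$, $m \in \{1, \ldots, 2^{nR}\}$, with entries i.i.d. $\sim P_U$. To send message $m$, at time $i$ the encoder sets $x_i = f(u_i(m), s_i)$. The decoder uses joint typicality with the output $y^n$ to find $\hat{m}$.
By the packing lemma, the error probability vanishes if $R < I(U; Y)$.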
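The encoding step of the achievability argument can be sketched in a few lines. This is a minimal illustration, assuming a finite alphabet for the auxiliary symbol and a lookup table `f[u, s]` for the strategy; the names `make_codebook` and `encode_causal`, the block length, and the toy choice $f(u, s) = u \oplus s$ are hypothetical. Joint-typicality decoding is not implemented here.

```python
import numpy as np

def make_codebook(num_messages, n, p_u, rng):
    """Draw num_messages codewords u^n(m) with i.i.d. entries from P_U."""
    return rng.choice(len(p_u), size=(num_messages, n), p=p_u)

def encode_causal(codebook, f, m, states):
    """Emit x_i = f(u_i(m), s_i), using only the state revealed at time i."""
    x = []
    for i, s_i in enumerate(states):      # s_i becomes known at time i
        u_i = codebook[m, i]              # auxiliary codeword symbol
        x.append(f[u_i, s_i])             # adapt it to the current state
    return np.array(x)

rng = np.random.default_rng(1)
n, num_messages = 8, 4                    # toy sizes, far from the asymptotic regime
codebook = make_codebook(num_messages, n, np.array([0.5, 0.5]), rng)

# Hypothetical strategy table: f(u, s) = u XOR s, which cancels the state
# on the toy channel Y = X XOR S.
f = np.array([[0, 1],
              [1, 0]])

states = rng.choice(2, size=n, p=[0.5, 0.5])
m = 2
x = encode_causal(codebook, f, m, states)
y = x ^ states                            # noiseless toy channel output
print(np.array_equal(y, codebook[m]))
```

Because the encoder never looks past `states[i]`, the same loop could run online as the state is revealed one symbol at a time.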
Converse
By Fano's inequality and the standard converse argument, any achievable rate satisfies $R \le I(U; Y)$ for some distribution $P_U$ and strategy $f$.
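One standard way to fill in the converse is sketched below. The identification $U_i = (M, S^{i-1})$ and the symbol $\epsilon_n$ (a Fano term that vanishes as $n \to \infty$) are assumptions of this sketch rather than details given in the text above.

```latex
\begin{align*}
nR &\le I(M; Y^n) + n\epsilon_n
      && \text{(Fano's inequality)} \\
   &= \sum_{i=1}^{n} I(M; Y_i \mid Y^{i-1}) + n\epsilon_n
      && \text{(chain rule)} \\
   &\le \sum_{i=1}^{n} I(M, Y^{i-1}; Y_i) + n\epsilon_n \\
   &\le \sum_{i=1}^{n} I(M, S^{i-1}, Y^{i-1}; Y_i) + n\epsilon_n \\
   &= \sum_{i=1}^{n} I(M, S^{i-1}; Y_i) + n\epsilon_n
      && \text{(}Y^{i-1} \to (M, S^{i-1}) \to Y_i\text{)} \\
   &= \sum_{i=1}^{n} I(U_i; Y_i) + n\epsilon_n,
      && U_i = (M, S^{i-1}).
\end{align*}
```

Since the states are i.i.d. and independent of the message, $U_i$ is independent of $S_i$, and the causal encoder output $X_i$ is a function of $(U_i, S_i)$. Each term is therefore at most $\max_{P_U, f} I(U; Y)$, which gives $R \le \max_{P_U, f} I(U; Y) + \epsilon_n$.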
Shannon strategy
An encoding scheme for channels with causal state information where the encoder chooses $x_i = f(u_i, s_i)$, adapting the auxiliary codeword symbol to the current state. Named for Shannon's 1958 analysis of channels with side information.
Related: Channel state information (CSI)
Example: Additive State Channel with Causal CSI
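Consider $Y = X + S + Z$ where $Z \sim \mathcal{N}(0, N)$, $S \sim \mathcal{N}(0, Q)$ is independent of $Z$, and the input satisfies the power constraint $\mathbb{E}[X^2] \le P$. The state is known causally at the encoder. What rate can be achieved?
Without state information
If the encoder ignores the state, $S + Z$ acts as Gaussian noise of variance $Q + N$, so $C_0 = \frac{1}{2}\log\left(1 + \frac{P}{Q + N}\right)$.
With causal CSI
With causal state knowledge, the encoder can partially cancel the state by choosing $X = U - \alpha S$, where $U$ is Gaussian, independent of $S$, and carries the information. Then $Y = U + (1 - \alpha) S + Z$.
Optimizing over $\alpha$: the residual "noise" is $(1 - \alpha) S + Z$ with variance $(1 - \alpha)^2 Q + N$. The power constraint: $\mathbb{E}[X^2] = \mathbb{E}[U^2] + \alpha^2 Q \le P$, so $\mathbb{E}[U^2] = P - \alpha^2 Q$.
Rate achieved by this linear scheme, which lower-bounds the causal capacity: $R = \max_{\alpha \in [0, 1]} \frac{1}{2}\log\left(1 + \frac{P - \alpha^2 Q}{(1 - \alpha)^2 Q + N}\right)$.
For $Q > 0$ this is larger than $C_0$ but strictly less than $\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$ (the non-causal result).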
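The trade-off between cancellation and power can be checked numerically. The sketch below is an illustration, assuming the symbols $P$ (power constraint), $Q$ (state variance), and $N$ (noise variance) and the linear scheme $X = U - \alpha S$; the numerical values and the function name `linear_scheme_rate` are hypothetical. It performs a plain grid search over $\alpha$ and compares the result with the no-CSI rate and the non-causal (dirty paper) rate.

```python
import numpy as np

def linear_scheme_rate(alpha, P, Q, N):
    """Rate in bits/use of X = U - alpha*S with residual (1-alpha)S + Z as noise."""
    signal = np.maximum(P - alpha**2 * Q, 0.0)   # power left for U
    noise = (1.0 - alpha)**2 * Q + N             # residual state plus noise
    return 0.5 * np.log2(1.0 + signal / noise)

# Hypothetical parameters: equal signal, state, and noise power.
P, Q, N = 1.0, 1.0, 1.0

alphas = np.linspace(0.0, 1.0, 10001)
rates = linear_scheme_rate(alphas, P, Q, N)
best_alpha = float(alphas[np.argmax(rates)])
best_rate = float(rates.max())

rate_no_csi = float(linear_scheme_rate(0.0, P, Q, N))   # alpha = 0: state ignored
rate_noncausal = 0.5 * np.log2(1.0 + P / N)              # interference fully removed

print(f"no CSI: {rate_no_csi:.4f}  linear causal: {best_rate:.4f} "
      f"(alpha = {best_alpha:.3f})  non-causal: {rate_noncausal:.4f}")
```

For these values, setting the derivative of the signal-to-noise ratio to zero gives $\alpha^2 - 3\alpha + 1 = 0$, so the grid search should land near $\alpha = (3 - \sqrt{5})/2 \approx 0.382$, with a rate strictly between the no-CSI and non-causal values.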
Causal vs. Non-Causal: A Fundamental Gap
The example above reveals a key distinction: with causal state information, the encoder can only partially compensate for the interference because it must allocate power for both the information signal and the state cancellation. With non-causal state information (Section 12.2), something remarkable happens: the interference can be completely canceled at no cost in power. This is Costa's dirty paper coding result, arguably the most surprising theorem in multiuser information theory.
Quick Check
For the channel $Y = X + S + Z$ (power constraint $P$, state variance $Q > 0$, noise variance $N$) with $S$ known causally at the encoder, does causal state knowledge help compared to no state knowledge?
Yes, the capacity is strictly larger than $\frac{1}{2}\log\left(1 + \frac{P}{Q + N}\right)$
No, causal knowledge provides no benefit
Yes, and it achieves the same capacity as non-causal knowledge: $\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$
It depends on the relative values of $P$, $Q$, and $N$
Causal state knowledge always helps (or at least does not hurt). The encoder can partially cancel the state by subtracting a fraction $\alpha S$, trading off cancellation quality against power. However, the improvement is strictly less than that from non-causal knowledge, which allows perfect interference cancellation.