Rate-Distortion with Side Information (Wyner-Ziv)
Side Information at the Decoder
Suppose the decoder has access to side information $Y$ correlated with the source $X$. Intuitively, the decoder can use $Y$ to reduce the number of bits it needs from the encoder. The Wyner-Ziv theorem (1976) quantifies this precisely. The surprise is that for Gaussian sources with squared-error distortion, having side information only at the decoder is just as good as having it at both encoder and decoder. This "no loss from binning" result is remarkable and has profound implications for distributed compression and sensor networks.
Definition: Wyner-Ziv Rate-Distortion
The Wyner-Ziv rate-distortion function for source $X$ with decoder side information $Y$ and distortion measure $d(x, \hat{x})$ is:
$$R_{WZ}(D) = \min_{p(u|x),\, \hat{x}(u,y)\,:\; \mathbb{E}[d(X, \hat{x}(U, Y))] \le D} \big[ I(X; U) - I(Y; U) \big]$$
where $U$ is an auxiliary random variable forming the Markov chain $U - X - Y$, and $\hat{x}(u, y)$ is the decoder reconstruction function that uses both $U$ and $Y$.
Theorem: Wyner-Ziv Theorem
For a memoryless source-side information pair $(X, Y) \sim p(x, y)$ with distortion measure $d(x, \hat{x})$:
- The minimum achievable rate with side information at both encoder and decoder is $R_{X|Y}(D) = \min_{p(\hat{x}|x,y)\,:\; \mathbb{E}[d(X, \hat{X})] \le D} I(X; \hat{X} \mid Y)$.
- The minimum achievable rate with side information only at the decoder (Wyner-Ziv) is $R_{WZ}(D)$ as defined above.
- In general, $R_{WZ}(D) \ge R_{X|Y}(D)$: not having side information at the encoder can hurt.
- Gaussian case: For jointly Gaussian $(X, Y)$ and squared-error distortion, $R_{WZ}(D) = R_{X|Y}(D)$: side information at the encoder is worthless!
The Wyner-Ziv encoder uses binning: it groups codewords into bins and sends only the bin index. The decoder, knowing $Y^n$, can disambiguate which codeword within the bin was intended. The rate savings is exactly $I(Y; U)$, the information the side information provides about the auxiliary variable $U$.
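To make the binning mechanism concrete, here is a toy numerical sketch (not the random-coding construction used in the proof): a scalar uniform quantizer whose indices are grouped into cosets by a modulo operation. The step size, bin count, and noise levels are hypothetical choices, picked so that the decoder's side information reliably disambiguates within a coset.

```python
import numpy as np

rng = np.random.default_rng(0)
step = 0.5       # quantizer step size (hypothetical choice)
num_bins = 8     # number of bins; only log2(8) = 3 bits are transmitted

def encode(x):
    idx = int(np.round(x / step))   # full quantizer index (never sent)
    return idx % num_bins           # bin (coset) index actually sent

def decode(bin_idx, y):
    # Scan coset members near y and pick the one closest to the side info.
    base = int(np.round(y / step))
    candidates = [k for k in range(base - 2 * num_bins, base + 2 * num_bins + 1)
                  if k % num_bins == bin_idx]
    best = min(candidates, key=lambda k: abs(k * step - y))
    return best * step

# X ~ N(0,1); side information Y = X + N with small correlation noise,
# so same-coset codewords are far apart relative to |X - Y|.
x = rng.normal(0.0, 1.0, 10_000)
y = x + rng.normal(0.0, 0.1, 10_000)
x_hat = np.array([decode(encode(xi), yi) for xi, yi in zip(x, y)])
print("MSE:", np.mean((x - x_hat) ** 2))  # ~ step**2 / 12 when decoding succeeds
```

The coset spacing here is $8 \times 0.5 = 4$, far larger than the typical gap between $X$ and $Y$, which is why the decoder essentially never picks the wrong coset member even though it receives only 3 bits.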
The Gaussian result is surprising: even though the encoder does not know $Y$, it can structure its code so that the decoder's side information is just as useful as if the encoder had known it. This fails for general sources: for the binary case with Hamming distortion, $R_{WZ}(D) > R_{X|Y}(D)$ in general.
Achievability sketch
Generate a random codebook of $2^{nI(X;U)}$ $U$-sequences. Partition them into $2^{nR}$ bins, each of size $2^{n(I(X;U) - R)}$. The encoder finds a $U^n$-sequence jointly typical with $X^n$ and sends its bin index. The decoder, knowing $Y^n$, searches the bin for a $U^n$-sequence jointly typical with $Y^n$. If $R > I(X;U) - I(Y;U)$, the bin is small enough relative to the typical set that the decoder succeeds.
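A quick bookkeeping check of the exponents, with illustrative (assumed) values for the mutual informations:

```python
# Exponent bookkeeping for the binning argument (illustrative values only).
n = 1000                  # block length
I_XU, I_YU = 2.0, 1.2     # I(X;U) and I(Y;U) in bits per symbol
R = I_XU - I_YU + 0.1     # any rate above I(X;U) - I(Y;U) works

codebook_bits = n * I_XU                # codebook has ~2^{n I(X;U)} sequences
bin_size_bits = codebook_bits - n * R   # size of each of the 2^{nR} bins

# The decoder finds a unique jointly typical sequence w.h.p. when each
# bin holds fewer than ~2^{n I(Y;U)} candidates.
print(f"bin size ~ 2^{bin_size_bits:.0f} vs. threshold 2^{n * I_YU:.0f}")
print("decoding succeeds w.h.p.:", bin_size_bits < n * I_YU)
```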
Converse
The converse bounds $nR \ge H(W) \ge H(W \mid Y^n) \ge I(X^n; W \mid Y^n)$, where $W$ is the encoder output, then single-letterizes using the Markov structure.
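For reference, a condensed version of the standard chain; the identification of the auxiliary variable below is one common choice:
$$\begin{aligned}
nR &\ge H(W) \ge H(W \mid Y^n) \ge I(X^n; W \mid Y^n) = \sum_{i=1}^{n} I(X_i; W \mid Y^n, X^{i-1}) \\
&\ge \sum_{i=1}^{n} I(X_i; U_i \mid Y_i) = \sum_{i=1}^{n} \big[ I(X_i; U_i) - I(Y_i; U_i) \big], \qquad U_i := (W, Y^{i-1}, Y_{i+1}^n),
\end{aligned}$$
where the last equality uses the Markov chain $U_i - X_i - Y_i$, and convexity of $R_{WZ}(\cdot)$ completes the single-letterization.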
Gaussian optimality
For jointly Gaussian $(X, Y)$ with conditional variance $\sigma^2_{X|Y} = \mathrm{Var}(X \mid Y)$, where $0 < D \le \sigma^2_{X|Y}$: set $U = X + V$ where $V \sim \mathcal{N}(0, \sigma_V^2)$ is independent of everything else, with $\sigma_V^2 = \sigma^2_{X|Y} D / (\sigma^2_{X|Y} - D)$. The MMSE estimate $\hat{X} = \mathbb{E}[X \mid U, Y]$ achieves distortion exactly $D$ by Gaussian estimation theory. The rate is $I(X;U) - I(Y;U) = \frac{1}{2}\log\frac{\sigma^2_{X|Y}}{D}$, which matches $R_{X|Y}(D)$: the encoder's ignorance of $Y$ imposes no rate penalty at all.
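As a numerical sanity check of this construction (a sketch, assuming the test channel $U = X + V$ and the parameters $\sigma_X^2 = \sigma_N^2 = 1$, $D = 0.25$ used in the example below):

```python
import math

var_x, var_n, D = 1.0, 1.0, 0.25
S = var_x * var_n / (var_x + var_n)   # sigma^2_{X|Y} = 0.5

# Test-channel noise chosen so that E[(X - E[X|U,Y])^2] = D:
var_v = S * D / (S - D)

# Gaussian mutual informations computed from second moments (in bits):
var_u = var_x + var_v                     # Var(U) for U = X + V
var_y = var_x + var_n                     # Var(Y) for Y = X + N
I_XU = 0.5 * math.log2(var_u / var_v)     # I(X;U) = h(U) - h(U|X)
rho2 = var_x**2 / (var_u * var_y)         # squared correlation of (U, Y)
I_YU = -0.5 * math.log2(1 - rho2)         # I(Y;U)

print(f"I(X;U) - I(Y;U) = {I_XU - I_YU:.4f} bits")          # 0.5000
print(f"(1/2)log2(S/D)  = {0.5 * math.log2(S / D):.4f} bits")  # 0.5000
```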
Example: Wyner-Ziv for Gaussian Source and Side Information
$X \sim \mathcal{N}(0, 1)$ and $Y = X + N$ with $N \sim \mathcal{N}(0, 1)$ independent of $X$. Compute $R_{WZ}(D)$ for $D = 0.25$ and compare with $R(D)$ without side information.
Without side information
$R(D) = \frac{1}{2}\log_2\frac{\sigma_X^2}{D} = \frac{1}{2}\log_2\frac{1}{0.25} = 1$ bit/sample.
With side information (Wyner-Ziv)
For the Gaussian case, $R_{WZ}(D) = \frac{1}{2}\log_2\frac{\sigma^2_{X|Y}}{D}$. With side information: $\sigma^2_{X|Y} = \frac{\sigma_X^2 \sigma_N^2}{\sigma_X^2 + \sigma_N^2} = \frac{1 \cdot 1}{1 + 1} = 0.5$ (the MMSE of estimating $X$ from $Y$), so $R_{WZ}(0.25) = \frac{1}{2}\log_2\frac{0.5}{0.25} = 0.5$ bits/sample.
Savings
Side information reduces the rate from 1 bit to 0.5 bits per sample, a 50% savings. The decoder uses the noisy observation $Y$ to predict $X$ with residual variance $0.5$, and the encoder only needs to describe the remaining uncertainty.
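A small Monte Carlo check of the residual-variance claim (a sketch; for this jointly Gaussian pair the conditional mean is $\mathbb{E}[X \mid Y] = Y/2$):

```python
import numpy as np

# With Y = X + N and Var(X) = Var(N) = 1, the decoder's best predictor
# is E[X|Y] = Y/2, leaving residual variance sigma^2_{X|Y} = 0.5.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 1_000_000)
y = x + rng.normal(0.0, 1.0, 1_000_000)

residual_var = np.mean((x - y / 2.0) ** 2)
print(f"residual variance = {residual_var:.3f}")  # ~0.500
```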
Quick Check
For a binary source $X$ and decoder side information $Y$ (an observation of $X$ through a BSC), is $R_{WZ}(D) = R_{X|Y}(D)$ always?
Yes, side information at the encoder never helps for any source
No, there is generally a strict loss for non-Gaussian sources
It depends on the correlation between and
Yes, because binning is always optimal
The equality $R_{WZ}(D) = R_{X|Y}(D)$ is specific to jointly Gaussian sources with squared-error distortion. For binary sources with Hamming distortion, the gap can be significant.
Why This Matters: Wyner-Ziv in Distributed Sensor Networks
In a sensor network, multiple sensors observe correlated measurements and must compress them for transmission to a fusion center. Wyner-Ziv coding allows each sensor to compress its data without knowing what the other sensors have observed: the fusion center uses all received data as side information when decoding each sensor's message. This is the foundation of distributed source coding, covered in Chapter 7.
Key Takeaway
The Wyner-Ziv theorem characterizes lossy compression when the decoder has correlated side information. The rate is $R_{WZ}(D) = \min \big[ I(X;U) - I(Y;U) \big]$ over auxiliary variables $U$ satisfying $U - X - Y$, achieved by binning. For Gaussian sources with squared-error distortion, side information at the encoder is worthless ($R_{WZ}(D) = R_{X|Y}(D)$), a remarkable result that does not extend to general sources. This is the lossy analogue of Slepian-Wolf coding for lossless compression.