Source Coding with a Helper

When Side Information is Rate-Limited

In the Wyner-Ziv problem (Chapter 6), the decoder has full access to side information $Y^n$. In the Slepian-Wolf problem (Chapter 7), both sources are encoded at positive rates. But what about the intermediate case: one encoder compresses $X$, while a helper who observes correlated $Y$ can send a rate-limited description to the decoder?

This is the source coding with a helper problem. The helper cannot send $Y^n$ in full, since it has a rate budget $R_Y$, but even a partial description of $Y$ can help the decoder reconstruct $X$. The natural question is: what rate pair $(R_X, R_Y)$ is needed to achieve distortion $D$?

This problem connects to Wyner's common information, a measure of the shared structure between $X$ and $Y$ that is more nuanced than mutual information.

[Interactive: Distributed Video Coding Pipeline. A step-by-step walkthrough of the pipeline: low-complexity encoder (DCT + quantization + syndrome computation), side information generation at the decoder, and belief propagation decoding with side information.]

Definition: Source Coding with a Helper

Let $(X, Y) \sim P_{XY}$ with distortion measure $d : \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$.

  • Main encoder: observes $X^n$, sends index $f_X(X^n) \in [1 : 2^{nR_X}]$
  • Helper encoder: observes $Y^n$, sends index $f_Y(Y^n) \in [1 : 2^{nR_Y}]$
  • Decoder: produces $\hat{X}^n = g(f_X(X^n), f_Y(Y^n))$

The rate-distortion region is the closure of the set of triples $(R_X, R_Y, D)$ for which $\mathbb{E}[d(X^n, \hat{X}^n)] \leq D + \epsilon$ for all $\epsilon > 0$ and sufficiently large $n$.

When $R_Y = 0$, this reduces to standard rate-distortion. When $R_Y \geq H(Y)$, the decoder has full side information and we recover the Wyner-Ziv problem. The interesting regime is $0 < R_Y < H(Y)$.
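To make the endpoints concrete, here is a minimal numeric sketch (using the doubly symmetric binary source from the example later in this section, with crossover $p = 0.1$; the parameter values are illustrative only):

```python
import numpy as np

def h2(q):
    """Binary entropy in bits, with the convention h2(0) = h2(1) = 0."""
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

p = 0.1              # Y = X xor Z with Z ~ Bernoulli(p)
H_X = 1.0            # X ~ Bernoulli(1/2), so H(X) = 1 bit
H_Y = 1.0            # Y is also uniform, so H(Y) = 1 bit
H_X_given_Y = h2(p)  # given Y, X differs with probability p

print(f"R_Y = 0      =>  R_X = H(X)   = {H_X:.3f} bits/symbol")
print(f"R_Y >= H(Y)  =>  R_X = H(X|Y) = {H_X_given_Y:.3f} bits/symbol")
print(f"interesting regime: 0 < R_Y < {H_Y:.3f}")
```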

Definition: Wyner's Common Information

The Wyner common information between $X$ and $Y$ is defined as:

$$C_W(X; Y) = \min_{W : X - W - Y} I(X, Y; W)$$

where the minimum is over all random variables $W$ such that $X$ and $Y$ are conditionally independent given $W$: $P_{XY|W}(x, y | w) = P_{X|W}(x|w) P_{Y|W}(y|w)$.

Intuitively, $W$ is a common cause that renders $X$ and $Y$ independent; among all such variables, $C_W$ charges the description rate $I(X, Y; W)$ of the cheapest one. (Minimizing $H(W)$ instead would only give an upper bound, since $I(X, Y; W) \leq H(W)$.)

Wyner's common information satisfies $I(X; Y) \leq C_W(X; Y) \leq \min(H(X), H(Y))$. It equals $\min(H(X), H(Y))$ when one variable is a deterministic function of the other. The lower bound is typically not tight: for jointly Gaussian $(X, Y)$ with correlation coefficient $\rho \in (0, 1)$, $C_W = \frac{1}{2}\log\frac{1+\rho}{1-\rho}$, which strictly exceeds $I(X; Y) = \frac{1}{2}\log\frac{1}{1-\rho^2}$. Unlike mutual information, common information captures the "shared randomness" that cannot be split: it measures the cost of generating $(X, Y)$ from independent components via a common random variable.
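As a numeric sanity check on this chain of inequalities, the following sketch compares $I(X; Y)$ with the value $I(X, Y; W)$ achieved for the doubly symmetric binary source by the symmetric construction $X = W \oplus N_1$, $Y = W \oplus N_2$, where $W \sim \text{Bernoulli}(1/2)$ and $N_1, N_2 \sim \text{Bernoulli}(a_0)$ are independent with $2a_0(1 - a_0) = p$ (the construction known to achieve $C_W$ for this source):

```python
import numpy as np

def h2(q):
    """Binary entropy in bits."""
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

p = 0.1                             # DSBS crossover probability
I_XY = 1 - h2(p)                    # I(X;Y) = H(Y) - H(Y|X)

# Symmetric candidate: W ~ Bern(1/2), X = W ^ N1, Y = W ^ N2,
# with N1, N2 iid Bern(a0) chosen so X -> Y behaves like a BSC(p).
a0 = (1 - np.sqrt(1 - 2 * p)) / 2   # smaller root of 2a(1-a) = p
C_W = (1 + h2(p)) - 2 * h2(a0)      # I(X,Y;W) = H(X,Y) - H(X,Y|W)

print(f"I(X;Y) = {I_XY:.3f} bits")  # about 0.531
print(f"C_W    = {C_W:.3f} bits")   # about 0.873, strictly larger
```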

Wyner's common information

A measure of shared structure between two random variables, defined as the minimum rate needed to generate $(X, Y)$ from conditionally independent components. It is at least as large as mutual information and captures aspects of dependence that mutual information misses.

Related: Mutual Information, Common randomness, Source Coding with a Helper

Theorem: Rate Region for Source Coding with a Helper (Lossless Case)

For lossless reconstruction of $X$ ($D = 0$, with $d$ the Hamming distortion), the minimum rate for the main encoder as a function of the helper rate $R_Y$ is:

$$R_X(R_Y) = \min_{W : X - Y - W,\; I(Y; W) \leq R_Y} H(X \mid W)$$

In particular:

  • At $R_Y = 0$: $R_X = H(X)$ (no help)
  • At $R_Y \geq H(Y)$: $R_X = H(X|Y)$ (full side-information benefit, the Slepian-Wolf corner point)

The transition between these extremes depends on the joint distribution $P_{XY}$.
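Both endpoints fall out of the formula by plugging in extreme choices of the auxiliary variable:

$$W = \text{const}: \qquad I(Y; W) = 0, \qquad H(X \mid W) = H(X),$$

$$W = Y: \qquad I(Y; W) = H(Y), \qquad H(X \mid W) = H(X \mid Y).$$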

The helper sends a compressed version $W$ of $Y$ at rate $R_Y$. The decoder uses $W$ as side information to decode $X$, needing rate $H(X|W)$. The question is: which $W$ minimizes $H(X|W)$ subject to the rate constraint $I(Y; W) \leq R_Y$?

Intuitively, the helper should focus on sending the parts of $Y$ that are most informative about $X$. This is the same flavor of optimization that underlies Wyner's common information: the helper tries to convey the component of $Y$ that is "common" with $X$.


Example: Helper for the Doubly Symmetric Binary Source

Let $X \sim \text{Bernoulli}(1/2)$ and $Y = X \oplus Z$ where $Z \sim \text{Bernoulli}(p)$, $p = 0.1$. The helper observes $Y$ and can send at rate $R_Y$. What is the minimum rate $R_X$ for lossless reconstruction of $X$ as a function of $R_Y$?
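Here is a sketch of the answer, assuming the known parametric solution for this source: by Mrs. Gerber's Lemma, an optimal auxiliary is $W = Y \oplus V$ with $V \sim \text{Bernoulli}(\beta)$, giving $R_Y = 1 - \mathcal{H}_2(\beta)$ and $R_X = \mathcal{H}_2(p * \beta)$, where $p * \beta = p(1-\beta) + \beta(1-p)$ is binary convolution. Sweeping $\beta$ from $1/2$ down to $0$ traces the whole trade-off:

```python
import numpy as np

def h2(q):
    """Binary entropy in bits."""
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

def bconv(a, b):
    """Binary convolution a * b = a(1-b) + b(1-a)."""
    return a * (1 - b) + b * (1 - a)

p = 0.1
for beta in [0.5, 0.3, 0.2, 0.1, 0.05, 0.0]:
    R_Y = 1 - h2(beta)         # helper rate I(Y;W) for W = Y xor V
    R_X = h2(bconv(p, beta))   # main rate H(X|W)
    print(f"beta = {beta:.2f}:  R_Y = {R_Y:.3f},  R_X = {R_X:.3f}")
```

At $\beta = 1/2$ the helper is useless ($R_Y = 0$, $R_X = 1$); at $\beta = 0$ it describes $Y$ losslessly ($R_Y = 1$, $R_X = \mathcal{H}_2(0.1) \approx 0.469$), matching the two endpoints of the theorem.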

Historical Note: Two Notions of Common Information

1975

There are actually two distinct notions of "common information" in information theory, both introduced in the 1970s. Wyner's common information (1975, discussed here) is the minimum rate to generate the joint distribution from conditionally independent components. Gács and Körner's common information (1973) is the maximum rate of "common randomness" that can be extracted from the joint distribution.

These two quantities can be very different: for the DSBS, Wyner's common information is $1 + \mathcal{H}_2(p) - 2\mathcal{H}_2(a_0)$ with $a_0 = \frac{1}{2}(1 - \sqrt{1 - 2p})$ (close to 1 for small $p$), while the Gács-Körner common information is 0 (there is no nontrivial common part). The gap between them reflects different operational settings: generation vs. extraction of shared randomness.

Quick Check

In source coding with a helper, what happens when the helper's rate $R_Y$ exceeds $H(Y)$?

Further increasing $R_Y$ continues to reduce $R_X$

The minimum $R_X$ saturates at $H(X|Y)$

The system becomes equivalent to joint encoding

Common Mistake: Confusing Common Information with Mutual Information

Mistake:

Treating Wyner's common information $C_W(X; Y)$ as if it equals the mutual information $I(X; Y)$. While $C_W \geq I$ always holds, the two can differ substantially, for discrete and Gaussian sources alike.

Correction:

Wyner's common information measures the minimum rate of a common variable that renders $X$ and $Y$ conditionally independent, while mutual information measures the total statistical dependence. The two coincide only in degenerate cases (e.g., independent variables, or one variable a deterministic function of the other). Even for jointly Gaussian variables the gap is strict, and for discrete sources such as the DSBS $C_W$ can likewise be much larger than $I$.
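A quick numeric comparison, assuming the known closed forms for the bivariate Gaussian with correlation coefficient $\rho$: $I = \frac{1}{2}\log_2\frac{1}{1-\rho^2}$ and $C_W = \frac{1}{2}\log_2\frac{1+\rho}{1-\rho}$:

```python
import numpy as np

for rho in [0.1, 0.5, 0.9, 0.99]:
    I  = 0.5 * np.log2(1 / (1 - rho**2))       # mutual information (bits)
    Cw = 0.5 * np.log2((1 + rho) / (1 - rho))  # Wyner common information
    print(f"rho = {rho:.2f}:  I = {I:.3f}  C_W = {Cw:.3f}  gap = {Cw - I:.3f}")
```

The gap works out to $\log_2(1 + \rho)$, approaching a full bit as $\rho \to 1$.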

Key Takeaway

Source coding with a helper bridges the gap between standard compression ($R_Y = 0$) and compression with full side information ($R_Y \geq H(Y)$, where $R_X$ saturates at the Slepian-Wolf bound $H(X|Y)$). The optimal helper strategy conveys the component of $Y$ that is "common" with $X$, an idea formalized by Wyner's common information. The rate savings from even a small helper rate can be substantial when the sources are strongly correlated.