Slepian-Wolf Coding

Why Distributed Source Coding?

Consider two sensors measuring correlated quantities, say temperature at two nearby locations. Each sensor must compress its data independently (they cannot communicate with each other), but a central base station receives both compressed streams and decodes them jointly. The fundamental question is: does the inability to coordinate encoding cost us in compression efficiency?

The remarkable answer, due to Slepian and Wolf (1973), is no, at least in terms of the achievable rate region. Separate encoding with joint decoding achieves exactly the same rates as if the two encoders could cooperate. This is one of the most surprising results in information theory, and the proof technique, random binning, is as fundamental to multiuser source coding as random coding is to channel coding.

Slepian-Wolf Rate Region Animation

Watch the Slepian-Wolf rate region evolve as the correlation parameter $p$ changes from near-perfect correlation ($p \approx 0$) to independence ($p = 0.5$). The shaded pentagon shows the achievable rate pairs, and the red dashed line is the sum-rate bound $R_X + R_Y \geq H(X,Y)$.

Definition:

Distributed Source Coding Setup

Let $(X^n, Y^n)$ be $n$ i.i.d. copies of a pair of correlated discrete random variables $(X, Y) \sim P_{XY}$ with finite alphabets $\mathcal{X}$ and $\mathcal{Y}$.

Encoder 1 observes $X^n$ and produces an index $f_1(X^n) \in [1 : 2^{nR_X}]$.

Encoder 2 observes $Y^n$ and produces an index $f_2(Y^n) \in [1 : 2^{nR_Y}]$.

The joint decoder receives both indices and produces estimates $(\hat{X}^n, \hat{Y}^n) = g(f_1(X^n), f_2(Y^n))$.

A rate pair $(R_X, R_Y)$ is achievable if for every $\epsilon > 0$ and all sufficiently large $n$, there exist encoders and a decoder such that $\Pr\{(\hat{X}^n, \hat{Y}^n) \neq (X^n, Y^n)\} \leq \epsilon$.

The Slepian-Wolf rate region $\mathcal{R}_{\text{SW}}$ is the closure of the set of all achievable rate pairs.

Random binning

A proof technique where each source sequence is randomly assigned to one of $2^{nR}$ bins. The decoder uses joint typicality to identify the correct sequences within their respective bins. Random binning is the source coding analogue of random coding for channel coding.

Related: Random coding, Slepian-Wolf coding, Joint typicality decoding

Slepian-Wolf rate region

The set of rate pairs $(R_X, R_Y)$ for which lossless distributed source coding is achievable: $R_X \geq H(X|Y)$, $R_Y \geq H(Y|X)$, $R_X + R_Y \geq H(X,Y)$.

Related: Distributed Source Coding Setup, Random binning

Theorem: Slepian-Wolf Theorem

The Slepian-Wolf rate region for lossless distributed compression of $(X, Y) \sim P_{XY}$ is the set of rate pairs $(R_X, R_Y)$ satisfying:

$$R_X \geq H(X|Y), \qquad R_Y \geq H(Y|X), \qquad R_X + R_Y \geq H(X,Y).$$

The point is that separate encoding achieves the same rate region as joint encoding; the only requirement is that decoding be performed jointly.

Think of it this way: if Encoder 2 sends $Y^n$ at rate $H(Y)$, then the decoder knows $Y^n$ perfectly, and Encoder 1 only needs to send $X^n$ at rate $H(X|Y)$. The surprise is that Encoder 1 can achieve rate $H(X|Y)$ even though it does not know $Y^n$: it just needs to send enough information for the decoder (who does know $Y^n$) to resolve the ambiguity. Random binning achieves this: roughly $2^{nH(X|Y)}$ sequences are jointly typical with $Y^n$, and with about $2^{nH(X|Y)}$ bins, with high probability the true $X^n$ is the only one of them that lands in the announced bin.
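As a concrete check on the theorem, here is a minimal Python sketch that computes the three entropy thresholds from a joint pmf and tests whether a rate pair lies in the region; the helper names entropies and in_sw_region are illustrative, not from any standard library.

```python
import numpy as np

def entropies(p_xy):
    """Return (H(X|Y), H(Y|X), H(X,Y)) in bits for a joint pmf matrix p_xy."""
    p_xy = np.asarray(p_xy, dtype=float)
    nz = p_xy[p_xy > 0]
    h_xy = -np.sum(nz * np.log2(nz))                # joint entropy H(X,Y)
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)   # marginals of X and Y
    h_x = -np.sum(p_x[p_x > 0] * np.log2(p_x[p_x > 0]))
    h_y = -np.sum(p_y[p_y > 0] * np.log2(p_y[p_y > 0]))
    # Chain rule: H(X|Y) = H(X,Y) - H(Y), H(Y|X) = H(X,Y) - H(X).
    return h_xy - h_y, h_xy - h_x, h_xy

def in_sw_region(r_x, r_y, p_xy, tol=1e-9):
    """Check the three Slepian-Wolf inequalities for the rate pair (r_x, r_y)."""
    h_xgy, h_ygx, h_xy = entropies(p_xy)
    return r_x >= h_xgy - tol and r_y >= h_ygx - tol and r_x + r_y >= h_xy - tol

# Doubly symmetric binary source with crossover probability p = 0.11:
p = 0.11
p_xy = np.array([[(1 - p) / 2, p / 2],
                 [p / 2, (1 - p) / 2]])
print(in_sw_region(0.5, 1.0, p_xy))   # True: the corner (H(X|Y), H(Y)) ~ (0.50, 1)
```

Running it at the corner point confirms that $(H(X|Y), H(Y)) \approx (0.50, 1)$ is in the region for $p = 0.11$.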


The Slepian-Wolf Surprise

The Slepian-Wolf theorem is genuinely surprising. Consider the corner point $(R_X, R_Y) = (H(X|Y), H(Y))$. Encoder 1 compresses $X^n$ to $H(X|Y)$ bits per symbol: the conditional entropy, the rate you would need if you knew $Y^n$. But Encoder 1 does not know $Y^n$! The trick is that Encoder 1 does not need to know $Y^n$ to compress at this rate; it just needs to send enough information for the decoder, who does know $Y^n$ after decoding Encoder 2's message, to resolve the remaining uncertainty.

Intuitively, random binning creates a many-to-one mapping. The bin index, combined with the side information $Y^n$ at the decoder, is enough to uniquely identify $X^n$ among the bin members. This is possible because the correlation between $X$ and $Y$ effectively "narrows down" the search within the bin.

Historical Note: Slepian and Wolf's Landmark Paper

1973

David Slepian and Jack Wolf published their result in 1973, but the ideas had been circulating in Bell Labs for several years before that. The result was initially met with some skepticism: how could separate encoders do as well as joint encoders? The answer lies in the power of joint decoding and the structure that random binning introduces. The paper also helped establish the rate region as the standard way to characterize multiuser information-theoretic limits.

The practical significance of Slepian-Wolf coding was not realized for decades. It was not until the early 2000s, when Pradhan and Ramchandran's DISCUS framework showed that syndrome codes could implement Slepian-Wolf coding efficiently, that the result found practical applications in distributed sensor networks and distributed video coding.

Example: Slepian-Wolf for Binary Symmetric Sources

Let $X \sim \text{Bernoulli}(1/2)$ and $Y = X \oplus Z$ where $Z \sim \text{Bernoulli}(p)$ is independent of $X$ (so $(X, Y)$ is a doubly symmetric binary source with parameter $p$). Compute the Slepian-Wolf rate region and sketch it.
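A worked sketch of the computation: since $X \sim \text{Bernoulli}(1/2)$ and $Z$ is independent of $X$, the output $Y = X \oplus Z$ is also $\text{Bernoulli}(1/2)$, and $Z$ is independent of $Y$ as well, so the residual uncertainty in either source given the other is exactly the noise entropy:

$$H(X|Y) = H(Y|X) = H_b(p), \qquad H(X,Y) = H(Y) + H(X|Y) = 1 + H_b(p),$$

where $H_b(p) = -p \log_2 p - (1-p)\log_2(1-p)$ is the binary entropy function. The region is therefore the pentagon $R_X \geq H_b(p)$, $R_Y \geq H_b(p)$, $R_X + R_Y \geq 1 + H_b(p)$. As $p \to 0$ only the sum-rate constraint $R_X + R_Y \geq 1$ binds, while at $p = 0.5$ the sources are independent and the region degenerates to the rectangle $R_X \geq 1$, $R_Y \geq 1$.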

Slepian-Wolf Rate Region Explorer

Explore how the Slepian-Wolf rate region changes with the correlation parameter $p$ for the doubly symmetric binary source. The shaded region shows achievable rate pairs.

Parameters: BSC crossover probability $p$ (default 0.11), controlling the correlation between $X$ and $Y$.

Definition:

Random Binning Construction

A random binning scheme at rate $R$ assigns each sequence $x^n \in \mathcal{X}^n$ to a bin $\mathcal{B}(x^n) \in [1 : 2^{nR}]$ uniformly and independently at random.

The bin assignment partitions $\mathcal{X}^n$ into $2^{nR}$ bins, each containing approximately $|\mathcal{X}|^n / 2^{nR} = 2^{n(\log|\mathcal{X}| - R)}$ sequences.

The encoder sends the bin index $\mathcal{B}(x^n)$, which requires $nR$ bits. The decoder must determine which sequence within the bin was the true source output.
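As an illustration (not part of the formal definition), the toy Monte Carlo sketch below implements this construction literally for a BSC correlation model: it enumerates all binary sequences of length $n$, bins them uniformly at random, and decodes by picking the bin member closest to $y^n$ in Hamming distance, which plays the role of joint typicality decoding for a BSC. All parameter values are illustrative.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, R, p = 12, 0.8, 0.11                # block length, binning rate, BSC parameter
num_bins = int(2 ** (n * R))           # about 2^{nR} bins

# Random binning: every sequence in {0,1}^n gets an independent, uniform bin.
all_seqs = np.array(list(itertools.product([0, 1], repeat=n)), dtype=np.uint8)
bin_of = rng.integers(0, num_bins, size=len(all_seqs))

def index_of(x):
    """Position of the binary sequence x in the lexicographic enumeration."""
    return int("".join(map(str, x)), 2)

errors, trials = 0, 1000
for _ in range(trials):
    x = rng.integers(0, 2, size=n).astype(np.uint8)       # source sequence
    y = x ^ (rng.random(n) < p).astype(np.uint8)          # side information: BSC(p)
    s = bin_of[index_of(x)]                               # encoder sends only the bin index
    members = all_seqs[bin_of == s]                       # decoder lists the bin contents
    # Minimum Hamming distance to y is ML decoding for a BSC, standing in
    # for joint typicality decoding.
    x_hat = members[np.argmin((members ^ y).sum(axis=1))]
    errors += not np.array_equal(x_hat, x)

print(f"empirical error rate at R={R}: {errors / trials:.3f}")
```

Since $R = 0.8$ exceeds $H(X|Y) = H_b(0.11) \approx 0.5$, the decoder recovers $x^n$ most of the time even at this toy block length, and the error probability vanishes as $n$ grows.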

Definition:

Syndrome-Based Slepian-Wolf Coding

For binary sources, Slepian-Wolf coding has an elegant implementation via syndrome coding. Let $\mathbf{H}$ be the parity-check matrix of a linear code.

Encoder: Given $x^n$, compute the syndrome $s = \mathbf{H} x^n$ and send $s$ (which requires $n - k$ bits if $\mathbf{H}$ is $(n-k) \times n$).

Decoder: Given $(s, y^n)$, find the $\hat{x}^n$ in the coset $\{x^n : \mathbf{H} x^n = s\}$ that is most likely given $y^n$.

This is equivalent to decoding the error pattern $e^n = x^n \oplus \hat{x}_0^n$ (where $\hat{x}_0^n$ is any member of the coset) using $y^n$ as side information. The rate is $R_X = (n-k)/n = 1 - k/n$, and reliable decoding requires $R_X \geq H(X|Y)$.

The syndrome approach reveals a deep duality between channel coding and source coding: the parity-check matrix of a good channel code is a good source coding binning matrix, and vice versa.
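As a small worked instance, here is a Python sketch using the $(7,4)$ Hamming code, whose $3 \times 7$ parity-check matrix gives rate $R_X = 3/7$. It assumes the correlation flips at most one of the seven positions, so the code's single-error-correcting structure is exactly enough; the helper names sw_encode and sw_decode are illustrative.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code: rate R_X = 3/7 bits per symbol.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)

def syndrome(v):
    return tuple((H @ v) % 2)

# Coset-leader table: the minimum-weight error pattern for each of the 8 syndromes.
leaders = {syndrome(np.zeros(7, dtype=np.uint8)): np.zeros(7, dtype=np.uint8)}
for i in range(7):
    e = np.zeros(7, dtype=np.uint8)
    e[i] = 1
    leaders[syndrome(e)] = e

def sw_encode(x):
    """Encoder: send only the 3-bit syndrome of x, not x itself."""
    return syndrome(x)

def sw_decode(s, y):
    """Decoder: e = x XOR y has syndrome s XOR H y; look up its coset leader."""
    diff = tuple((np.array(s) + H @ y) % 2)
    return (y + leaders[diff]) % 2          # correct whenever weight(e) <= 1

x = np.array([1, 0, 1, 1, 0, 0, 1], dtype=np.uint8)
y = x.copy()
y[4] ^= 1                                   # side information with one bit flipped
x_hat = sw_decode(sw_encode(x), y)
print(np.array_equal(x_hat, x))             # True
```

Note the duality in action: finding the coset member closest to $y^n$ reduces to ordinary Hamming decoding of the "error" $x^n \oplus y^n$.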

Quick Check

For the Slepian-Wolf problem with $(X, Y) \sim P_{XY}$, which of the following is true about the corner point $(R_X, R_Y) = (H(X|Y), H(Y))$?

Encoder 1 must know $Y^n$ to achieve this rate

Encoder 1 compresses as if it had access to $Y^n$, even though it does not

This point requires joint encoding of $(X^n, Y^n)$

This point is not achievable; only the interior of the region is

Common Mistake: Confusing Slepian-Wolf with Source Coding with Side Information

Mistake:

Students often confuse the Slepian-Wolf problem (both sources must be reconstructed losslessly) with source coding with side information at the decoder (only $X$ needs to be reconstructed; $Y$ is available at the decoder for free). In the latter, the minimum rate is simply $H(X|Y)$.

Correction:

In Slepian-Wolf, both sources are encoded and both must be reconstructed. The rate region involves constraints on $R_X$, $R_Y$, and $R_X + R_Y$. Source coding with side information at the decoder is the special case where $R_Y$ is "free" (the decoder already has $Y^n$), giving the corner point $R_X = H(X|Y)$.

Common Mistake: Random Binning is a Proof Technique, Not a Construction

Mistake:

Treating random binning as a practical coding scheme. In the proof, the bin assignments are random and known to both encoder and decoder. A specific realization of the binning works with high probability, but finding good deterministic binning schemes is a separate (hard) problem.

Correction:

Random binning proves existence of good codes. Practical implementations use structured codes (e.g., LDPC syndromes) that approximate the random binning behavior. The transition from random to structured codes is analogous to going from random codes to turbo/LDPC codes in channel coding.

🔧 Engineering Note

LDPC-Based Slepian-Wolf Coding in Practice

The most practical implementations of Slepian-Wolf coding use LDPC syndrome codes. The syndrome-binning idea goes back to Pradhan and Ramchandran's DISCUS framework; pairing it with the parity-check matrix of a capacity-approaching LDPC code gives a practical binning scheme. The encoder sends the syndrome $\mathbf{H}x^n$, and the decoder uses belief propagation with the side information $Y^n$ to recover $X^n$.

This approach achieves rates within a small fraction of a bit of the Slepian-Wolf limit for symmetric binary sources, and is the basis for distributed video coding systems such as DISCOVER.

Practical Constraints

• LDPC codes must be designed for the specific correlation structure
• Belief propagation convergence requires careful graph design
• Rate adaptation is needed when correlation statistics are unknown

Random Binning Visualization

Visualize how random binning partitions source sequences into bins and how joint typicality with the side information resolves bin ambiguity. The plot shows bin assignments and the decoding region for a simple binary example.

Parameters: sequence length $n$ (default 6), binning rate $R$ (default 0.5 bits per symbol), and BSC crossover probability $p$ (default 0.11).

Quick Check

In the Slepian-Wolf problem, the sum-rate constraint is $R_X + R_Y \geq H(X,Y)$. Why is this not simply $H(X) + H(Y)$?

Because the sources are correlated, joint decoding exploits the correlation to reduce the total rate

Because the encoders communicate with each other

Because the decoder uses feedback to the encoders

Why This Matters: Slepian-Wolf Coding in Sensor Networks

Distributed sensor networks are the canonical application of Slepian-Wolf coding. Sensors measure correlated physical quantities (temperature, humidity, pressure) and must transmit their measurements to a fusion center over rate-limited links. The Slepian-Wolf theorem guarantees that the total rate needed is determined by the joint entropy, not the sum of individual entropies, even though the sensors cannot coordinate their compression.

In modern IoT systems, this principle is used in compressed sensing for distributed data acquisition and in cooperative communication protocols where relays compress overheard signals. See Book telecom, Ch. 12 for practical code constructions that approach the Slepian-Wolf limit.

Key Takeaway

The Slepian-Wolf theorem shows that separate encoding of correlated sources incurs no rate penalty compared to joint encoding, provided decoding is performed jointly. The achievable rate region is the set of $(R_X, R_Y)$ satisfying $R_X \geq H(X|Y)$, $R_Y \geq H(Y|X)$, and $R_X + R_Y \geq H(X,Y)$. The proof uses random binning for achievability and Fano's inequality for the converse.