The Berger-Tung Inner Bound

From Lossless to Lossy Distributed Coding

The Slepian-Wolf theorem solves the distributed lossless source coding problem completely. But what if we allow distortion? Two encoders observe correlated sources $X^n$ and $Y^n$, compress them separately, and a joint decoder must reconstruct both within prescribed distortion levels. This is the distributed lossy source coding problem, and it is fundamentally harder than Slepian-Wolf.

Unlike the lossless case, the complete rate-distortion region for distributed lossy coding is unknown in general. The best known achievable region is the Berger-Tung inner bound, which combines random binning with rate-distortion codebooks. We study this bound, its tightness conditions, and the important special case of the CEO problem.

CEO Problem: Sum Rate vs. Distortion

The CEO problem: as the number of agents $K$ increases, the minimum distortion improves, but with diminishing returns per agent. Each curve shows the sum-rate versus distortion tradeoff for a different number of agents.

Definition: Distributed Lossy Source Coding

Let $(X, Y) \sim P_{XY}$ with distortion measures $d_X : \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$ and $d_Y : \mathcal{Y} \times \hat{\mathcal{Y}} \to [0, \infty)$.

A $(2^{nR_X}, 2^{nR_Y}, n)$ distributed lossy source code consists of:

  • Encoder 1: $f_1 : \mathcal{X}^n \to [1 : 2^{nR_X}]$
  • Encoder 2: $f_2 : \mathcal{Y}^n \to [1 : 2^{nR_Y}]$
  • Decoder: $g : [1 : 2^{nR_X}] \times [1 : 2^{nR_Y}] \to \hat{\mathcal{X}}^n \times \hat{\mathcal{Y}}^n$

A rate-distortion tuple $(R_X, R_Y, D_X, D_Y)$ is achievable if for every $\epsilon > 0$ and sufficiently large $n$, there exists a code with:

$$\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n d_X(X_i, \hat{X}_i)\right] \leq D_X + \epsilon, \quad \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n d_Y(Y_i, \hat{Y}_i)\right] \leq D_Y + \epsilon$$

Definition: The Berger-Tung Inner Bound

The Berger-Tung achievable rate region for distortion pair $(D_X, D_Y)$ is the set of rate pairs $(R_X, R_Y)$ satisfying:

$$R_X \geq I(X; U \mid V), \quad R_Y \geq I(Y; V \mid U), \quad R_X + R_Y \geq I(X, Y; U, V)$$

for some conditional distributions $P_{U|X}$ and $P_{V|Y}$ (with $U - X - Y - V$ forming a Markov chain) and reconstruction functions $\hat{x}(U, V)$, $\hat{y}(U, V)$ such that:

$$\mathbb{E}[d_X(X, \hat{x}(U, V))] \leq D_X, \quad \mathbb{E}[d_Y(Y, \hat{y}(U, V))] \leq D_Y$$

The auxiliary random variables $U$ and $V$ play the role of compressed descriptions of $X$ and $Y$, respectively.

The Berger-Tung bound combines two ideas: rate-distortion codebooks (to introduce the auxiliary variables $U, V$) and Slepian-Wolf binning (to exploit the correlation between $U^n$ and $V^n$ at the decoder). The bound is known to be tight in several important special cases, but whether it is tight in general remains an open problem.
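To make the region concrete, the three mutual-information bounds can be evaluated numerically for a small hypothetical example: a doubly symmetric binary source with binary symmetric test channels as the auxiliary descriptions. The parameters below (crossover probabilities 0.1 and 0.2) are illustrative choices, not from the text.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a pmf given as a flat array."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical example: X ~ Bern(1/2), Y = X XOR Bern(0.1), with test
# channels U = X XOR Bern(0.2) and V = Y XOR Bern(0.2), so that
# U - X - Y - V forms a Markov chain and p(u,x,y,v) = p(x,y)p(u|x)p(v|y).
p0, q = 0.1, 0.2
P = np.zeros((2, 2, 2, 2))  # axes: (u, x, y, v)
for x in range(2):
    for y in range(2):
        pxy = 0.5 * ((1 - p0) if x == y else p0)
        for u in range(2):
            for v in range(2):
                pu = (1 - q) if u == x else q
                pv = (1 - q) if v == y else q
                P[u, x, y, v] = pxy * pu * pv

def mi(P, A, B):
    """Mutual information I(A; B) between axis groups A and B of pmf P."""
    rest = set(range(P.ndim))
    pa = P.sum(axis=tuple(rest - set(A))).ravel()
    pb = P.sum(axis=tuple(rest - set(B))).ravel()
    pab = P.sum(axis=tuple(rest - set(A) - set(B))).ravel()
    return entropy(pa) + entropy(pb) - entropy(pab)

# The three Berger-Tung constraints, using I(X;U|V) = I(X;U,V) - I(X;V):
R_X   = mi(P, [1], [0, 3]) - mi(P, [1], [3])  # I(X; U | V)
R_Y   = mi(P, [2], [0, 3]) - mi(P, [2], [0])  # I(Y; V | U)
R_sum = mi(P, [1, 2], [0, 3])                 # I(X, Y; U, V)
```

By the symmetry of this example the two corner rates coincide; sweeping the test-channel parameter $q$ traces out different points of the achievable region.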


Berger-Tung inner bound

The best known achievable rate-distortion region for distributed lossy source coding. It combines rate-distortion codebooks with random binning. The bound is tight for the quadratic Gaussian CEO problem and for certain symmetric sources.


Theorem: Achievability of the Berger-Tung Bound

The rate-distortion tuples in the Berger-Tung region are achievable. Specifically, for any $(R_X, R_Y)$ in the interior of the Berger-Tung region for distortion $(D_X, D_Y)$, there exists a sequence of distributed lossy source codes achieving these rates and distortions.

The proof combines two techniques we have already seen:

  1. Covering: Each encoder generates a rate-distortion codebook of auxiliary sequences. Encoder 1 finds a $U^n$ in its codebook that is jointly typical with $X^n$ (this is the covering lemma at work). Similarly, Encoder 2 finds a $V^n$ jointly typical with $Y^n$.

  2. Binning: The auxiliary sequences $U^n$ and $V^n$ are correlated (through the Markov chain $U - X - Y - V$). Instead of sending the full codebook indices, each encoder sends only a bin index; the decoder uses joint typicality of $(U^n, V^n)$ to resolve the bin ambiguity, just as in Slepian-Wolf.

The point is that we get the rate savings of both lossy compression (through the auxiliary codebooks) and distributed coding (through binning).

Definition: The CEO Problem

The CEO problem (or indirect distributed source coding) models a scenario where $K$ agents observe noisy versions of a single source $X$:

$$Y_k = X + N_k, \quad k = 1, \ldots, K$$

where $N_1, \ldots, N_K$ are independent noise terms. The agents separately encode their observations and send them to a "CEO" (central decoder) who must reconstruct $X$ within distortion $D$.

The CEO must determine: what is the minimum total rate $R_{\text{sum}} = R_1 + \cdots + R_K$ needed to achieve distortion $D$?

This is a special case of the distributed lossy source coding problem where:

  • Only one underlying source $X$ needs to be reconstructed
  • The observations $Y_k$ are conditionally independent given $X$
  • The distortion constraint is on $\mathbb{E}[d(X, \hat{X})]$
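The infinite-rate limit of this problem is plain MMSE estimation: if the CEO could see all $K$ observations directly, the best achievable squared error in the symmetric Gaussian case would be $1/(\sigma_X^{-2} + K\sigma_N^{-2})$. A small Monte Carlo sketch, with assumed parameters, checking that floor:

```python
import numpy as np

rng = np.random.default_rng(0)
K, var_x, var_n = 4, 1.0, 1.0  # illustrative parameters
n = 200_000

x = rng.normal(0.0, np.sqrt(var_x), size=n)
y = x[None, :] + rng.normal(0.0, np.sqrt(var_n), size=(K, n))

# With infinite rate the CEO sees Y_1..Y_K exactly; the best it can do
# is the MMSE estimate of X, a scaled sum of the observations.
w = (1.0 / var_n) / (1.0 / var_x + K / var_n)  # weight per observation
x_hat = w * y.sum(axis=0)
d_emp = np.mean((x - x_hat) ** 2)

d_min = 1.0 / (1.0 / var_x + K / var_n)  # = 1/(sigma_X^-2 + K sigma_N^-2)
```

The empirical squared error `d_emp` matches the analytic floor `d_min` (here $1/5$); no finite-rate code can do better than this.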

CEO problem

A distributed source coding problem where multiple agents observe noisy versions of a single underlying source and must compress their observations for a central decoder to reconstruct the original source. Named by Berger, Zhang, and Viswanathan (1996) as an analogy to a corporate CEO receiving reports from multiple department heads.


Theorem: Quadratic Gaussian CEO Problem

Let $X \sim \mathcal{N}(0, \sigma_X^2)$ and $Y_k = X + N_k$ where $N_k \sim \mathcal{N}(0, \sigma_N^2)$ are i.i.d., independent of $X$. Under squared-error distortion, the minimum sum rate to achieve distortion $D$ is:

$$R_{\text{sum}}(D) = \begin{cases} \dfrac{1}{2}\log\dfrac{\sigma_X^2}{D} + \dfrac{K}{2}\log\dfrac{K\sigma_N^{-2}}{K\sigma_N^{-2} + \sigma_X^{-2} - D^{-1}} & \text{if } D_{\min} < D \leq D_{\max} \\ 0 & \text{if } D > D_{\max} \end{cases}$$

where $D_{\min} = \frac{1}{\sigma_X^{-2} + K\sigma_N^{-2}}$ is the MMSE with infinite rate, and $D_{\max} = \sigma_X^2$ is the distortion with zero rate.

The Berger-Tung bound is tight for this case. The first term is the remote rate-distortion cost of describing $X$ itself, and the second is the extra price the $K$ agents pay for separately quantizing their noisy observations. As $K \to \infty$, the minimum distortion $D_{\min} = \frac{1}{\sigma_X^{-2} + K\sigma_N^{-2}} \to 0$, while the rate per agent needed to reach any fixed target distortion shrinks: each additional agent contributes diminishing but nonzero marginal information about $X$.
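As a sketch of this asymptotic behavior, the standard closed form for the symmetric quadratic Gaussian CEO sum rate can be evaluated at a fixed target distortion while $K$ grows. The parameter values below are illustrative.

```python
import math

def ceo_sum_rate(D, K, var_x=1.0, var_n=1.0):
    """Minimum sum rate (bits/sample) for the symmetric quadratic
    Gaussian CEO problem; valid for D_min < D <= var_x."""
    denom = K / var_n + 1.0 / var_x - 1.0 / D
    return 0.5 * math.log2(var_x / D) + (K / 2) * math.log2((K / var_n) / denom)

# At a fixed target distortion, the sum rate stays bounded as K grows,
# so the rate per agent shrinks toward zero.
D = 0.5
rates = {K: ceo_sum_rate(D, K) for K in (2, 4, 16, 64)}
```

With these numbers the sum rate decreases monotonically in $K$ toward a finite limit, so the per-agent rate vanishes; by contrast, pushing $D$ down toward $D_{\min}$ makes the second logarithm, and hence the sum rate, blow up.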


Example: Two-Agent Gaussian CEO Problem

Consider the CEO problem with $K = 2$ agents, $X \sim \mathcal{N}(0, 1)$, and $Y_k = X + N_k$ where $N_k \sim \mathcal{N}(0, 1)$ for $k = 1, 2$. Find the minimum sum rate $R_1 + R_2$ to achieve distortion $D = 0.4$ under squared-error distortion.
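A quick numerical check, plugging the standard closed form for the symmetric quadratic Gaussian CEO sum rate into these parameters:

```python
import math

var_x, var_n, K, D = 1.0, 1.0, 2, 0.4

# Feasibility: D must exceed the infinite-rate MMSE floor D_min = 1/3
d_min = 1.0 / (1.0 / var_x + K / var_n)
assert d_min < D <= var_x

r_x = 0.5 * math.log2(var_x / D)  # remote rate-distortion term, ~0.661 bits
r_obs = (K / 2) * math.log2((K / var_n) / (K / var_n + 1.0 / var_x - 1.0 / D))
r_sum = r_x + r_obs  # total minimum sum rate, ~2.661 bits per sample
```

Here the observation-noise term contributes exactly 2 bits ($\log_2(2/0.5)$), and the remote rate-distortion term $\tfrac{1}{2}\log_2 2.5 \approx 0.661$ bits, for a minimum sum rate of about 2.66 bits per sample.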


Distributed vs. Joint Source Coding

| Property | Joint Encoding | Distributed (Slepian-Wolf / Berger-Tung) |
|---|---|---|
| Encoder cooperation | Full: encoders can communicate | None: encoders operate independently |
| Lossless rate region | $R_X + R_Y \geq H(X,Y)$ | Same (Slepian-Wolf theorem) |
| Lossy rate region | Known exactly (joint rate-distortion) | Berger-Tung inner bound (tight in some cases) |
| Proof technique | Single-user rate-distortion | Random binning + covering lemma |
| Practical codes | Standard compression (JPEG, H.264) | LDPC syndromes, distributed video coding |
| Key open problem | None (fully solved) | Is Berger-Tung tight in general? |

Common Mistake: The Berger-Tung Bound is Not Always Tight

Mistake:

Assuming the Berger-Tung inner bound gives the exact rate-distortion region for all distributed lossy source coding problems.

Correction:

The Berger-Tung bound is an inner bound (achievable region) but it is not known to be tight in general. It is tight for specific cases: the quadratic Gaussian CEO problem, the Gaussian two-terminal source coding problem with individual distortion constraints, and certain symmetric sources. The general tightness remains one of the important open problems in network information theory.

⚠️Engineering Note

CEO Problem in Wireless Sensor Networks

The CEO problem directly models wireless sensor networks: multiple sensors observe noisy versions of a physical quantity and transmit compressed observations to a fusion center. The key engineering insight concerns scaling with the number of sensors $K$: the distortion floor $D_{\min}$ falls as $O(1/K)$, and for any fixed target distortion above that floor the required sum rate stays bounded, so the rate per sensor shrinks as sensors are added. Chasing the floor itself is expensive, however: the sum rate diverges as the target distortion approaches $D_{\min}$.

In practice, the agents often communicate over noisy channels (not noiseless rate-limited links), leading to the joint source-channel CEO problem. The separation principle does not apply in this multiuser setting, making the problem significantly harder. See Chapter 8 for further discussion.

Key Takeaway

The Berger-Tung inner bound generalizes Slepian-Wolf to the lossy setting by combining rate-distortion codebooks with random binning. Each encoder uses a codebook to create a compressed description (auxiliary variable), then uses binning to exploit the correlation at the decoder. The bound is tight for the Gaussian CEO problem, where the sum rate has a clean closed-form expression, but its general tightness remains open.