Exercises

ex-ch08-01

Easy

For jointly Gaussian $(X, Y)$ with correlation coefficient $\rho$, show that Wyner's common information is $C_W(X; Y) = \frac{1}{2}\log\frac{1+|\rho|}{1-|\rho|}$, and verify that it is at least the mutual information $I(X; Y) = \frac{1}{2}\log\frac{1}{1-\rho^2}$.
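As a numerical companion to this exercise, the two quantities can be compared directly. This is a sketch: the helper names `gaussian_mi` and `gaussian_wyner_ci` are ours, and the $C_W$ expression used is the known bivariate-Gaussian result $\frac{1}{2}\log\frac{1+|\rho|}{1-|\rho|}$.

```python
import math

def gaussian_mi(rho):
    # I(X;Y) = (1/2) log2 (1 / (1 - rho^2)) for jointly Gaussian (X, Y)
    return 0.5 * math.log2(1.0 / (1.0 - rho ** 2))

def gaussian_wyner_ci(rho):
    # C_W(X;Y) = (1/2) log2 ((1 + |rho|) / (1 - |rho|)) for bivariate Gaussians
    r = abs(rho)
    return 0.5 * math.log2((1.0 + r) / (1.0 - r))

for rho in (0.1, 0.5, 0.9):
    mi, cw = gaussian_mi(rho), gaussian_wyner_ci(rho)
    print(f"rho={rho}: I={mi:.4f} bits, C_W={cw:.4f} bits")
    assert cw >= mi  # common information dominates mutual information
```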

ex-ch08-02

Medium

For the doubly symmetric binary source ($X \sim \text{Bernoulli}(1/2)$, $Y = X \oplus Z$, $Z \sim \text{Bernoulli}(p)$ with $0 < p < 1/2$), show that Wyner's common information is $C_W(X; Y) = 1 + \mathcal{H}_2(p) - 2\mathcal{H}_2(a)$, where $a = \frac{1}{2}\left(1 - \sqrt{1-2p}\right)$, and verify that it strictly exceeds the mutual information $I(X; Y) = 1 - \mathcal{H}_2(p)$.
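A quick numerical check of the gap between the two quantities (a sketch; `h2` is the binary entropy and the $C_W$ expression used is Wyner's DSBS result $1 + \mathcal{H}_2(p) - 2\mathcal{H}_2(a)$ with $a = \frac{1}{2}(1-\sqrt{1-2p})$):

```python
import math

def h2(p):
    # binary entropy in bits
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def dsbs_wyner_ci(p):
    # Wyner's result for the DSBS: C_W = 1 + h2(p) - 2*h2(a)
    a = (1 - math.sqrt(1 - 2 * p)) / 2
    return 1 + h2(p) - 2 * h2(a)

def dsbs_mi(p):
    return 1 - h2(p)

for p in (0.05, 0.1, 0.25):
    cw, mi = dsbs_wyner_ci(p), dsbs_mi(p)
    print(f"p={p}: I={mi:.4f} bits, C_W={cw:.4f} bits")
    assert cw > mi  # strict gap for 0 < p < 1/2
```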

ex-ch08-03

Medium

Show that the information bottleneck at $\beta = 1$ reduces to minimizing $I(X; T \mid Y)$, the information in $T$ about $X$ that is not relevant to $Y$.
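The identity behind this exercise, $I(X;T) - I(T;Y) = I(X;T \mid Y)$ under the Markov chain $T - X - Y$, can be checked numerically on a random joint distribution. This is a brute-force sketch over small alphabets; all helper names (`rand_dist`, `marginal`, `mi`, `cond_mi`) are ours.

```python
import itertools, math, random

random.seed(0)

def rand_dist(n):
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]

nx, nt, ny = 3, 2, 2
px = rand_dist(nx)
pt_x = [rand_dist(nt) for _ in range(nx)]   # encoder P(t|x)
py_x = [rand_dist(ny) for _ in range(nx)]   # channel P(y|x)

# joint under the Markov chain T - X - Y: p(x,t,y) = p(x) p(t|x) p(y|x)
joint = {(x, t, y): px[x] * pt_x[x][t] * py_x[x][y]
         for x, t, y in itertools.product(range(nx), range(nt), range(ny))}

def marginal(keep):
    m = {}
    for k, v in joint.items():
        key = tuple(k[i] for i in keep)
        m[key] = m.get(key, 0.0) + v
    return m

def mi(a, b):
    # mutual information (bits) between coordinates a and b of the joint
    pa, pb, pab = marginal([a]), marginal([b]), marginal([a, b])
    return sum(v * math.log2(v / (pa[(k[0],)] * pb[(k[1],)]))
               for k, v in pab.items() if v > 0)

def cond_mi():
    # I(X;T|Y) = sum_y p(y) * I(X;T | Y=y)
    py = marginal([2])
    total = 0.0
    for (y,), pyv in py.items():
        pxt = {(x, t): joint[(x, t, y)] / pyv
               for x, t in itertools.product(range(nx), range(nt))}
        px_y = {x: sum(pxt[(x, t)] for t in range(nt)) for x in range(nx)}
        pt_y = {t: sum(pxt[(x, t)] for x in range(nx)) for t in range(nt)}
        total += pyv * sum(v * math.log2(v / (px_y[x] * pt_y[t]))
                           for (x, t), v in pxt.items() if v > 0)
    return total

lhs = mi(0, 1) - mi(1, 2)   # I(X;T) - I(T;Y), the IB objective at beta = 1
rhs = cond_mi()             # I(X;T|Y)
print(lhs, rhs)
assert abs(lhs - rhs) < 1e-9
```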

ex-ch08-04

Medium

Show that the VAE KL term $\mathbb{E}_{P_{\text{data}}}[D(q_\phi(z|x) \| p(z))]$ is an upper bound on $I(X; Z)$ under the variational distribution.
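The bound can be seen concretely in a toy linear-Gaussian model where everything is closed form: with $X \sim \mathcal{N}(0,1)$, $q(z|x) = \mathcal{N}(ax, s^2)$ and $p(z) = \mathcal{N}(0, I)$, the aggregate posterior is $q(z) = \mathcal{N}(0, a^2 + s^2)$ and the KL term decomposes as $\mathbb{E}[D] = I(X;Z) + D(q(z)\|p(z))$. A sketch (the parameter values $a$, $s^2$ are arbitrary choices of ours; all quantities are in nats):

```python
import math

# toy linear-Gaussian "encoder": X ~ N(0,1), q(z|x) = N(a*x, s^2), p(z) = N(0,1)
a, s2 = 0.8, 0.25
sig2 = a * a + s2                        # variance of the aggregate posterior q(z)

mi_xz = 0.5 * math.log(sig2 / s2)        # I(X;Z) for jointly Gaussian (X, Z)
avg_kl = 0.5 * (a * a + s2 - 1 - math.log(s2))   # E_x[ D(q(z|x) || p(z)) ]
marg_kl = 0.5 * (sig2 - 1 - math.log(sig2))      # D(q(z) || p(z)) >= 0

print(f"I(X;Z) = {mi_xz:.4f}, E[KL] = {avg_kl:.4f}, slack = {marg_kl:.4f}")
assert avg_kl >= mi_xz                           # the upper bound
assert abs(avg_kl - (mi_xz + marg_kl)) < 1e-12   # exact decomposition
```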

ex-ch08-05

Easy

In a DVC system using LDPC-based Slepian-Wolf coding, the BSC correlation model has crossover probability $p = 0.05$. What is the minimum syndrome rate $R_{\text{syn}}$ needed, and what LDPC code rate should be used?
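The numbers behind this exercise follow from the Slepian-Wolf bound $R_{\text{syn}} \geq H(X|Y) = \mathcal{H}_2(p)$ and the fact that a rate-$k/n$ LDPC code has an $(n-k)/n$-rate syndrome. A short sketch:

```python
import math

def h2(p):
    # binary entropy in bits
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.05
r_syn = h2(p)          # minimum syndrome rate: H(X|Y) = H2(p) bits/symbol
code_rate = 1 - r_syn  # matching LDPC code rate k/n = 1 - R_syn
print(f"R_syn >= {r_syn:.4f} bits/symbol, LDPC code rate <= {code_rate:.4f}")
```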

ex-ch08-06

Hard

Derive the self-consistent equations for the information bottleneck by taking the functional derivative of $\mathcal{L}_{\text{IB}} = I(X; T) - \beta \cdot I(T; Y)$ with respect to $P_{T|X}(t|x)$, subject to the normalization constraint $\sum_t P_{T|X}(t|x) = 1$.
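The self-consistent equations that this derivation produces, $P_{T|X}(t|x) \propto P_T(t)\exp(-\beta\, D(P_{Y|X}(\cdot|x) \| P_{Y|T}(\cdot|t)))$, can be iterated Blahut-Arimoto style. A sketch on a toy joint of our own choosing (the joint `pxy`, the value of `beta`, and the iteration count are illustrative assumptions):

```python
import math, random

random.seed(1)

def kl(p, q):
    # KL divergence in nats between two distributions on the same alphabet
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# toy joint P(x, y): 4 inputs, binary label
pxy = [[0.20, 0.05],
       [0.18, 0.07],
       [0.06, 0.19],
       [0.04, 0.21]]
nx, ny, nt = 4, 2, 2
px = [sum(row) for row in pxy]
py_x = [[pxy[x][y] / px[x] for y in range(ny)] for x in range(nx)]

beta = 5.0
pt_x = []                          # random initial encoder P(t|x)
for x in range(nx):
    row = [random.random() + 0.5 for _ in range(nt)]
    s = sum(row)
    pt_x.append([v / s for v in row])

for _ in range(200):
    pt = [max(sum(px[x] * pt_x[x][t] for x in range(nx)), 1e-300)
          for t in range(nt)]
    # induced decoder: p(y|t) = sum_x p(y|x) p(x) p(t|x) / p(t)
    py_t = [[sum(py_x[x][y] * px[x] * pt_x[x][t] for x in range(nx)) / pt[t]
             for y in range(ny)] for t in range(nt)]
    # self-consistent update: p(t|x) ~ p(t) exp(-beta * KL(p(y|x) || p(y|t)))
    for x in range(nx):
        w = [pt[t] * math.exp(-beta * kl(py_x[x], py_t[t])) for t in range(nt)]
        z = sum(w)
        pt_x[x] = [v / z for v in w]

print([[round(v, 3) for v in row] for row in pt_x])
```

Inputs whose posteriors $P_{Y|X}(\cdot|x)$ are close tend to end up assigned to the same $t$, which previews the soft-clustering behavior studied in a later exercise.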

ex-ch08-07

Hard

Show that for a $\beta$-VAE with Gaussian encoder $q_\phi(z|x) = \mathcal{N}(\mu_\phi(x), \sigma_\phi^2(x) I)$ and standard Gaussian prior $p(z) = \mathcal{N}(0, I)$, the KL term has the closed-form expression
$$D(q_\phi(z|x) \| p(z)) = \frac{1}{2}\sum_{j=1}^d \left[\mu_{\phi,j}^2(x) + \sigma_{\phi,j}^2(x) - 1 - \log\sigma_{\phi,j}^2(x)\right].$$
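The closed form can be sanity-checked against a brute-force numerical integration of $\int q(z)\log\frac{q(z)}{p(z)}\,dz$, one coordinate at a time (the factorized encoder makes the KL separable). A sketch; the test point `mu`, `var` and the integration grid are arbitrary choices of ours:

```python
import math

def kl_closed_form(mu, var):
    # D(N(mu, diag(var)) || N(0, I)) = 0.5 * sum_j (mu_j^2 + var_j - 1 - log var_j)
    return 0.5 * sum(m * m + v - 1 - math.log(v) for m, v in zip(mu, var))

def kl_numeric_1d(mu, var, n=200000, lo=-12.0, hi=12.0):
    # midpoint-rule integral of q(z) log(q(z)/p(z)) for one coordinate
    dz = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * dz
        q = math.exp(-(z - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
        p = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
        if q > 0:
            total += q * math.log(q / p) * dz
    return total

mu, var = [0.5, -1.0], [0.8, 2.0]
cf = kl_closed_form(mu, var)
num = sum(kl_numeric_1d(m, v) for m, v in zip(mu, var))
print(cf, num)
assert abs(cf - num) < 1e-6
```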

ex-ch08-08

Medium

A source coding system with a helper has $X \sim \text{Uniform}(\{0,1,2,3\})$ and $Y = X \bmod 2$ (the helper observes only the parity of $X$). What is the minimum rate $R_X$ as a function of the helper rate $R_Y$?
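The endpoints of the tradeoff are easy to pin down: with no helper, $R_X = H(X) = 2$ bits, and once $R_Y \geq H(Y) = 1$ bit the parity is fully available, so $R_X = H(X|Y) = 1$ bit. A sketch of the resulting curve (the linear interpolation via `r_x`, achieved by time-sharing an erasure description of $Y$, is the candidate answer the exercise asks the reader to justify):

```python
import math

# X uniform on {0,1,2,3}; the helper observes Y = X mod 2
px = {x: 0.25 for x in range(4)}
hx = -sum(p * math.log2(p) for p in px.values())   # H(X) = 2 bits

# H(X|Y): given the parity, X is uniform over 2 values -> 1 bit
hx_given_y = 1.0

def r_x(r_y):
    # candidate tradeoff: R_X(R_Y) = max(H(X) - R_Y, H(X|Y))
    return max(hx - r_y, hx_given_y)

for r_y in (0.0, 0.5, 1.0, 2.0):
    print(f"R_Y = {r_y}: R_X >= {r_x(r_y)} bits")
```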

ex-ch08-09

Challenge

Consider the information bottleneck for binary classification: $X \in \{0,1\}^d$ is a feature vector, $Y \in \{0,1\}$ is a binary label, and $T \in \{0,1\}$ is a binary compressed representation. Show that the optimal $P_{T|X}$ at any $\beta$ is a soft clustering that groups inputs with similar posteriors $P_{Y|X}(\cdot|x)$.

ex-ch08-10

Medium

In the DISCUS framework for Slepian-Wolf coding, explain why the syndrome decoding problem $\mathbf{H} e^n = s'$ with $e^n \sim \text{i.i.d. } \text{Bernoulli}(p)$ is equivalent to channel decoding on a BSC($p$).
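The equivalence can be made concrete with a small code: the decoder sees $\mathbf{H}y^n \oplus s' = \mathbf{H}e^n$, and finding the most likely $e^n$ with that syndrome under Bernoulli($p$), $p < 1/2$, is exactly minimum-weight (ML) decoding on a BSC($p$). A sketch using the (7,4) Hamming code; the specific `x` and single-bit noise pattern `e` are illustrative choices of ours:

```python
import itertools

# parity-check matrix of the (7,4) Hamming code (column i is i in binary)
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def syndrome(v):
    return tuple(sum(h * b for h, b in zip(row, v)) % 2 for row in H)

# precompute the minimum-weight coset leader for every syndrome:
# for p < 1/2 this is exactly ML decoding of e ~ Bernoulli(p) on a BSC(p)
leader = {}
for w in range(8):
    for pos in itertools.combinations(range(7), w):
        e_cand = [1 if i in pos else 0 for i in range(7)]
        leader.setdefault(syndrome(e_cand), e_cand)

x = [1, 0, 1, 1, 0, 0, 1]                  # source word
e = [0, 0, 0, 0, 1, 0, 0]                  # correlation noise, weight 1
y = [(a + b) % 2 for a, b in zip(x, e)]    # side information Y = X xor e

s = syndrome(x)                            # encoder sends only 3 syndrome bits
s_err = tuple((a + b) % 2 for a, b in zip(syndrome(y), s))  # = H e
e_hat = leader[s_err]                      # BSC-style ML "channel decoding"
x_hat = [(a + b) % 2 for a, b in zip(y, e_hat)]
print(x_hat == x)
```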