Gaussian Rate-Distortion and Reverse Waterfilling

The 6 dB Rule

The Gaussian source is the most important continuous source in practice: it models thermal noise, quantization error, and many natural signals. Its rate-distortion function has a beautiful closed form that yields the famous "6 dB per bit" rule: each additional bit of rate cuts the distortion by a factor of four, a 6 dB gain in SNR. For vector Gaussian sources, the optimal strategy is "reverse waterfilling", the dual of the waterfilling solution for channel capacity. Understanding these results is essential for anyone working with quantization, compression, or signal processing.

Theorem: Rate-Distortion Function for Gaussian Source

For a Gaussian source $X \sim \mathcal{N}(0, \sigma^2)$ with squared-error distortion $d(x, \hat{x}) = (x - \hat{x})^2$:

$$R(D) = \begin{cases} \frac{1}{2}\log_2 \frac{\sigma^2}{D} & \text{if } 0 < D \leq \sigma^2 \\ 0 & \text{if } D > \sigma^2 \end{cases}$$

Equivalently, the distortion-rate function is $D(R) = \sigma^2 \cdot 2^{-2R}$.

Each additional bit of rate cuts the distortion by a factor of 4: going from $R$ to $R + 1$ bits divides $D$ by 4, an SNR improvement of $10\log_{10} 4 \approx 6$ dB. This is the information-theoretic version of the "6 dB per bit" rule. At zero rate, the best we can do is estimate $X$ by its mean (zero), giving distortion $\sigma^2$. As rate grows, distortion approaches zero.
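A quick numeric check of the distortion-rate function (a minimal Python sketch; the function name is illustrative):

```python
import math

def distortion_rate(sigma2, R):
    """Distortion-rate function of a Gaussian source: D(R) = sigma^2 * 2^(-2R)."""
    return sigma2 * 2.0 ** (-2 * R)

sigma2 = 1.0
for R in range(4):
    D = distortion_rate(sigma2, R)
    snr_db = 10 * math.log10(sigma2 / D)  # 0 dB at R = 0, then +6.02 dB per bit
    print(f"R={R} bits  D={D:.4f}  SNR={snr_db:.2f} dB")
```

Each extra bit divides $D$ by 4, adding $10\log_{10} 4 \approx 6.02$ dB of SNR.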

Example: Gaussian Rate-Distortion in Practice

A speech signal is modeled as Gaussian with $\sigma^2 = 1$. Compute the rate needed for SNR values of 20 dB, 40 dB, and 60 dB, where SNR $= \sigma^2/D$.
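One way to work this example: combining SNR$_{\text{dB}} = 10\log_{10}(\sigma^2/D)$ with $R = \frac{1}{2}\log_2(\sigma^2/D)$ gives $R = \text{SNR}_{\text{dB}} / (20\log_{10} 2)$. A minimal sketch (the function name is illustrative):

```python
import math

def rate_for_snr_db(snr_db):
    """Minimum rate (bits/sample) so that 10*log10(sigma^2/D) reaches snr_db."""
    return snr_db / (20 * math.log10(2))  # = 0.5 * log2(10**(snr_db / 10))

for snr in (20, 40, 60):
    print(f"SNR = {snr} dB  ->  R = {rate_for_snr_db(snr):.2f} bits/sample")
# Roughly 3.32, 6.64, and 9.97 bits/sample: about one bit per 6 dB.
```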

Theorem: Reverse Waterfilling for Vector Gaussian Sources

Let $\mathbf{X} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Lambda})$ where $\boldsymbol{\Lambda} = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_k^2)$ is diagonal (parallel independent Gaussian sources). The rate-distortion function under total squared-error distortion $D = \sum_{i=1}^k D_i$ is:

$$R = \sum_{i=1}^k \frac{1}{2}\log_2 \frac{\sigma_i^2}{D_i^*}$$

where $D_i^* = \min(\gamma, \sigma_i^2)$ and the reverse waterfilling level $\gamma$ is chosen so that $\sum_{i=1}^k D_i^* = D$.

Reverse waterfilling spends the distortion budget on the weakest components first. If a component has variance $\sigma_i^2 < \gamma$, it gets $D_i^* = \sigma_i^2$ and zero rate (no bits allocated; the component is "drowned" in distortion and reconstructed as zero). Components with $\sigma_i^2 \geq \gamma$ each incur distortion $D_i^* = \gamma$, with the excess $\sigma_i^2 - \gamma$ captured by the code.

This is the opposite of channel waterfilling, where we pour power into strong subchannels. Here we pour distortion into weak components. The connection to transform coding is immediate: apply the KLT (eigendecomposition) to decorrelate, then reverse-waterfill on the eigenvalues.
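The allocation can be computed by a simple bisection on $\gamma$, since total distortion $\sum_i \min(\gamma, \sigma_i^2)$ is increasing in $\gamma$. A minimal sketch (the function name and example variances are illustrative):

```python
import math

def reverse_waterfill(variances, D_total):
    """Find gamma with sum_i min(gamma, sigma_i^2) = D_total; return
    (gamma, per-component distortions, total rate in bits)."""
    assert 0 < D_total <= sum(variances)
    lo, hi = 0.0, max(variances)
    for _ in range(100):  # bisection: total distortion is monotone in gamma
        gamma = (lo + hi) / 2
        if sum(min(gamma, v) for v in variances) < D_total:
            lo = gamma
        else:
            hi = gamma
    D = [min(gamma, v) for v in variances]
    rate = sum(0.5 * math.log2(v / d) for v, d in zip(variances, D) if v > d)
    return gamma, D, rate

gamma, D, rate = reverse_waterfill([4.0, 1.0, 0.25], D_total=1.5)
# gamma = 0.625: the weakest component (variance 0.25) sits below the water
# level, absorbs its full variance as distortion, and gets zero rate.
```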

Reverse Waterfilling for Parallel Gaussian Sources

Visualize the reverse waterfilling solution for a set of parallel Gaussian sources. Adjust the total distortion budget and observe how distortion is allocated across components. Compare with the forward waterfilling for channel capacity.

⚠️ Engineering Note

The 6 dB Rule in System Design

The 6 dB/bit rule ($D = \sigma^2 \cdot 2^{-2R}$) is the fundamental benchmark for all quantization and compression system design. A practical quantizer that achieves SNR $= 6R - c$ dB for some constant $c$ is said to be "$c$ dB from the R-D bound." The best practical schemes:

  • Uniform quantizer + entropy coding: $c \approx 1.53$ dB (Gish-Pierce, 1968)
  • Lloyd-Max quantizer (no entropy coding): $c \approx 1.2$ dB for large $M$
  • Lattice quantizer + entropy coding: $c \approx 0.5$ dB (approaching the Gaussian R-D bound)

These gaps quantify the "price" of structured (implementable) coding versus random coding.

Common Mistake: Forward vs. Reverse Waterfilling

Mistake:

Confusing forward waterfilling (channel capacity, pour power into strong subchannels) with reverse waterfilling (rate-distortion, pour distortion into weak components).

Correction:

The two are duals:

  • Channel (forward): Fill from the bottom up with power. Strong subchannels get more power. Weak subchannels may be shut off.
  • Source (reverse): Fill from the top down with distortion. Weak components absorb their full variance as distortion and may be completely discarded.

The reversal comes from maximizing vs. minimizing: channel capacity maximizes rate (use strong subchannels), rate-distortion minimizes rate (sacrifice weak components).
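To make the duality concrete, the forward (channel) waterfilling allocation $P_i = \max(\mu - N_i, 0)$ with $\sum_i P_i = P$ can be sketched the same way (a minimal sketch; the function name and noise values are illustrative):

```python
def forward_waterfill(noise_vars, P_total):
    """Channel waterfilling: power P_i = max(mu - N_i, 0) with sum P_i = P_total."""
    lo, hi = 0.0, max(noise_vars) + P_total
    for _ in range(100):  # bisection: total power is monotone in mu
        mu = (lo + hi) / 2
        if sum(max(mu - n, 0.0) for n in noise_vars) < P_total:
            lo = mu
        else:
            hi = mu
    return mu, [max(mu - n, 0.0) for n in noise_vars]

mu, P = forward_waterfill([0.25, 1.0, 4.0], P_total=1.5)
# mu = 1.375: the noisiest subchannel (N = 4.0) lies above the water level
# and is shut off; the cleanest subchannel gets the most power.
```

Note the mirror image of reverse waterfilling: here the weakest (noisiest) subchannel is the one that receives nothing.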

Reverse Waterfilling: Pouring Distortion into Weak Components

Animated comparison of forward waterfilling (channel capacity, pour power into strong subchannels) with reverse waterfilling (rate-distortion, pour distortion into weak components). The waterfilling level $\gamma$ rises to show which components are "shut off" at each distortion budget.

Key Takeaway

For a Gaussian source, $R(D) = \frac{1}{2}\log_2(\sigma^2/D)$: the "6 dB per bit" rule. For parallel Gaussian sources, reverse waterfilling fills the weakest components entirely with distortion, assigning zero bits to components with variance below the waterfilling level $\gamma$. This is the dual of channel waterfilling and the foundation of transform coding: apply the KLT to decorrelate, then reverse-waterfill across the transform coefficients.