The Lossy Compression Problem

When Lossless Is Not Enough

In Chapter 5 we proved that lossless compression requires at least $H(X)$ bits per symbol. For a continuous source (e.g., a Gaussian), $H(X) = \infty$ in the discrete sense: we would need infinitely many bits to describe each sample exactly. Even for discrete sources with large alphabets (e.g., images with 256 gray levels), the entropy may be impractically high. The solution is lossy compression: we allow the reconstructed output $\hat{X}$ to differ from the source $X$, as long as the difference (distortion) stays below an acceptable threshold. The fundamental question becomes: what is the minimum rate $R$ needed to achieve average distortion at most $D$?

Definition:

Distortion Measure

A distortion measure is a function $d : \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$ that quantifies how "bad" a reconstruction $\hat{x}$ is when the source produced $x$. Common choices:

  • Hamming distortion (discrete): $d(x, \hat{x}) = \mathbf{1}\{x \neq \hat{x}\}$
  • Squared error (continuous): $d(x, \hat{x}) = (x - \hat{x})^2$
  • Absolute error: $d(x, \hat{x}) = |x - \hat{x}|$

For a pair of sequences, the average distortion is $d(\mathbf{x}, \hat{\mathbf{x}}) = \frac{1}{n}\sum_{i=1}^n d(x_i, \hat{x}_i)$.
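The per-symbol averaging above is straightforward to mirror in code. A minimal sketch (the function names `hamming`, `squared_error`, and `avg_distortion` are illustrative, not from the text):

```python
def hamming(x, xhat):
    """Hamming distortion: 1 if the symbols differ, else 0."""
    return 0.0 if x == xhat else 1.0

def squared_error(x, xhat):
    """Squared-error distortion for continuous alphabets."""
    return (x - xhat) ** 2

def avg_distortion(xs, xhats, d):
    """Average distortion d(x, xhat) = (1/n) * sum_i d(x_i, xhat_i)."""
    assert len(xs) == len(xhats)
    return sum(d(x, xh) for x, xh in zip(xs, xhats)) / len(xs)

print(avg_distortion([0, 1, 1, 0], [0, 1, 0, 0], hamming))    # 0.25
print(avg_distortion([1.0, 2.0], [1.5, 2.0], squared_error))  # 0.125
```

Any non-negative `d` plugs into the same average, which is why the theory treats the distortion measure as a free parameter.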

Definition:

Lossy Source Code

A $(2^{nR}, n)$ lossy source code consists of:

  • An encoder $f : \mathcal{X}^n \to \{1, 2, \ldots, 2^{nR}\}$ mapping source sequences to indices,
  • A decoder $g : \{1, 2, \ldots, 2^{nR}\} \to \hat{\mathcal{X}}^n$ mapping indices to reconstruction sequences.

The codebook is $\mathcal{C} = \{g(1), g(2), \ldots, g(2^{nR})\} \subset \hat{\mathcal{X}}^n$, the set of possible reconstructions.

The encoder picks the "closest" codeword to the source sequence (nearest-neighbor encoding), and the decoder outputs it. The rate $R$ determines the codebook size: higher rate means more codewords, better coverage of the source space, and lower distortion.
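Nearest-neighbor encoding can be sketched concretely. The following toy code draws a random codebook (as in the random-coding arguments of later sections) and encodes by minimum per-symbol squared error; the parameter values and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, R = 8, 0.5
M = int(2 ** (n * R))                    # |C| = 2^{nR} = 16 codewords

codebook = rng.standard_normal((M, n))   # C = {g(1), ..., g(M)}, drawn i.i.d.

def encode(x):
    """f: index of the codeword closest to x (nearest neighbor)."""
    dists = ((codebook - x) ** 2).mean(axis=1)  # per-symbol squared error
    return int(np.argmin(dists))

def decode(i):
    """g: output the i-th codeword as the reconstruction."""
    return codebook[i]

x = rng.standard_normal(n)
xhat = decode(encode(x))
print(((x - xhat) ** 2).mean())          # distortion achieved on this block
```

Note the roles: the encoder transmits only an index ($nR$ bits), and all reconstruction quality is baked into how well the codebook covers $\hat{\mathcal{X}}^n$.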

Definition:

Achievable Rate-Distortion Pair

A rate-distortion pair $(R, D)$ is achievable if there exists a sequence of $(2^{nR}, n)$ codes with $\lim_{n \to \infty} \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n d(X_i, \hat{X}_i)\right] \leq D$. The rate-distortion region is the closure of the set of all achievable pairs.

Example: Scalar Quantization as a Simple Lossy Code

A Gaussian source $X \sim \mathcal{N}(0, 1)$ is quantized to $M = 4$ levels using a uniform quantizer with step size $\Delta$. Compute the distortion as a function of $\Delta$ and find the optimal $\Delta$.

Scalar Quantization of a Gaussian Source

Visualize scalar quantization of a Gaussian source. Adjust the number of levels and observe the reconstruction quality, distortion, and rate. Compare uniform and Lloyd-Max quantizers.

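One way to work the example numerically is a Monte Carlo estimate of the mean squared error, with a grid search over the step size. This is a sketch under illustrative choices (sample size, grid range, and a mid-rise level placement at $\pm\Delta/2, \pm 3\Delta/2$ are assumptions, not part of the problem statement):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200_000)         # samples of X ~ N(0, 1)

def quantize(x, delta, M=4):
    """Uniform M-level mid-rise quantizer: levels at (k + 1/2) * delta."""
    k = np.clip(np.floor(x / delta), -M // 2, M // 2 - 1)
    return (k + 0.5) * delta

deltas = np.linspace(0.3, 2.0, 200)
mse = [np.mean((x - quantize(x, d)) ** 2) for d in deltas]
best = deltas[int(np.argmin(mse))]
print(f"optimal Delta ~ {best:.3f}, MSE ~ {min(mse):.4f}")
```

The search exposes the tradeoff in the distortion: small $\Delta$ shrinks granular error inside the quantizer's range but grows overload error in the Gaussian tails, so the optimum sits at an interior $\Delta$ close to 1 for $M = 4$.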

Common Mistake: Distortion Is Not Error Probability

Mistake:

Confusing the rate-distortion framework with the channel coding framework by treating distortion $D$ as an "error probability."

Correction:

In lossless coding, the goal is $P_e \to 0$. In lossy coding, the goal is $D \leq D_0$ for a target distortion $D_0 > 0$. The distortion can be any non-negative function, not just a 0/1 indicator. The rate-distortion function tells us the minimum rate to achieve distortion $D$, just as capacity tells us the maximum rate with $P_e \to 0$. The two are dual problems.

Distortion measure

A function $d(x, \hat{x})$ quantifying the cost of representing source symbol $x$ by reconstruction $\hat{x}$. Common measures: Hamming distance (discrete), squared error (continuous), perceptual metrics (images/audio).

Lossy source code

An encoder-decoder pair that maps source sequences to a codebook of reconstruction sequences, allowing controlled distortion. The rate $R = \frac{1}{n}\log_2 |\mathcal{C}|$, in bits per source symbol, fixes the codebook size $|\mathcal{C}| = 2^{nR}$.
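The rate-to-codebook-size relationship is a one-line computation; a minimal check (the values $n = 8$, $|\mathcal{C}| = 16$ are illustrative):

```python
import math

n, M = 8, 16                 # block length and codebook size |C|
R = math.log2(M) / n         # rate R = (1/n) log2 |C|
print(R)                     # 0.5 bits per source symbol
```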

Key Takeaway

Lossy compression allows controlled distortion to reduce the bit rate below the lossless limit. A distortion measure $d(x, \hat{x})$ quantifies reconstruction quality, and the rate-distortion pair $(R, D)$ specifies the tradeoff. Scalar quantization is the simplest lossy code but falls short of the information-theoretic limit; the rate-distortion function, derived in the next sections, gives the exact minimum rate.