The Gap to Capacity

What Keeps Us from Shannon's Limit?

Even with capacity-approaching codes (turbo, LDPC, polar), practical systems operate at a gap from the AWGN capacity. This gap has two distinct components:

  1. Coding gap (recovered by coding gain): the loss due to using a practical code and decoder rather than an ideal code with optimal (ML) decoding. Modern codes have reduced this to $< 0.5$ dB.
  2. Shaping gap (recovered by shaping gain): the loss due to using a uniform distribution over a discrete constellation (e.g., QAM) rather than the optimal Gaussian distribution. This accounts for up to 1.53 dB.

Understanding and closing these gaps is the bridge between information theory and system design.

Definition:

Shaping Gain and Shaping Gap

The shaping gap is the loss in power efficiency due to using a uniform distribution over a hypercubic (QAM) constellation rather than a Gaussian distribution. For a lattice code with normalized second moment (NSM) $G(\Lambda)$ and a cubic shaping region (NSM $G_{\text{cube}} = 1/12$):

$$\gamma_s = \frac{G_{\text{cube}}}{G_{\text{sphere}}} = \frac{1/12}{1/(2\pi e)} = \frac{\pi e}{6} \approx 1.53 \text{ dB},$$

where $G_{\text{sphere}} = 1/(2\pi e)$ is the NSM of a sphere (the optimal shaping region). The shaping gain is the reduction of this gap achieved by using a non-cubic (more spherical) shaping region.
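As a quick check of the two constants, the following worked step evaluates the cube NSM and the resulting ratio. It is a sketch that assumes the standard per-dimension definition of the NSM of a region $\mathcal{R} \subset \mathbb{R}^n$ with volume $V(\mathcal{R})$, which the text above does not spell out, and it takes the unit cube $[-1/2, 1/2]^n$ as the cubic region.

```latex
\begin{align*}
G(\mathcal{R}) &= \frac{1}{n}\,\frac{\int_{\mathcal{R}} \|x\|^2\,dx}{V(\mathcal{R})^{1+2/n}}, \\
G_{\text{cube}} &= \frac{1}{n}\cdot n \int_{-1/2}^{1/2} x^2\,dx = \frac{1}{12}, \\
\gamma_s &= \frac{1/12}{1/(2\pi e)} = \frac{2\pi e}{12} = \frac{\pi e}{6} \approx 1.4233, \\
10\log_{10}(1.4233) &\approx 1.53 \text{ dB}.
\end{align*}
```

Note that $\pi e/6 \approx 1.42$ is the linear power ratio; 1.53 dB is the same quantity on a logarithmic scale.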

Shaping gain

The power savings from using a signal constellation whose average power approaches that of a Gaussian distribution, compared to a uniform distribution over a hypercube. The maximum shaping gain is $\pi e / 6 \approx 1.53$ dB.

Related: Probabilistic shaping

Theorem: The 1.53 dB Shaping Gap

For any lattice code with a hypercubic shaping region, the gap to the Gaussian-input AWGN capacity (at high SNR) is at least

$$\gamma_s = 10\log_{10}\!\left(\frac{\pi e}{6}\right) \approx 1.53 \text{ dB}.$$

This gap can be closed by using a spherical or approximately spherical shaping region, or by probabilistic shaping.

A uniform distribution over a hypercube has higher average power than a Gaussian with the same differential entropy, because the corners of the hypercube "waste" power. The factor $\pi e/6$ is the ratio of the normalized second moment of a cube to that of a sphere with the same volume, in the limit of large dimension, where the uniform distribution over the sphere has Gaussian marginals.
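The approach to 1.53 dB with growing dimension can be checked numerically. The sketch below assumes the closed form for the NSM of an $n$-dimensional ball, $G_n = \Gamma(n/2+1)^{2/n} / ((n+2)\pi)$, which is not stated in the text; the function names `ball_nsm` and `shaping_gain_db` are illustrative.

```python
import math


def ball_nsm(n: int) -> float:
    """Normalized second moment of an n-dimensional ball (assumed closed form)."""
    # G_n = Gamma(n/2 + 1)^(2/n) / ((n + 2) * pi); lgamma avoids overflow for large n.
    return math.exp((2.0 / n) * math.lgamma(n / 2 + 1)) / ((n + 2) * math.pi)


def shaping_gain_db(n: int) -> float:
    """Shaping gain of an n-ball over an n-cube (NSM 1/12), in dB."""
    return 10 * math.log10((1 / 12) / ball_nsm(n))


# High-dimensional limit quoted in the theorem.
limit_db = 10 * math.log10(math.pi * math.e / 6)

for n in (1, 2, 4, 16, 64, 1024):
    print(f"n = {n:5d}: {shaping_gain_db(n):.3f} dB")
print(f"limit    : {limit_db:.3f} dB")
```

In one dimension the ball is an interval, so the gain is zero; the gain then grows slowly with $n$ and stays below the limit, which is why large shaping gains need high-dimensional shaping regions or probabilistic shaping.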

Definition:

Probabilistic Shaping

Probabilistic shaping (also called probabilistic amplitude shaping, PAS) uses a non-uniform distribution over constellation points to approximate the capacity-achieving Gaussian input distribution.

Instead of selecting each QAM symbol with equal probability $1/M$, a distribution matcher maps uniform information bits to constellation symbols with probabilities proportional to $e^{-\lambda |x|^2}$, so symbols near the origin (low energy) appear more frequently than symbols at the edges (high energy).

At the receiver, a distribution dematcher inverts the non-uniform mapping. The net effect is that the transmitted signal has a Gaussian-like amplitude distribution, closing the shaping gap.
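The target distribution described above can be tabulated directly. The following is a minimal sketch, assuming a square QAM constellation on the odd-integer grid (minimum distance 2); the helper names `pam_levels` and `maxwell_boltzmann_qam` and the value $\lambda = 0.05$ are illustrative choices, not taken from the text.

```python
import math


def pam_levels(m: int) -> list[int]:
    """Levels of m-PAM with minimum distance 2: -(m-1), ..., -1, 1, ..., m-1."""
    return list(range(-(m - 1), m, 2))


def maxwell_boltzmann_qam(m_qam: int, lam: float) -> dict[tuple[int, int], float]:
    """Probabilities proportional to exp(-lam * |x|^2) over square m_qam-QAM."""
    side = math.isqrt(m_qam)
    if side * side != m_qam:
        raise ValueError("this sketch handles square QAM only")
    levels = pam_levels(side)
    weights = {(i, q): math.exp(-lam * (i * i + q * q)) for i in levels for q in levels}
    total = sum(weights.values())
    return {point: w / total for point, w in weights.items()}


pmf = maxwell_boltzmann_qam(64, lam=0.05)
print(f"uniform probability : {1 / 64:.4f}")
print(f"innermost point     : {pmf[(1, 1)]:.4f}")
print(f"corner point        : {pmf[(7, 7)]:.6f}")
```

Setting `lam=0.0` recovers uniform signaling, and increasing it concentrates probability on the inner points at the cost of a lower entropy (fewer bits per symbol).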

Probabilistic shaping

A technique that uses non-uniform signaling (symbol probabilities proportional to $e^{-\lambda|x|^2}$) to approximate the Gaussian input distribution, closing up to 1.53 dB of shaping gap.

Related: Shaping gain

Reverse Concatenation: PAS Architecture

The standard probabilistic amplitude shaping (PAS) architecture introduced by Böcherer, Steiner, and Schulte (2015) uses a "reverse concatenation" of distribution matching and FEC coding:

  1. Distribution matcher (DM): Maps $k$ uniform bits to $n$ shaped amplitudes following a Maxwell-Boltzmann distribution.
  2. Systematic FEC encoder: Encodes the shaped bits (as systematic part) and adds parity bits (transmitted with uniform distribution on the sign bits).
  3. QAM mapper: Combines shaped amplitudes with uniform sign bits.

The key insight is that the FEC code operates on the binary labels, not the shaped distribution, so standard decoders (LDPC, polar) can be used without modification. The shaping gain is achieved independently of the coding gain.
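The data flow of the three steps can be illustrated with a toy model. This is a sketch under heavy assumptions, not a PAS implementation: the distribution matcher is replaced by i.i.d. sampling from the target distribution, the systematic FEC encoder by a dense random parity map of rate 1/2, and the constellation is 4-ASK. All names and parameter values are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 200                 # symbols per block (illustrative)
lam = 0.2               # Maxwell-Boltzmann parameter (illustrative)

# Step 1 (stand-in for the DM): draw 4-ASK amplitudes {1, 3} i.i.d. from a
# Maxwell-Boltzmann law. A real DM is an invertible block mapping.
w1, w3 = math.exp(-lam * 1), math.exp(-lam * 9)
p3 = w3 / (w1 + w3)
amp_bits = [1 if rng.random() < p3 else 0 for _ in range(n)]  # 0 -> |x|=1, 1 -> |x|=3

# Step 2 (stand-in for the systematic FEC encoder): each parity bit is the XOR
# of a random subset of the amplitude bits; the amplitude bits pass through.
parity_rows = [[rng.getrandbits(1) for _ in range(n)] for _ in range(n)]
parity_bits = [sum(r * b for r, b in zip(row, amp_bits)) % 2 for row in parity_rows]

# Step 3 (mapper): parity bits choose the sign, amplitude bits choose |x|.
symbols = [(1 - 2 * s) * (1 + 2 * b) for s, b in zip(parity_bits, amp_bits)]

frac_outer = sum(amp_bits) / n
frac_negative = sum(parity_bits) / n
avg_power = sum(x * x for x in symbols) / n
print(f"P(|x| = 3): target {p3:.3f}, empirical {frac_outer:.3f}")
print(f"fraction of negative signs: {frac_negative:.3f}")
print(f"average power: {avg_power:.2f} (uniform 4-ASK: 5.00)")
```

The point of the sketch is that the parity bits come out close to uniform even though the amplitude bits are biased, so placing them on the sign bits leaves the shaped amplitude distribution untouched.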

Constellation Shaping: Uniform vs. Shaped

Compare uniform and shaped 64-QAM constellations. The shaped version uses a Maxwell-Boltzmann distribution, placing higher probability on inner points. Observe the reduction in average power for the same minimum distance.


Example: Computing the Shaping Gain

For 256-QAM with a Maxwell-Boltzmann distribution $P(x) \propto e^{-\lambda |x|^2}$, compute the average power reduction compared to uniform signaling when $\lambda = 0.05$ and the minimum distance is fixed.
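One way to work this example numerically is sketched below. It assumes square 256-QAM on the odd-integer grid (minimum distance 2, so each dimension is 16-PAM with levels $\pm 1, \dots, \pm 15$) and uses the fact that the two-dimensional Maxwell-Boltzmann distribution factorizes per dimension. The equal-rate comparison at the end is an added assumption: it uses the continuous approximation $E = 2(2^R - 1)/3$ for a uniform square QAM carrying $R$ bits per symbol.

```python
import math

lam = 0.05
levels = range(-15, 16, 2)  # 16-PAM per dimension, minimum distance 2

# Per-dimension Maxwell-Boltzmann probabilities.
weights = [math.exp(-lam * a * a) for a in levels]
z = sum(weights)
probs = [w / z for w in weights]

# Average power of the two-dimensional constellation (two identical dimensions).
power_uniform = 2 * sum(a * a for a in levels) / 16
power_shaped = 2 * sum(p * a * a for p, a in zip(probs, levels))
power_reduction_db = 10 * math.log10(power_uniform / power_shaped)

# Shaping lowers the entropy, i.e. the rate carried per QAM symbol.
entropy_bits = -2 * sum(p * math.log2(p) for p in probs)

# Fair comparison (assumption): uniform square QAM at the same rate.
power_equal_rate = 2 * (2 ** entropy_bits - 1) / 3
shaping_gain_db = 10 * math.log10(power_equal_rate / power_shaped)

print(f"uniform power      : {power_uniform:.1f}")
print(f"shaped power       : {power_shaped:.1f}")
print(f"power reduction    : {power_reduction_db:.2f} dB")
print(f"entropy            : {entropy_bits:.2f} bits/symbol (uniform: 8)")
print(f"equal-rate gain    : {shaping_gain_db:.2f} dB")
```

With these assumptions the raw power reduction is roughly 9.3 dB, but most of it comes from the entropy dropping from 8 to about 5.4 bits per symbol. Against a uniform constellation at the same rate the gain is about 1.4 dB, which is consistent with the 1.53 dB ceiling.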

🔧 Engineering Note

Probabilistic Shaping in Modern Systems

Probabilistic shaping has been deployed in commercial optical fiber transceivers since approximately 2018, achieving gains of 0.5 to 1.0 dB in reach or throughput. The distribution matcher (typically CCDM, constant-composition distribution matching) adds minimal complexity and latency.

For wireless, probabilistic shaping is being considered for future releases of 5G NR and 6G, particularly for high-order modulations (256-QAM, 1024-QAM) where the shaping gain is most significant. The main challenge is the interaction between shaping and HARQ retransmissions, which requires careful design of the rate adaptation strategy.

Practical Constraints
  • CCDM adds ~0.05 bits/symbol of rate loss
  • Shaping gain increases with constellation size: negligible for QPSK, ~1 dB for 256-QAM
  • Interaction with HARQ requires careful rate matching design
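The CCDM rate loss in the first constraint can be illustrated by counting sequences. The sketch below assumes the usual definition: a constant-composition matcher with $n$ output symbols can address at most $\lfloor \log_2 N \rfloor$ input bits, where $N$ is the number of sequences with the chosen composition, and the rate loss is the gap between the entropy of the composition and that rate. The function name `ccdm_rate_loss` and the composition (40%, 30%, 20%, 10% over four amplitudes) are hypothetical.

```python
import math


def ccdm_rate_loss(composition: list[int]) -> float:
    """Rate loss in bits/symbol of constant-composition matching for a given type."""
    n = sum(composition)
    # Number of sequences with exactly this composition (multinomial coefficient).
    count = math.factorial(n)
    for c in composition:
        count //= math.factorial(c)
    k = count.bit_length() - 1  # floor(log2(count)) addressable input bits
    entropy = -sum(c / n * math.log2(c / n) for c in composition if c > 0)
    return entropy - k / n


for scale in (1, 10, 100):
    comp = [40 * scale, 30 * scale, 20 * scale, 10 * scale]
    print(f"n = {sum(comp):6d}: rate loss = {ccdm_rate_loss(comp):.4f} bits/symbol")
```

The loss shrinks as the block length grows, which is the trade-off behind the figure quoted above: longer blocks waste less rate but add latency and matcher complexity.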

Common Mistake: Shaping Gain and Coding Gain Are Separate

Mistake:

Thinking that a better code (more coding gain) automatically provides shaping gain, or that shaping somehow replaces coding.

Correction:

Coding gain and shaping gain address different parts of the gap to capacity. Coding gain comes from error correction (reducing the required SNR for a target error rate). Shaping gain comes from matching the input distribution to the capacity-achieving distribution (Gaussian). The two are additive in dB and can be achieved independently via the PAS architecture. A system needs both to fully approach capacity.

Quick Check

What is the maximum shaping gain achievable by changing the constellation distribution (keeping the same lattice)?

3 dB

$\pi e / 6 \approx 1.53$ dB

6 dB

It depends on the SNR