The Gap to Capacity: Coding and Shaping
Splitting the Gap Into Two Pieces
The 7-10 dB gap between an uncoded QAM constellation and the Shannon limit at bandwidth-limited rates turns out to have a rather clean decomposition. Part of the gap is a coding deficit: the constellation has too few points packed too loosely, and a code that places the transmitted vector farther from its nearest neighbors recovers that portion (this is what most of Book CM is concerned with). Another part is a shaping deficit: a uniform distribution over a square QAM boundary is Gaussian-mismatched, and a non-uniform input or a spherical boundary recovers this portion, up to an ultimate limit of 1.53 dB.
Now here is the key idea: these two gaps are essentially decoupled. A coded modulation scheme can recover coding gain without any shaping, and shaping can be layered on top of an otherwise uniform-input code without interfering with it. This is the shaping-coding decomposition, and it is what makes probabilistic shaping such a clean add-on to modern LDPC/BICM systems.
Definition: CM Capacity of a Uniform Input Constellation
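Let $\mathcal{X}$ be a finite signal-space constellation of $M$ points and let $P_X$ be the uniform distribution on $\mathcal{X}$. Define the CM capacity of $\mathcal{X}$ at a given SNR as
$$C_{\mathrm{CM}}(\mathcal{X}, \mathrm{SNR}) = I(X; Y),$$
where $X \sim P_X$ and $Y = X + Z$, with $Z$ Gaussian noise whose variance is set by the SNR. This is the mutual information between the uniformly distributed transmitted symbol and its AWGN output, evaluated at the given SNR.
As $\mathrm{SNR} \to \infty$, $C_{\mathrm{CM}} \to \log_2 M$; as $\mathrm{SNR} \to 0$, $C_{\mathrm{CM}}$ approaches the Shannon capacity. The CM capacity never exceeds the Shannon capacity $\log_2(1 + \mathrm{SNR})$; at moderate SNR, before saturation sets in, the gap between them measures the shaping loss of the constellation at that SNR.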
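The definition can be evaluated numerically. What follows is a minimal sketch, assuming a square QAM constellation (so that it splits into two independent PAM components) and assuming that SNR means $E_s/N_0$ on the complex channel, which equals the signal-to-noise ratio per real dimension. The function names `pam_cm_capacity` and `qam_cm_capacity`, the $\pm 8\sigma$ integration window, and the grid size are choices made for this illustration and do not come from the text.

```python
import math


def pam_cm_capacity(m: int, snr_db: float, grid: int = 801) -> float:
    """I(X;Y) in bits per real dimension for uniform M-PAM on real AWGN."""
    snr = 10.0 ** (snr_db / 10.0)
    # Equally spaced points, normalized to unit average energy.
    raw = [2 * i - (m - 1) for i in range(m)]
    scale = math.sqrt(sum(p * p for p in raw) / m)
    points = [p / scale for p in raw]
    sigma2 = 1.0 / snr          # noise variance per real dimension
    sigma = math.sqrt(sigma2)

    # Noise samples on [-8 sigma, 8 sigma] with normalized Gaussian weights.
    zs = [(-8.0 + 16.0 * k / (grid - 1)) * sigma for k in range(grid)]
    ws = [math.exp(-z * z / (2.0 * sigma2)) for z in zs]
    total_w = sum(ws)

    # loss = E_{x,z} log2 sum_{x'} p(y | x') / p(y | x), with y = x + z.
    loss = 0.0
    for x in points:
        for z, w in zip(zs, ws):
            s = sum(
                math.exp(-((x - xp + z) ** 2 - z * z) / (2.0 * sigma2))
                for xp in points
            )
            loss += w * math.log2(s)
    loss /= total_w * m
    return math.log2(m) - loss


def qam_cm_capacity(m_qam: int, snr_db: float) -> float:
    """CM capacity in bits per 2D of uniform square M-QAM (two PAM rails)."""
    side = math.isqrt(m_qam)
    if side * side != m_qam:
        raise ValueError("this sketch handles square QAM only")
    return 2.0 * pam_cm_capacity(side, snr_db)


if __name__ == "__main__":
    for snr_db in (0.0, 5.0, 10.0, 15.0, 20.0, 30.0):
        shannon = math.log2(1.0 + 10.0 ** (snr_db / 10.0))
        cm = qam_cm_capacity(16, snr_db)
        print(f"SNR {snr_db:5.1f} dB  Shannon {shannon:6.3f}  16-QAM CM {cm:6.3f}")
```

The printed table should show the two limiting behaviors described above: the 16-QAM value tracks the Shannon value at low SNR and flattens toward $\log_2 16 = 4$ bits/2D at high SNR.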
Theorem: The Ultimate Shaping Gain: 1.53 dB
Consider signaling on the real AWGN channel at a fixed spectral efficiency (in bits per 2D dimension). For any finite constellation inscribed in an $N$-dimensional cube (or any rectangular box) and any finite constellation inscribed in an $N$-dimensional ball (or any sphere-like region), the ratio of the minimum average energies needed to carry the same number of points satisfies
$$\frac{E_{\text{cube}}}{E_{\text{ball}}} \le \frac{\pi e}{6} \approx 1.423, \quad \text{with the bound approached as } N \to \infty,$$
or about 1.53 dB in decibels. This is the ultimate shaping gain: the asymptotic gap between the energy efficiency of a cubic (product) constellation and a sphere-bounded constellation.
A Gaussian input distribution is the unique maximizer of the mutual information on an AWGN channel under a second-moment constraint. A uniform distribution over a cube is i.i.d. uniform in each dimension, whereas a uniform distribution over a high-dimensional ball has nearly Gaussian low-dimensional marginals. At equal volume, and hence equal number of points, the cube therefore spends more average energy than the ball. The ratio of average energies at equal cardinality, in the limit of many dimensions, is the classic $\pi e/6$: the factor by which a uniform-in-a-cube distribution is beaten by a Gaussian.
For a uniform distribution on the cube $[-a, a]^N$, compute the per-dimension second moment $a^2/3$.
For a uniform distribution on a ball of radius $R$ in high dimensions, the per-dimension second moment is $R^2/(N+2)$.
Match cardinalities by equating volumes: cube volume $(2a)^N$ vs. ball volume $\pi^{N/2} R^N / \Gamma(N/2 + 1)$. Use Stirling's formula to extract the leading behavior.
Per-dimension second moment of cube vs. ball
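For a uniform distribution on the cube $[-a, a]^N$, the average energy per real dimension is $a^2/3$, so per 2D dimension it is $2a^2/3$. For a uniform distribution on an $N$-ball of radius $R$, the average energy per real dimension is $R^2/(N+2)$, and per 2D dimension is $2R^2/(N+2)$.
Equal cardinality (equal volume)
Asymptotically, we match constellation sizes by matching continuous-region volumes. The cube has volume $(2a)^N$. The $N$-ball of radius $R$ has volume $V_N R^N$ with $V_N = \pi^{N/2} / \Gamma(N/2 + 1)$. Equating and solving for $R^2$:
$$R^2 = \frac{4a^2}{\pi}\, \Gamma\!\left(\tfrac{N}{2} + 1\right)^{2/N}.$$
Take the ratio of second moments
The ratio of energies at equal cardinality is
$$\frac{E_{\text{cube}}}{E_{\text{ball}}} = \frac{a^2/3}{R^2/(N+2)} = \frac{(N+2)\, a^2}{3 R^2}.$$
Substitute $R^2$ and apply Stirling's formula, which gives $\Gamma(N/2 + 1)^{2/N} \approx N/(2e)$ for large $N$:
$$\frac{E_{\text{cube}}}{E_{\text{ball}}} = \frac{\pi (N+2)}{12\, \Gamma(N/2 + 1)^{2/N}} \approx \frac{\pi (N+2)}{12} \cdot \frac{2e}{N} \;\longrightarrow\; \frac{\pi e}{6}.$$
So the ratio tends to $\pi e/6$. The value is the same per real dimension and per 2D dimension: both the cube and the ball energies double when counted per 2D, so the two factors of 2 cancel. In decibels, $10 \log_{10}(\pi e/6) \approx 1.53$ dB.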
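The approach to the limit can be checked directly from the equal-volume energy ratio $\pi (N+2) / \bigl(12\, \Gamma(N/2+1)^{2/N}\bigr)$ of a uniform cube to a uniform ball. The sketch below evaluates it for a few dimensions; the function name `sphere_shaping_gain_db` is chosen for this illustration, and the continuous (volume-matching) approximation is assumed throughout.

```python
import math


def sphere_shaping_gain_db(n: int) -> float:
    """Energy ratio, in dB, of a uniform N-cube to the equal-volume N-ball."""
    # ratio = pi (N + 2) / (12 * Gamma(N/2 + 1)^(2/N)); lgamma avoids overflow.
    log_ratio = (
        math.log(math.pi * (n + 2) / 12.0)
        - (2.0 / n) * math.lgamma(n / 2.0 + 1.0)
    )
    return 10.0 * log_ratio / math.log(10.0)


# Asymptotic value 10 log10(pi e / 6).
ULTIMATE_DB = 10.0 * math.log10(math.pi * math.e / 6.0)

if __name__ == "__main__":
    for n in (1, 2, 4, 16, 24, 64, 1024):
        print(f"N = {n:5d}: {sphere_shaping_gain_db(n):.3f} dB")
    print(f"limit   : {ULTIMATE_DB:.3f} dB")
```

For $N = 1$ the ball is an interval and the gain is zero; for $N = 2$ the ratio is $\pi/3$, the familiar circle-versus-square saving of roughly 0.2 dB. The convergence is slow, which is one reason practical shaping schemes recover only part of the limit.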
Shannon Capacity vs. Uniform $M$-QAM CM Capacity
The solid line is Shannon capacity; the dashed line is the CM capacity of uniform $M$-QAM. At low SNR the two coincide; at high SNR the CM capacity saturates at $\log_2 M$. The horizontal gap (in SNR) at a fixed rate is the shaping loss; the vertical gap (in rate) at a fixed SNR is the "modulation" loss that a better code alone cannot recover.
Definition: Coding and Shaping Gains of a Scheme
Fix a target spectral efficiency and an AWGN channel. Write the total SNR gap from uncoded QAM to Shannon capacity, in dB, as
$$\text{total gap} = \text{coding gain} + \text{shaping gain} + \text{finite-blocklength / implementation gap}.$$
The three terms are:
- Coding gain. The difference between the operating SNR of uncoded QAM at the target error probability and the SNR where uniform-input CM capacity equals the target spectral efficiency. It is the portion of the gap recoverable without shaping.
- Shaping gain. The difference between the SNR where uniform-input CM capacity equals the target spectral efficiency and the SNR where Gaussian-input capacity equals it. Bounded by 1.53 dB as the spectral efficiency grows.
- Finite-blocklength / implementation gap. The residual loss at any finite codeword length and decoding complexity. Typically 0.3-1.0 dB for modern LDPC/polar codes at useful block lengths.
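The bookkeeping in this definition is plain dB arithmetic once three SNRs are known. The sketch below assumes SNR means $E_s/N_0$, so that the Shannon limit at a rate of $r$ bits/2D sits at $2^r - 1$. The function names are made up for this illustration, and the two input SNRs in the usage lines are hypothetical placeholders, not values from the text; the finite-blocklength residual depends on the specific code and is left for the caller to add on top.

```python
import math


def shannon_snr_db(rate_bits_per_2d: float) -> float:
    """SNR (Es/N0, in dB) at which log2(1 + SNR) equals the target rate."""
    return 10.0 * math.log10(2.0 ** rate_bits_per_2d - 1.0)


def split_gap(uncoded_snr_db: float,
              cm_capacity_snr_db: float,
              rate_bits_per_2d: float) -> tuple[float, float, float]:
    """Return (coding_gain_db, shaping_gain_db, total_gap_db)."""
    shannon_db = shannon_snr_db(rate_bits_per_2d)
    coding = uncoded_snr_db - cm_capacity_snr_db   # uncoded QAM -> CM capacity
    shaping = cm_capacity_snr_db - shannon_db      # CM capacity -> Shannon
    return coding, shaping, uncoded_snr_db - shannon_db


# HYPOTHETICAL placeholder inputs, for illustration only.
coding, shaping, total = split_gap(
    uncoded_snr_db=19.0, cm_capacity_snr_db=12.5, rate_bits_per_2d=4.0
)

if __name__ == "__main__":
    print(f"Shannon limit at 4 bits/2D: {shannon_snr_db(4.0):.2f} dB")
    print(f"coding {coding:.2f} dB + shaping {shaping:.2f} dB = total {total:.2f} dB")
```

By construction the first two terms always sum to the uncoded-to-Shannon gap; a practical system then gives back part of the coding gain as the finite-blocklength residual.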
Decomposition of the Capacity Gap at a Fixed Rate
A horizontal bar labels the total SNR gap from uncoded QAM to Shannon capacity at the chosen , split into coding gain (large), shaping gain (bounded by 1.53 dB), and the implementation residual. Increase the coding gain slider to see the uncoded-to-CM-capacity gap shrink; increase the shaping gain slider to see the CM-to-Gaussian gap shrink.
Example: Accounting for the 7 dB Gap at 4 bits/2D
Uncoded 16-QAM at bits/2D requires about dB of for , while the Shannon limit at the same is dB. The total gap is dB. Allocate this gap into coding gain, shaping gain, and finite-blocklength loss for a realistic modern system (say, a rate- LDPC code on 256-QAM with probabilistic amplitude shaping).
Compute the uniform-256-QAM CM-capacity SNR at rate $4$
The CM-capacity curve of uniform 256-QAM saturates at bits/2D. At rate it is numerically found to equal at dB. The gap from there to Shannon ( dB) is small: shaping gain dB at is all that a uniform 256-QAM input is short of capacity. (The dB limit is asymptotic; at the uniform QAM is already quite close to Gaussian in every direction.)
Coding gain
The "ideal" coded-modulation scheme with uniform 256-QAM input and infinite blocklength would operate at dB for . Uncoded 16-QAM operates at dB at . The gap is coding gain dB, the majority of the total.
Finite-blocklength residual
A well-designed rate- LDPC code at block length sits about dB above CM capacity at this rate, giving a realistic operating point of dB. Layering on probabilistic shaping recovers most of the dB shaping gain, landing the system at dB, about dB above Shannon, a textbook modern-system value.
Summary
At the gap breaks up roughly as: coding gain dB (recovered by LDPC + dense 256-QAM), shaping gain dB (recovered by probabilistic shaping), finite-blocklength residual dB (unrecoverable without more complexity).
Why Shaping Gain Is So Small at Moderate Spectral Efficiency
The 1.53 dB ultimate shaping gain is asymptotic in the spectral efficiency; at finite spectral efficiency it is smaller. For 2 bits/2D (uncoded QPSK), shaping gain is essentially zero, because QPSK has no room to be shaped: each point is already at the constellation boundary. For (16-QAM or denser), shaping gain is about dB; for , about dB; only as the constellation grows large and the inscribed sphere approaches the Gaussian-typicality ball does the gain approach 1.53 dB. This is why probabilistic shaping became compelling only with modern high-order QAM (256-QAM and beyond).
Probabilistic Shaping in Modern Standards
Probabilistic amplitude shaping (PAS), due to Böcherer, Steiner, and Schulte, is the modern practical realization of the shaping gain. Instead of uniformly distributing QAM symbols, a distribution matcher placed before the LDPC encoder produces QAM amplitudes according to a Maxwell-Boltzmann distribution; the encoder then encodes systematically and leaves the amplitude distribution approximately unchanged. DVB-S2X, optical coherent systems (ITU-T G.709), and some 3GPP study items have adopted or considered PAS variants. The engineering point is that shaping composes cleanly with binary coding on top of BICM; see Chapter 19 for the full treatment.
- PAS operates on the amplitude bits of a QAM signal, leaving the sign bits uniform
- Shaping blocklength must be chosen jointly with the binary code rate to match a target rate
- Adaptive shaping requires a feedback path to convey the chosen distribution; in 5G NR this is not yet standardized
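To make the Maxwell-Boltzmann amplitude distribution concrete, the sketch below shapes the four amplitudes of 8-PAM, $\{1, 3, 5, 7\}$, with $P(a) \propto e^{-\nu a^2}$ and solves for the $\nu$ that gives a target amplitude entropy. The 8-PAM alphabet, the 1-bit target, and all function names are assumptions chosen for illustration; they are not taken from any standard or from the PAS paper. The comparison point is uniform 4-PAM (amplitudes $\{1, 3\}$, mean energy 5), which carries the same 1 amplitude bit plus 1 sign bit at the same minimum distance.

```python
import math

AMPLITUDES = (1, 3, 5, 7)  # 8-PAM amplitudes; the sign bit stays uniform


def maxwell_boltzmann(nu: float, amplitudes=AMPLITUDES) -> list[float]:
    """P(a) proportional to exp(-nu * a^2) on the given amplitudes."""
    weights = [math.exp(-nu * a * a) for a in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(probs) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def mean_energy(probs, amplitudes=AMPLITUDES) -> float:
    return sum(p * a * a for p, a in zip(probs, amplitudes))


def solve_nu(target_entropy_bits: float, lo: float = 0.0, hi: float = 5.0,
             iters: int = 80) -> float:
    """Bisection on nu; the entropy decreases monotonically for nu >= 0."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy_bits(maxwell_boltzmann(mid)) > target_entropy_bits:
            lo = mid   # still too close to uniform: shape harder
        else:
            hi = mid
    return 0.5 * (lo + hi)


nu = solve_nu(1.0)                 # 1 amplitude bit + 1 sign bit per real dim
shaped = maxwell_boltzmann(nu)
UNIFORM_4PAM_ENERGY = (1 + 9) / 2  # amplitudes {1, 3}, equiprobable

if __name__ == "__main__":
    saving_db = 10.0 * math.log10(UNIFORM_4PAM_ENERGY / mean_energy(shaped))
    print("P(a):", [round(p, 4) for p in shaped])
    print(f"mean energy {mean_energy(shaped):.3f} vs {UNIFORM_4PAM_ENERGY:.3f}")
    print(f"energy saving at equal entropy: {saving_db:.2f} dB")
```

The printed saving is an energy comparison at equal entropy and equal minimum distance. It is not the same quantity as the gain in achievable rate over a noisy channel, which would have to be evaluated through the mutual information of the shaped input.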
Common Mistake: The shaping gap lives at the input distribution, not at the code
Mistake:
Assuming that a stronger binary code will close the last dB to Shannon capacity.
Correction:
The binary code controls only the coding gain; it cannot change the input distribution of the QAM symbols it is mapped to. If the downstream QAM input is uniform, the maximum achievable mutual information is exactly the uniform-input CM capacity, which is (asymptotically) 1.53 dB below Shannon. Closing this gap requires explicitly shaping the symbol distribution; the code itself cannot do it.
Key Takeaway
Gap = coding gain + shaping gain + finite-blocklength residual. Coding gain is the biggest piece (5-8 dB) and is what most of coded-modulation theory targets; shaping gain is capped at 1.53 dB and is recovered by non-uniform input distributions; the finite-blocklength residual is the unavoidable cost of finite complexity. Design accordingly.
Shaping gain
The SNR advantage obtainable by using a non-uniform (Gaussian-like) distribution on the signal constellation instead of a uniform one, with ultimate asymptotic value 1.53 dB.
Related: Coding Gain, Capacity Gap, Probabilistic Shaping
CM capacity
The mutual information $I(X; Y)$ for a uniform distribution on a finite signal-space constellation over the AWGN channel. It saturates at $\log_2 M$ for an $M$-point constellation at high SNR and equals Shannon capacity at low SNR; at intermediate SNR the gap to Shannon is the shaping loss.
Related: Shaping gain, Mutual Information
Coding Gain vs. Shaping Gain
| Aspect | Coding Gain | Shaping Gain |
|---|---|---|
| What is being changed | The set of transmitted code points (geometry) | The probability distribution over the set |
| Target | Increase minimum Euclidean distance (or distance spectrum) | Match input distribution to Gaussian (maximize differential entropy at fixed average energy) |
| Typical magnitude | 5-8 dB recoverable in bandwidth-limited regime | Bounded by 1.53 dB |
| Example technique | Ungerboeck TCM, LDPC + QAM, turbo + QAM | Probabilistic amplitude shaping, Voronoi shaping, shell mapping |
| Dependence on spectral efficiency | Approximately constant across spectral efficiencies | Zero at low spectral efficiency, approaches 1.53 dB as spectral efficiency grows |
| Does it expand bandwidth? | No (coded modulation keeps the bandwidth fixed) | No |
Quick Check
An engineer claims their new coding scheme closes the gap to Shannon capacity at bits/2D to 0 dB, using a standard 1024-QAM constellation with equal a-priori symbol probabilities. Is this plausible?
Yes, if the code is powerful enough.
No, because with a uniform-probability QAM input, the CM capacity is bounded strictly below Shannon by the shaping loss, up to 1.53 dB at large spectral efficiency.
Yes, because at bit/2D the shaping loss vanishes.
No, because uncoded 1024-QAM is intrinsically too far from capacity.
The shaping loss is a property of the input distribution, not the code. A uniform QAM input has CM capacity strictly below Shannon; closing this gap requires non-uniform input probabilities (probabilistic shaping) or a non-rectangular constellation boundary (Voronoi shaping). No binary code can eliminate it.
Historical Note: Forney-Trott-Chung and the Dichotomy of Coding vs. Shaping
1989-2000. The clean decomposition of the capacity gap into coding and shaping gains crystallized in a series of papers by Forney, Trott, Chung, and collaborators in the 1990s. The insight, originally hidden behind the lattice-coset framework of coset codes and Voronoi constellations, is that the two gains address independent features of the transmitted signal: coding shapes the set of points, shaping shapes their distribution. The Forney-Trott-Chung 2000 paper on sphere-bound-achieving coset codes gave the definitive statement, showing that multilevel coset codes with Voronoi shaping can in principle achieve capacity on the AWGN channel.
Why This Matters: Why 5G NR Does Not (Yet) Include Shaping, but 6G Might
In 5G NR Rel-15/16/17, the uplink and downlink use uniform QAM constellations with LDPC codes over BICM. The shaping gap is left on the table because the standardization, buffer management, and rate-matching complexity of probabilistic shaping did not fit the 5G timeline. For 6G (and for coherent optical links, where the business case is clearer), probabilistic shaping is actively under consideration, and pre-standard implementations in DVB-S2X already demonstrate the practical gain. The takeaway: the shaping-coding decomposition we present here is not just a theoretical curiosity; it maps onto a real engineering roadmap.