CCDM and Distribution Matching
From Theory to Practice: Distribution Matching
The point is that Maxwell-Boltzmann shaping (§1) tells us what distribution to target, and Probabilistic Amplitude Shaping (§2) tells us how to compose shaping with FEC — but neither tells us how to TRANSFORM a uniform bit stream into a stream of MB-shaped amplitudes. That is the job of a distribution matcher. The CCDM (constant-composition distribution matcher) is the dominant practical choice, achieving the MB rate asymptotically.
Definition: Distribution Matcher (DM)
Distribution Matcher (DM)
A distribution matcher is an invertible mapping from a uniform binary stream to an amplitude sequence such that the empirical distribution of approximates a target distribution . The DM is INVERTIBLE — the inverse is computed at the receiver after FEC decoding to recover the original bits.
Definition: Constant-Composition Distribution Matcher (CCDM)
Constant-Composition Distribution Matcher (CCDM)
A constant-composition DM (CCDM) selects output sequences of length with the same empirical histogram — the "composition". Given a target distribution on an -ary alphabet, the CCDM chooses counts with and assigns arithmetic-coding rank to each of the legal sequences.
Theorem: CCDM Rate Approaches MB Rate
For a target distribution with entropy , the CCDM of length achieves rate The rate loss is , vanishing as .
Count of legal sequences
The multinomial coefficient counts sequences with the target composition. By Stirling's approximation, .
Rate extraction
Dividing by and taking the floor: .
Convergence
as . The rate loss decays as — a small fraction of a bit per symbol at ( bits/symbol for 16-PAM).
CCDM Output vs Target MB Distribution
Compare the target Maxwell-Boltzmann distribution (green bars) with the actual CCDM output distribution at a chosen block length (blue bars). As grows, the two match more closely — the KL divergence decays as .
Parameters
CCDM Arithmetic Encoding
Complexity: time, memory.The inverse at the receiver reconstructs from by accumulating at each step. Then in binary.
Example: CCDM for 16-QAM at
For 16-QAM per-axis amplitudes (i.e., ), and target with bits/symbol, find the CCDM rate for . Compare with the entropy.
Target counts
, summing to 200.
Multinomial count
sequences.
Rate
bits/symbol.
Rate loss
Entropy - rate = bits/symbol — about 8% loss. At , the loss drops to ~1%.
Common Mistake: CCDM Short-Block Rate Loss Is Real
Mistake:
"CCDM achieves the entropy exactly for any block length."
Correction:
The rate loss is . At the loss can be 5-10% of the target entropy — a meaningful slice of the shaping gain. For URLLC or short-packet applications, this loss limits the achievable shaping gain. Alternatives: shell mapping (Imai-Hirasawa) and hierarchical DM (Amjad-Böcherer 2013) that reduce the short- block loss at increased complexity.
Historical Note: A Line of Distribution-Matching Algorithms
1977–2016Distribution matching has a 40-year history:
- Imai-Hirasawa 1977: shell mapping (sphere-packing lower bound).
- Laroia-Farvardin-Tretter 1994: trellis shaping (convolutional approach).
- Böcherer 2013 (thesis): CCDM as the practical modern algorithm.
- Amjad-Böcherer 2013: hierarchical DM for short blocks.
- Schulte-Böcherer 2016: formal CCDM rate-loss analysis. Today, CCDM is the dominant choice in production optical and 5G research. Hierarchical DMs are used for very short blocks.
Key Takeaway
CCDM is the practical algorithm that realises MB shaping. It converts a uniform bit stream into an amplitude sequence of target composition with rate . The short-block rate loss limits the achievable shaping gain for URLLC; hierarchical DM offers a more complex but tighter alternative.