The Multilevel Coding Framework

A Different Way to Marry Code and Modulation

Ungerboeck's TCM builds a single trellis code that walks through a partitioned constellation and uses the expanded Euclidean distance of the partition to pay for the rate expansion. That is one answer to the coded-modulation question. Here is a different one, proposed a full five years earlier by Imai and Hirasawa: instead of one code over the whole constellation, use one binary code per level of the partition tree and combine them by the partition labelling.

The appeal of this idea is modularity. Each level sees a binary channel whose noise level depends on the intra-subset Euclidean distance at that level. Since those distances grow very rapidly as we descend the partition chain (by factors of 2 in squared distance per level for the Ungerboeck chains), the levels are very asymmetric: level 0 is a noisy BPSK-like channel, while the bottom level is essentially noise-free. A low-rate code on the noisy level and a nearly full-rate code on the clean level put the coding effort exactly where it is needed.

The construction is called multilevel coding (MLC). Its natural receiver is multistage decoding (MSD), which we treat in the next section but one. The two combine into a framework that we will see is, from an information-theoretic point of view, as good as the constellation itself — no coded-modulation scheme can do better.


Definition: Partition Chain of a Constellation

A partition chain of a constellation $\mathcal{X} \subset \mathbb{R}^N$ with $|\mathcal{X}| = M = 2^L$ is a sequence of refinements

$$\mathcal{X} \;=\; \mathcal{A}_0 \;\supset\; \mathcal{A}_1 \;\supset\; \cdots \;\supset\; \mathcal{A}_L \;=\; \{\text{single point}\},$$

where each $\mathcal{A}_{i+1}$ is obtained from $\mathcal{A}_i$ by partitioning each coset of $\mathcal{A}_i$ into two cosets of $\mathcal{A}_{i+1}$. Equivalently, each refinement halves the number of points per coset while (typically) doubling the squared intra-subset distance. Write $\mathcal{A}_i^{(b_0, \ldots, b_{i-1})}$ for the coset of $\mathcal{A}_i$ labelled by the history $(b_0, \ldots, b_{i-1}) \in \{0,1\}^i$.

For Ungerboeck's 8-PSK chain the intra-subset squared distances are $d_0^2 = (2\sin(\pi/8))^2 \approx 0.586$, $d_1^2 = 2$, and $d_2^2 = 4$: a factor of roughly $3.4$ between levels 0 and 1, and a factor of $2$ between levels 1 and 2. That is the asymmetry MLC exploits.
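
The level-0 value is just the chord length between adjacent points of the unit circle; as a quick check,

$$d_0^2 \;=\; \bigl|e^{j\pi/4} - 1\bigr|^2 \;=\; 2 - 2\cos(\pi/4) \;=\; 2 - \sqrt{2} \;\approx\; 0.586.$$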


Definition: Multilevel Code (MLC) Encoder

Fix a partition chain of $\mathcal{X}$ of length $L = \log_2 M$. A multilevel code is specified by $L$ binary component codes $\mathcal{C}_0, \mathcal{C}_1, \ldots, \mathcal{C}_{L-1}$ (one per level) of rates $R_0, R_1, \ldots, R_{L-1}$ and a labelling map $\mu : \{0,1\}^L \to \mathcal{X}$ that sends the binary label $(b_0, \ldots, b_{L-1})$ to the unique constellation point in $\mathcal{A}_L^{(b_0, \ldots, b_{L-1})}$.

Encoding rule. Over a block of $n$ channel uses, $nR_i$ information bits enter level $i$ and are encoded by $\mathcal{C}_i$ into a binary sequence $b_i^{(1)}, \ldots, b_i^{(n)}$ of length $n$; the $L$ coded bit streams are then combined column-by-column by $\mu$:

$$x^{(t)} \;=\; \mu\bigl(b_0^{(t)}, b_1^{(t)}, \ldots, b_{L-1}^{(t)}\bigr), \qquad t = 1, \ldots, n.$$

The aggregate rate is $R = \sum_{i=0}^{L-1} R_i$ bits per channel use.

The only coupling between levels is through $\mu$. Inside a level, the code $\mathcal{C}_i$ is ordinary binary coding: convolutional, LDPC, polar, whatever is available. This is what makes MLC modular.
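
To make the encoding rule concrete, here is a minimal Python sketch for 8-PSK ($L = 3$). The component codes are placeholders of our own choosing, a rate-1/2 repetition code at level 0 and uncoded streams at levels 1 and 2; any binary encoders with a common output block length $n$ could be substituted.

```python
import numpy as np

# Toy MLC encoder for 8-PSK (L = 3). The component "codes" are placeholders:
# a rate-1/2 repetition code on the noisy level 0, uncoded (rate-1) streams
# on the cleaner levels 1 and 2.

def repetition_encode(bits):
    """Rate-1/2 repetition code: transmit every information bit twice."""
    return np.repeat(bits, 2)

def mlc_encode(u0, u1, u2):
    """Encode three information streams and map column-by-column to 8-PSK.

    The label (b0, b1, b2) walks the partition tree: b0 picks the QPSK
    subset, b1 the BPSK pair, b2 the point, so m = b0 + 2*b1 + 4*b2.
    """
    b0 = repetition_encode(u0)           # level 0: rate 1/2
    b1, b2 = u1, u2                      # levels 1 and 2: uncoded
    assert len(b0) == len(b1) == len(b2)
    m = b0 + 2 * b1 + 4 * b2             # column-by-column labelling map mu
    return np.exp(1j * np.pi * m / 4)    # x_m = e^{j*pi*m/4}

rng = np.random.default_rng(0)
n = 8                                    # channel uses per block
u0 = rng.integers(0, 2, n // 2)          # n*R0 = n/2 information bits
u1 = rng.integers(0, 2, n)               # n*R1 = n
u2 = rng.integers(0, 2, n)               # n*R2 = n
x = mlc_encode(u0, u1, u2)               # aggregate rate R = 1/2 + 1 + 1 = 2.5
print(np.round(x, 3))
```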


MLC Encoder Block Diagram

[Figure] The MLC encoder. Each of the $L$ levels has its own binary encoder; the outputs are combined column-by-column by the partition-based labelling map $\mu$. The channel sees a sequence of $M$-ary constellation points.

Binary Sub-Channel Capacities at Each Level

Each level of the partition chain defines a binary sub-channel. Its capacity depends on the intra-subset minimum distance at that level and on the operating SNR. The plot shows $C_i$ for $i = 0, 1, \ldots, L-1$ at a chosen SNR, for three standard constellations. Notice how much higher the deeper-level capacities are: the bottom level is typically near $1$ bit while the top level is often well below that.

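A plot of this kind can be reproduced numerically. The sketch below estimates $C_i = I(B_i; Y \mid B_0, \ldots, B_{i-1})$ for 8-PSK on the complex AWGN channel by Monte Carlo, assuming uniform inputs, unit $E_s$, and the natural labelling $m = b_0 + 2b_1 + 4b_2$; the function name and sample size are our own choices.

```python
import numpy as np

# Monte Carlo estimate of the conditional level capacities
# C_i = I(B_i; Y | B_0..B_{i-1}) for 8-PSK over complex AWGN.

def level_capacities(snr_db, num_samples=200_000, seed=1):
    rng = np.random.default_rng(seed)
    N0 = 10 ** (-snr_db / 10)                     # noise variance for Es = 1
    pts = np.exp(1j * np.pi * np.arange(8) / 4)   # x_m = e^{j*pi*m/4}
    labels = np.array([[(m >> i) & 1 for i in range(3)] for m in range(8)])

    m = rng.integers(0, 8, num_samples)
    noise = np.sqrt(N0 / 2) * (rng.standard_normal(num_samples)
                               + 1j * rng.standard_normal(num_samples))
    y = pts[m] + noise

    # Likelihoods p(y | x_k) up to a common factor, shape (num_samples, 8).
    lik = np.exp(-np.abs(y[:, None] - pts[None, :]) ** 2 / N0)

    # agree[t, k, i] is True iff bit i of point k matches the transmitted bit.
    agree = labels[m][:, None, :] == labels[None, :, :]

    caps = []
    for i in range(3):
        den_mask = agree[:, :, :i].all(axis=2)      # match on b_0..b_{i-1}
        num_mask = agree[:, :, :i + 1].all(axis=2)  # ...and on b_i as well
        p_num = (lik * num_mask).sum(axis=1) / num_mask.sum(axis=1)
        p_den = (lik * den_mask).sum(axis=1) / den_mask.sum(axis=1)
        caps.append(float(np.mean(np.log2(p_num / p_den))))
    return caps

print(level_capacities(10.0))  # [C_0, C_1, C_2]; by the chain rule they sum to I(X;Y)
```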

Definition: Partition-Based (Ungerboeck) Labelling

A labelling $\mu : \{0,1\}^L \to \mathcal{X}$ is partition-based (or Ungerboeck) if for every prefix $(b_0, \ldots, b_{i-1})$, the preimage

$$\mu^{-1}\bigl(\mathcal{A}_i^{(b_0, \ldots, b_{i-1})}\bigr) \;=\; \{(b_0, \ldots, b_{i-1})\} \times \{0,1\}^{L-i},$$

that is, the first $i$ label bits exactly index the level-$i$ coset. This is the labelling obtained by walking the set-partitioning tree from root to leaf and reading off the binary branch chosen at each node. It is not Gray labelling: labels differing only in the last bit map to points at the large distance $d_{L-1}$, while Euclidean nearest neighbours can have labels differing in several positions.

Partition-based labelling is the natural choice for MLC because it makes the first $i$ decoded bits pick out the level-$i$ coset, which is exactly what a genie-aided decoder of level $i$ needs. Gray labelling, by contrast, is the natural choice for BICM; we will revisit this in s04.
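
The contrast is easy to see in a few lines of Python; the binary-reflected Gray code is our choice of comparison labelling:

```python
# Partition-based vs Gray labels for the 8-PSK point index m.
# Partition label: bits read down the tree, m = b0 + 2*b1 + 4*b2.
# Gray label: binary-reflected Gray code m ^ (m >> 1).
for m in range(8):
    part = tuple((m >> i) & 1 for i in range(3))   # (b0, b1, b2)
    gray = m ^ (m >> 1)
    print(m, part, format(gray, "03b"))
# Gray: neighbouring points m and m+1 always differ in exactly one bit.
# Partition: neighbours can differ in every bit (e.g. m = 3 -> (1,1,0),
# m = 4 -> (0,0,1)), but fixing a prefix (b0, ..., b_{i-1}) always
# selects a level-i coset.
```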


Theorem: Coset Decoding from the First $i$ Bits

Under partition-based labelling, knowledge of the bits $(b_0, \ldots, b_{i-1})$ uniquely determines the level-$i$ coset $\mathcal{A}_i^{(b_0, \ldots, b_{i-1})}$, and the remaining label bits $(b_i, \ldots, b_{L-1})$ index the $2^{L-i}$ points inside that coset. Consequently, after the first $i$ levels have been decoded correctly, the problem at level $i$ is a binary detection problem over a signal set with intra-set minimum distance $d_i$ (the level-$i$ partition distance).

The partition tree is a binary tree of depth $L$; the first $i$ decoded bits tell us which sub-tree of depth $L-i$ the transmitted point lies in. Inside that sub-tree, the nearest-neighbour spacing is the level-$i$ distance $d_i$, which for $i \ge 1$ is strictly larger than $d_0 = d_{\min}$.

Example: Partition-Based Labelling for 8-PSK

Consider 8-PSK with points $x_m = e^{j \pi m / 4}$, $m = 0, 1, \ldots, 7$. Construct the Ungerboeck partition chain $\mathcal{A}_0 \supset \mathcal{A}_1 \supset \mathcal{A}_2 \supset \mathcal{A}_3$ and write out the partition-based label for each point. Compute the intra-subset squared distance at each level.
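
One way to check the numbers, under the natural labelling $m = b_0 + 2b_1 + 4b_2$ used in these examples:

```python
import numpy as np

# Check the 8-PSK example: enumerate the cosets at each level of the chain
# and compute the intra-subset minimum squared Euclidean distance.
pts = [np.exp(1j * np.pi * m / 4) for m in range(8)]   # x_m = e^{j*pi*m/4}

def min_sq_dist(subset):
    """Minimum squared distance within a set of at least two points."""
    return min(abs(a - b) ** 2 for k, a in enumerate(subset)
               for b in subset[k + 1:])

for i in range(3):
    # The level-i cosets collect the points sharing a label prefix
    # (b0, ..., b_{i-1}), i.e. sharing the residue m mod 2^i.
    cosets = [[pts[m] for m in range(8) if m % 2**i == r]
              for r in range(2**i)]
    print(f"level {i}: d^2 =", round(min(min_sq_dist(c) for c in cosets), 3))
# Prints 0.586, 2.0, 4.0, matching d_0^2, d_1^2, d_2^2 above.
```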

TCM and MLC — Two Answers to the Same Question

It is worth pausing to compare TCM (Chapter 2) and MLC (this chapter) side by side. Both schemes use the same partition chain and the same signal set. The difference is the code:

  • TCM uses a single convolutional (or trellis) code over the levels that need protection (typically the lowest $k$ levels), and the remaining levels are uncoded. The code lives in the super-alphabet of subsets, not on any individual binary channel.
  • MLC uses separate binary codes, one per level. Each code operates over its own binary channel at its own rate.

Both schemes realise the coding-gain criterion $d_{\min} \to d_{\rm free}$ by trading rate at lower levels for distance. The difference in the receiver is equally sharp: TCM uses a single Viterbi decoder over the joint trellis; MLC uses $L$ separate decoders, one per level, in sequence. This is multistage decoding, to which we turn in s03.


Common Mistake: The levels are independent at the encoder, not at the decoder

Mistake:

Asserting that MLC is "$L$ independent binary codes running in parallel" and therefore that its performance is determined by the worst binary code.

Correction:

The encoders are indeed independent: each level sees a separate binary code. But the decoders are not independent: multistage decoding (which achieves the capacity rule) decodes the levels sequentially, and each stage conditions on the previously decoded bits. If the decoding were done independently level by level, the receiver would see the unconditional mutual information $I(Y; B_i)$ at each level, and these sum to exactly the BICM capacity, which is strictly less than the CM capacity in general. The conditioning is precisely what closes the CM vs BICM gap.
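
The chain rule makes this precise. Writing $B_{<i} = (B_0, \ldots, B_{i-1})$ and using the independence of the label bits,

$$I(X; Y) \;=\; \sum_{i=0}^{L-1} I\bigl(B_i; Y \mid B_{<i}\bigr), \qquad I\bigl(B_i; Y \mid B_{<i}\bigr) - I\bigl(B_i; Y\bigr) \;=\; I\bigl(B_i; B_{<i} \mid Y\bigr) \;\ge\; 0,$$

so the conditional (multistage) rates sum to the full CM mutual information, while the unconditional (parallel) rates fall short by exactly the information the bits share through $Y$.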

Multilevel coding (MLC)

A coded-modulation construction in which $L = \log_2 M$ binary codes, one per level of a partition chain of the constellation, are combined via a partition-based labelling map. Due to Imai and Hirasawa (1977).

Related: Partition Chain of a Constellation, MSD Achieves the CM Capacity, The Capacity Rule for MLC

Partition chain

A nested sequence of refinements of a constellation in which each step halves the number of points per coset while increasing the intra-subset minimum distance. The Ungerboeck chain is the canonical example.

Related: Multilevel Code (MLC) Encoder, Set Partitioning, TCM and MLC — Two Answers to the Same Question

Quick Check

For the Ungerboeck partition chain of 8-PSK at unit $E_s$, what is the ratio $d_2^2 / d_0^2$ between the level-2 and level-0 intra-subset squared distances?

  • about 2
  • about 3.4
  • about 6.8
  • exactly 4

Key Takeaway

MLC is a modular coded-modulation construction. Each of the $L = \log_2 M$ levels of a partition chain gets its own binary code. The encoder combines the $L$ code streams column-by-column via a partition-based labelling map, producing a stream of constellation symbols. The construction is flexible (any binary codes can be plugged in) and makes the asymmetry of the binary sub-channels (noisy at the top, near-noise-free at the bottom) available for exploitation by the rate-allocation rule of the next section.