The Multilevel Coding Framework

A Different Way to Marry Code and Modulation

Ungerboeck's TCM builds a single trellis code that walks through a partitioned constellation and uses the expanded Euclidean distance of the partition to pay for the rate expansion. That is one answer to the coded-modulation question. Here is a different one, proposed a full five years earlier by Imai and Hirasawa: instead of one code over the whole constellation, use one binary code per level of the partition tree and combine them by the partition labelling.

The appeal of this idea is modularity. Each level sees a binary channel whose noise level depends on the intra-subset Euclidean distance at that level. Since those distances grow very rapidly as we descend the partition chain (by factors of 2 in squared distance per level for the Ungerboeck chains), the levels are very asymmetric: level 0 is a noisy BPSK-like channel, while the bottom level is essentially noise-free. A low-rate code on the noisy level and a nearly full-rate code on the clean level put the coding effort exactly where it is needed.

The construction is called multilevel coding (MLC). Its natural receiver is multistage decoding (MSD), which we treat in the next section but one. The two combine into a framework that we will see is, from an information-theoretic point of view, as good as the constellation itself — no coded-modulation scheme can do better.


Definition: Partition Chain of a Constellation

A partition chain of a constellation $\mathcal{X} \subset \mathbb{R}^N$ with $|\mathcal{X}| = M = 2^L$ is a sequence of refinements

$$\mathcal{X} \;=\; \mathcal{A}_0 \;\supset\; \mathcal{A}_1 \;\supset\; \cdots \;\supset\; \mathcal{A}_L \;=\; \{\text{single point}\},$$

where each $\mathcal{A}_{i+1}$ is obtained from $\mathcal{A}_i$ by partitioning each coset of $\mathcal{A}_i$ into two cosets of $\mathcal{A}_{i+1}$. Equivalently, each refinement halves the number of points per coset while (typically) doubling the squared intra-subset distance. Write $\mathcal{A}_i^{(b_0, \ldots, b_{i-1})}$ for the coset of $\mathcal{A}_i$ labelled by the history $(b_0, \ldots, b_{i-1}) \in \{0,1\}^i$.

For Ungerboeck's 8-PSK chain the intra-subset squared distances are $d_0^2 = (2\sin(\pi/8))^2 \approx 0.586$, $d_1^2 = 2$, and $d_2^2 = 4$: a factor of roughly $3.4$ between levels 0 and 1, and a factor of $2$ between levels 1 and 2. That is the asymmetry MLC exploits.
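
The level-0 value is just the chord length between adjacent points of the unit circle; as a quick check,

$$d_0^2 \;=\; \bigl|e^{j\pi/4} - 1\bigr|^2 \;=\; 2 - 2\cos(\pi/4) \;=\; 2 - \sqrt{2} \;\approx\; 0.586.$$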


Definition: Multilevel Code (MLC) Encoder

Fix a partition chain of $\mathcal{X}$ of length $L = \log_2 M$. A multilevel code is specified by $L$ binary component codes $\mathcal{C}_0, \mathcal{C}_1, \ldots, \mathcal{C}_{L-1}$ (one per level) of rates $R_0, R_1, \ldots, R_{L-1}$ and a labelling map $\mu : \{0,1\}^L \to \mathcal{X}$ that sends the binary label $(b_0, \ldots, b_{L-1})$ to the unique constellation point in $\mathcal{A}_L^{(b_0, \ldots, b_{L-1})}$.

Encoding rule. Over a block of $n$ channel uses, $nR_i$ information bits enter level $i$ and are encoded by $\mathcal{C}_i$ into a binary sequence $b_i^{(1)}, \ldots, b_i^{(n)}$ of length $n$; the $L$ coded bit streams are then combined column-by-column by $\mu$:

$$x^{(t)} \;=\; \mu\bigl(b_0^{(t)}, b_1^{(t)}, \ldots, b_{L-1}^{(t)}\bigr), \qquad t = 1, \ldots, n.$$

The aggregate rate is $R = \sum_{i=0}^{L-1} R_i$ bits per channel use.

The only coupling between levels is through $\mu$. Inside a level, the code $\mathcal{C}_i$ is ordinary binary coding: convolutional, LDPC, polar, whatever is available. This is what makes MLC modular.
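
To make the encoding rule concrete, here is a minimal Python sketch for 8-PSK ($L = 3$). The component codes are placeholders of our own choosing, a rate-1/2 repetition code at level 0 and uncoded streams at levels 1 and 2; any binary encoders with a common output block length $n$ could be substituted.

```python
import numpy as np

# Toy MLC encoder for 8-PSK (L = 3). The component "codes" are placeholders:
# a rate-1/2 repetition code on the noisy level 0, uncoded (rate-1) streams
# on the cleaner levels 1 and 2.

def repetition_encode(bits):
    """Rate-1/2 repetition code: transmit every information bit twice."""
    return np.repeat(bits, 2)

def mlc_encode(u0, u1, u2):
    """Encode three information streams and map column-by-column to 8-PSK.

    The label (b0, b1, b2) walks the partition tree: b0 picks the QPSK
    subset, b1 the BPSK pair, b2 the point, so m = b0 + 2*b1 + 4*b2.
    """
    b0 = repetition_encode(u0)           # level 0: rate 1/2
    b1, b2 = u1, u2                      # levels 1 and 2: uncoded
    assert len(b0) == len(b1) == len(b2)
    m = b0 + 2 * b1 + 4 * b2             # column-by-column labelling map mu
    return np.exp(1j * np.pi * m / 4)    # x_m = e^{j*pi*m/4}

rng = np.random.default_rng(0)
n = 8                                    # channel uses per block
u0 = rng.integers(0, 2, n // 2)          # n*R0 = n/2 information bits
u1 = rng.integers(0, 2, n)               # n*R1 = n
u2 = rng.integers(0, 2, n)               # n*R2 = n
x = mlc_encode(u0, u1, u2)               # aggregate rate R = 1/2 + 1 + 1 = 2.5
print(np.round(x, 3))
```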


MLC Encoder Block Diagram

[Figure] The MLC encoder. Each of the $L$ levels has its own binary encoder; the outputs are combined column-by-column by the partition-based labelling map $\mu$. The channel sees a sequence of $M$-ary constellation points.

Binary Sub-Channel Capacities at Each Level

Each level of the partition chain defines a binary sub-channel. Its capacity depends on the intra-subset minimum distance at that level and on the operating SNR. The plot shows $C_i$ for $i = 0, 1, \ldots, L-1$ at a chosen SNR, for three standard constellations. Notice how much higher the deeper-level capacities are: the bottom level is typically near $1$ bit while the top level is often well below that.

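A plot of this kind can be reproduced numerically. The sketch below estimates $C_i = I(B_i; Y \mid B_0, \ldots, B_{i-1})$ for 8-PSK on the complex AWGN channel by Monte Carlo, assuming uniform inputs, unit $E_s$, and the natural labelling $m = b_0 + 2b_1 + 4b_2$; the function name and sample size are our own choices.

```python
import numpy as np

# Monte Carlo estimate of the conditional level capacities
# C_i = I(B_i; Y | B_0..B_{i-1}) for 8-PSK over complex AWGN.

def level_capacities(snr_db, num_samples=200_000, seed=1):
    rng = np.random.default_rng(seed)
    N0 = 10 ** (-snr_db / 10)                     # noise variance for Es = 1
    pts = np.exp(1j * np.pi * np.arange(8) / 4)   # x_m = e^{j*pi*m/4}
    labels = np.array([[(m >> i) & 1 for i in range(3)] for m in range(8)])

    m = rng.integers(0, 8, num_samples)
    noise = np.sqrt(N0 / 2) * (rng.standard_normal(num_samples)
                               + 1j * rng.standard_normal(num_samples))
    y = pts[m] + noise

    # Likelihoods p(y | x_k) up to a common factor, shape (num_samples, 8).
    lik = np.exp(-np.abs(y[:, None] - pts[None, :]) ** 2 / N0)

    # agree[t, k, i] is True iff bit i of point k matches the transmitted bit.
    agree = labels[m][:, None, :] == labels[None, :, :]

    caps = []
    for i in range(3):
        den_mask = agree[:, :, :i].all(axis=2)      # match on b_0..b_{i-1}
        num_mask = agree[:, :, :i + 1].all(axis=2)  # ...and on b_i as well
        p_num = (lik * num_mask).sum(axis=1) / num_mask.sum(axis=1)
        p_den = (lik * den_mask).sum(axis=1) / den_mask.sum(axis=1)
        caps.append(float(np.mean(np.log2(p_num / p_den))))
    return caps

print(level_capacities(10.0))  # [C_0, C_1, C_2]; by the chain rule they sum to I(X;Y)
```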

Definition: Partition-Based (Ungerboeck) Labelling

A labelling $\mu : \{0,1\}^L \to \mathcal{X}$ is partition-based (or Ungerboeck) if for every prefix $(b_0, \ldots, b_{i-1})$, the preimage

$$\mu^{-1}\bigl(\mathcal{A}_i^{(b_0, \ldots, b_{i-1})}\bigr) \;=\; \{(b_0, \ldots, b_{i-1})\} \times \{0,1\}^{L-i},$$

that is, the first $i$ label bits exactly index the level-$i$ coset. This is the labelling obtained by walking the set-partitioning tree from root to leaf and reading off the binary branch chosen at each node. It is not Gray labelling: labels differing only in the last bit map to points at the large distance $d_{L-1}$, while Euclidean nearest neighbours can have labels differing in several positions.

Partition-based labelling is the natural choice for MLC because it makes the first $i$ decoded bits pick out the level-$i$ coset, which is exactly what a genie-aided decoder of level $i$ needs. Gray labelling, by contrast, is the natural choice for BICM; we will revisit this in s04.
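
The contrast is easy to see in a few lines of Python; the binary-reflected Gray code is our choice of comparison labelling:

```python
# Partition-based vs Gray labels for the 8-PSK point index m.
# Partition label: bits read down the tree, m = b0 + 2*b1 + 4*b2.
# Gray label: binary-reflected Gray code m ^ (m >> 1).
for m in range(8):
    part = tuple((m >> i) & 1 for i in range(3))   # (b0, b1, b2)
    gray = m ^ (m >> 1)
    print(m, part, format(gray, "03b"))
# Gray: neighbouring points m and m+1 always differ in exactly one bit.
# Partition: neighbours can differ in every bit (e.g. m = 3 -> (1,1,0),
# m = 4 -> (0,0,1)), but fixing a prefix (b0, ..., b_{i-1}) always
# selects a level-i coset.
```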


Theorem: Coset Decoding from the First $i$ Bits

Under partition-based labelling, knowledge of the bits $(b_0, \ldots, b_{i-1})$ uniquely determines the level-$i$ coset $\mathcal{A}_i^{(b_0, \ldots, b_{i-1})}$, and the remaining label bits $(b_i, \ldots, b_{L-1})$ index the $2^{L-i}$ points inside that coset. Consequently, after the first $i$ levels have been decoded correctly, the problem at level $i$ is a binary detection problem over a signal set with intra-set minimum distance $d_i$ (the level-$i$ partition distance).

The partition tree is a binary tree of depth $L$; the first $i$ decoded bits tell us which sub-tree of depth $L-i$ the transmitted point lies in. Inside that sub-tree, the nearest-neighbour spacing is the level-$i$ distance $d_i$, which for $i \ge 1$ is strictly larger than $d_0 = d_{\min}$.

Example: Partition-Based Labelling for 8-PSK

Consider 8-PSK with points $x_m = e^{j \pi m / 4}$, $m = 0, 1, \ldots, 7$. Construct the Ungerboeck partition chain $\mathcal{A}_0 \supset \mathcal{A}_1 \supset \mathcal{A}_2 \supset \mathcal{A}_3$ and write out the partition-based label for each point. Compute the intra-subset squared distance at each level.
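
One way to check the numbers, under the natural labelling $m = b_0 + 2b_1 + 4b_2$ used in these examples:

```python
import numpy as np

# Check the 8-PSK example: enumerate the cosets at each level of the chain
# and compute the intra-subset minimum squared Euclidean distance.
pts = [np.exp(1j * np.pi * m / 4) for m in range(8)]   # x_m = e^{j*pi*m/4}

def min_sq_dist(subset):
    """Minimum squared distance within a set of at least two points."""
    return min(abs(a - b) ** 2 for k, a in enumerate(subset)
               for b in subset[k + 1:])

for i in range(3):
    # The level-i cosets collect the points sharing a label prefix
    # (b0, ..., b_{i-1}), i.e. sharing the residue m mod 2^i.
    cosets = [[pts[m] for m in range(8) if m % 2**i == r]
              for r in range(2**i)]
    print(f"level {i}: d^2 =", round(min(min_sq_dist(c) for c in cosets), 3))
# Prints 0.586, 2.0, 4.0, matching d_0^2, d_1^2, d_2^2 above.
```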

TCM and MLC — Two Answers to the Same Question

It is worth pausing to compare TCM (Chapter 2) and MLC (this chapter) side by side. Both schemes use the same partition chain and the same signal set. The difference is the code:

  • TCM uses a single convolutional (or trellis) code over the levels that need protection (typically the lowest $k$ levels), and the remaining levels are uncoded. The code lives in the super-alphabet of subsets, not on any individual binary channel.
  • MLC uses separate binary codes, one per level. Each code operates over its own binary channel at its own rate.

Both schemes realise the coding-gain criterion $d_{\min} \to d_{\rm free}$ by trading rate at lower levels for distance. The difference in the receiver is equally sharp: TCM uses a single Viterbi decoder over the joint trellis; MLC uses $L$ separate decoders, one per level, in sequence. This is multistage decoding, to which we turn in s03.


Common Mistake: The levels are independent at the encoder, not at the decoder

Mistake:

Asserting that MLC is "$L$ independent binary codes running in parallel" and therefore that its performance is determined by the worst binary code.

Correction:

The encoders are indeed independent: each level sees a separate binary code. But the decoders are not independent: multistage decoding (which achieves the capacity rule) decodes the levels sequentially, and each stage conditions on the previously decoded bits. If the decoding were done independently level by level, the receiver would see the unconditional mutual information $I(Y; B_i)$ at each level, and these sum to exactly the BICM capacity, which is strictly less than the CM capacity in general. The conditioning is precisely what closes the CM vs BICM gap.
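
The chain rule makes this precise. Writing $B_{<i} = (B_0, \ldots, B_{i-1})$ and using the independence of the label bits,

$$I(X; Y) \;=\; \sum_{i=0}^{L-1} I\bigl(B_i; Y \mid B_{<i}\bigr), \qquad I\bigl(B_i; Y \mid B_{<i}\bigr) - I\bigl(B_i; Y\bigr) \;=\; I\bigl(B_i; B_{<i} \mid Y\bigr) \;\ge\; 0,$$

so the conditional (multistage) rates sum to the full CM mutual information, while the unconditional (parallel) rates fall short by exactly the information the bits share through $Y$.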

Multilevel coding (MLC)

A coded-modulation construction in which $L = \log_2 M$ binary codes, one per level of a partition chain of the constellation, are combined via a partition-based labelling map. Due to Imai and Hirasawa (1977).

Related: Partition Chain of a Constellation, MSD Achieves the CM Capacity, The Capacity Rule for MLC

Partition chain

A nested sequence of refinements of a constellation in which each step halves the number of points per coset while increasing the intra-subset minimum distance. The Ungerboeck chain is the canonical example.

Related: Multilevel Code (MLC) Encoder, Set Partitioning, TCM and MLC — Two Answers to the Same Question

Quick Check

For the Ungerboeck partition chain of 8-PSK at unit $E_s$, what is the ratio $d_2^2 / d_0^2$ between the level-2 and level-0 intra-subset squared distances?

  • about 2
  • about 3.4
  • about 6.8
  • exactly 4

Key Takeaway

MLC is a modular coded-modulation construction. Each of the $L = \log_2 M$ levels of a partition chain gets its own binary code. The encoder combines the $L$ code streams column-by-column via a partition-based labelling map, producing a stream of constellation symbols. The construction is flexible (any binary codes can be plugged in) and makes the asymmetry of the binary sub-channels (noisy at the top, near-noise-free at the bottom) available for exploitation by the rate-allocation rule of the next section.