Mapping by Set Partitioning and Distance Maximization

From Partition to Mapping: Turning Geometry into Bits

In §Set Partitioning of Signal Constellations we saw that an Ungerboeck partition of a $2^m$-point constellation produces $m+1$ levels of cosets, each level at least doubling the intra-subset MSED. That is a property of the geometry. To build a coded-modulation scheme we need a mapping from binary sequences to constellation points — a rule that tells the encoder which point to transmit given the output of the convolutional code.
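The doubling property can be checked numerically. The sketch below is not from the text: it assumes the standard unit-radius 8-PSK constellation with natural indexing, where a level-$i$ coset keeps every $2^i$-th point.

```python
import cmath

def msed(points):
    """Minimum squared Euclidean distance within a set of points."""
    return min(abs(a - b) ** 2
               for i, a in enumerate(points) for b in points[i + 1:])

# Unit-radius 8-PSK: index k -> exp(j 2 pi k / 8).
psk8 = [cmath.exp(2j * cmath.pi * k / 8) for k in range(8)]

# One representative coset per level: indices divisible by 2^i.
levels = [[p for k, p in enumerate(psk8) if k % (2 ** i) == 0]
          for i in range(3)]
mseds = [msed(s) for s in levels]   # levels 0, 1, 2
```

The three values come out as roughly $0.586$, $2$, and $4$: each partition step at least doubles the intra-subset MSED.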

The punchline is this: label each constellation point with a binary address $b_{m-1} \cdots b_0$ such that the $i$ low-order bits identify the level-$i$ coset. A convolutional code acting on the low-order bits (the coset-selecting bits) then protects exactly the bits whose single-symbol distances are small, forcing competing coset sequences to differ over many symbols, while flipping a high-order bit moves us between points of the same coset at a high partition level, where the intra-subset MSED is already large. The high-order bits (uncoded) therefore need no protection: they simply pick a specific point inside the coset.

This is the entire recipe: partition → label → separate coded from uncoded bits → push the code to the bits that select the smallest coset. The rest of the chapter spells out the consequences.

Definition:

Trellis-Coded Modulation (TCM)

A trellis-coded modulation (TCM) scheme transmits $m$ information bits per channel use from a constellation $\mathcal{X}$ of size $2^{m+1}$ (one redundancy bit per symbol) according to the following encoder:

  1. Of the $m$ information bits per symbol, $\tilde{m} \leq m$ are fed into a rate-$\tilde{m}/(\tilde{m}+1)$ convolutional encoder whose $(\tilde{m}+1)$-bit output selects a level-$(\tilde{m}+1)$ coset of an Ungerboeck partition.
  2. The remaining $m - \tilde{m}$ uncoded information bits select one of the $2^{m-\tilde{m}}$ points inside the chosen coset.

The free Euclidean distance of the TCM scheme is the minimum Euclidean distance between any two distinct code sequences it can produce.

The spectral efficiency is $m$ bits/symbol — the same as an uncoded $2^m$-point constellation. The constellation has been doubled to $2^{m+1}$ points, but no additional bandwidth is used. This is the "coding gain without bandwidth expansion" feature Ungerboeck emphasized.
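The encoder structure in the definition can be sketched in a few lines. This is a hedged illustration for $m = 2$, $\tilde{m} = 1$ on 8-PSK: the feedforward rate-$1/2$ generators $(7,5)_{\rm oct}$ and the natural labeling below are placeholder choices to show the data flow, not one of Ungerboeck's tabulated codes.

```python
import cmath

def conv_rate_half(bits, g=(0b111, 0b101)):
    """Feedforward rate-1/2 encoder: each input bit yields 2 output bits."""
    state = 0
    out = []
    for b in bits:
        state = ((state << 1) | b) & 0b111   # current bit + 2 memory cells
        out.append(tuple(bin(state & gi).count("1") % 2 for gi in g))
    return out

def tcm_encode(info_pairs):
    """Each symbol: (uncoded bit u, coded bit c) -> one 8-PSK point."""
    coded = conv_rate_half([c for _, c in info_pairs])
    symbols = []
    for (u, _), (c1, c0) in zip(info_pairs, coded):
        k = (u << 2) | (c1 << 1) | c0        # address b2 b1 b0
        symbols.append(cmath.exp(2j * cmath.pi * k / 8))
    return symbols

syms = tcm_encode([(1, 0), (0, 1), (1, 1)])
```

Each symbol consumes one uncoded bit $u$ (address bit $b_2$, selecting the point inside the coset) and one coded bit whose two-bit encoder output $(c_1, c_0)$ selects the level-2 coset.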


Definition:

Parallel Transitions

In the TCM trellis, two edges between the same pair of states that differ only in the uncoded bits are called parallel transitions. They correspond to distinct constellation points lying in the same level-$(\tilde{m}+1)$ coset.

The contribution of parallel transitions to the free squared Euclidean distance is at least $\Delta_{\tilde{m}+1}^2$, because two parallel-transition sequences differ in exactly one symbol and the two competing symbols lie in the same coset at level $\tilde{m}+1$.

Parallel transitions exist only when $\tilde{m} < m$, i.e., when at least one information bit is left uncoded. They can dominate the free distance when the convolutional code is strong enough that non-parallel error events are farther away — more on this in s03.


Definition:

Subset Decoding (Within-Coset Nearest-Point Rule)

Given a received sample $y \in \mathbb{C}$ and a coset $D$, the subset decoder outputs

$$\hat{x}_D = \operatorname*{arg\,min}_{x \in D} \|y - x\|^2.$$

In TCM, subset decoding is performed as a preprocessing step for every coset at every time step: the Viterbi algorithm then operates on a trellis whose branch metric at each transition is the squared Euclidean distance to the subset-decoded point of the corresponding coset.

Subset decoding decouples the uncoded-bit decision from the trellis search, reducing the effective branch-metric computation to "find the nearest point in the coset" ($2^{m-\tilde{m}}$ candidates) followed by Viterbi over the much smaller coset trellis.
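A minimal sketch of the within-coset nearest-point rule, assuming unit-radius 8-PSK with the level-2 cosets $\{k : k \bmod 4 = c\}$ of the natural labeling used in this section's examples:

```python
import cmath

psk8 = [cmath.exp(2j * cmath.pi * k / 8) for k in range(8)]
cosets = {c: [psk8[k] for k in range(8) if k % 4 == c] for c in range(4)}

def subset_decode(y, coset):
    """Return the nearest point in the coset and the branch metric ||y-x||^2."""
    x_hat = min(coset, key=lambda x: abs(y - x) ** 2)
    return x_hat, abs(y - x_hat) ** 2

# Preprocessing for one received sample: one branch metric per coset,
# handed to the Viterbi search over the coset trellis.
y = 0.9 + 0.1j
metrics = {c: subset_decode(y, D) for c, D in cosets.items()}
```

For this sample, coset 0 (the antipodal pair $\{1, -1\}$) decodes to the point $1$ with metric $0.02$.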


Definition:

Free Euclidean Distance

The free (squared) Euclidean distance $d_{\rm free}^2$ of a TCM scheme is the minimum squared Euclidean distance between any two distinct infinite code sequences $\mathbf{x} = (x_0, x_1, x_2, \ldots)$ and $\mathbf{x}' = (x_0', x_1', x_2', \ldots)$ produced by the encoder:

$$d_{\rm free}^2 \;\triangleq\; \min_{\mathbf{x} \neq \mathbf{x}'} \sum_{k \geq 0} \|x_k - x_k'\|^2.$$

Since the TCM trellis is linear in the state transitions (not in the symbol values), $d_{\rm free}^2$ can be computed by examining all finite-length error events — paths that diverge from a common state at time $k_1$ and first remerge at time $k_2 > k_1$.

The free distance plays the role the minimum Hamming distance $d_{\rm free}^{(H)}$ plays in classical convolutional codes — it governs the asymptotic error performance on AWGN via the union bound. The error-event probability at high SNR is approximately $P_e \approx N_{\rm free}\, Q\!\left(\sqrt{d_{\rm free}^2 / (2N_0)}\right)$, where $N_{\rm free}$ is the number of error events achieving $d_{\rm free}^2$ (the "error coefficient").
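The union-bound estimate is easy to evaluate. A sketch, with $Q(x) = \tfrac{1}{2}\operatorname{erfc}(x/\sqrt{2})$; the values $d_{\rm free}^2 = 4$, $N_{\rm free} = 1$, and the noise level $N_0 = 0.25$ are illustrative assumptions:

```python
from math import erfc, sqrt

def Q(x):
    """Gaussian tail function, Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * erfc(x / sqrt(2))

def union_bound_pe(d_free_sq, n_free, n0):
    """High-SNR estimate Pe ~ N_free * Q(sqrt(d_free^2 / (2 N0)))."""
    return n_free * Q(sqrt(d_free_sq / (2 * n0)))

pe = union_bound_pe(d_free_sq=4.0, n_free=1, n0=0.25)
```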


Parallel Transitions

Two or more trellis edges connecting the same pair of states. In TCM they are labeled with distinct points from the same level-$(\tilde{m}+1)$ coset, and the distance between parallel edges equals the intra-subset MSED at that level. Parallel transitions bound the free distance from above: $d_{\rm free}^2 \leq \Delta_{\tilde{m}+1}^2$.

Related: Parallel Transitions, Free Euclidean Distance

Subset Decoding

The preprocessing step in TCM where, for each coset and each received sample, the nearest point within that coset is found and its squared distance to the received sample is used as the branch metric in the Viterbi decoder. Reduces decoding complexity from $|\mathcal{X}|$ to $2^{\tilde{m}+1}$ branches per time step, times the in-coset search cost.

Related: Subset Decoding (Within-Coset Nearest-Point Rule), Parallel Transitions

Free (Euclidean) Distance

The minimum squared Euclidean distance between any two distinct code sequences in a TCM scheme. Analogue of the free Hamming distance for binary convolutional codes. On AWGN, $d_{\rm free}^2$ determines the asymptotic coding gain via $\gamma_c = d_{\rm free}^2 / d_{\rm uncoded}^2$.

Related: Free Euclidean Distance
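The coding-gain formula is a one-liner in dB. The numbers below are the 4-state 8-PSK case quoted elsewhere in this section: $d_{\rm free}^2 = 4$ against uncoded QPSK, whose unit-energy minimum distance is $d^2 = 2$ at the same 2 bit/symbol efficiency.

```python
from math import log10

def coding_gain_db(d_free_sq, d_uncoded_sq):
    """Asymptotic coding gain gamma_c = d_free^2 / d_uncoded^2, in dB."""
    return 10 * log10(d_free_sq / d_uncoded_sq)

gain = coding_gain_db(4.0, 2.0)   # about 3 dB, matching the BER plot above
```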

Theorem: Lower Bound on Free Euclidean Distance via Partition Levels

Consider a TCM scheme built from a rate-$\tilde{m}/(\tilde{m}+1)$ convolutional code driving an Ungerboeck partition of a $2^{m+1}$-point constellation. Let $\Delta_i^2$ denote the intra-subset MSED at level $i$ of the partition and let $d_{\rm free}^{(H)}$ be the minimum Hamming distance of the convolutional code when its output labels are read as binary addresses of the $2^{\tilde{m}+1}$ level-$(\tilde{m}+1)$ cosets. Then

$$d_{\rm free}^2 \;\geq\; \min\!\left\{\, \Delta_{\tilde{m}+1}^2,\;\; \Delta_1^2 \cdot d_{\rm free}^{(H)} \,\right\}.$$

The first term bounds error events that traverse parallel transitions (single-symbol errors within a coset); the second bounds error events that stay on distinct coset sequences for at least $d_{\rm free}^{(H)}$ time steps.

Two error mechanisms compete. Parallel transitions give an error event at squared distance $\Delta_{\tilde{m}+1}^2$. Coset-sequence errors accumulate Euclidean distance along each time step where the two coset sequences differ — each such time step contributes at least $\Delta_1^2$ (the intra-subset MSED of the level-1 cosets from which diverging and remerging branches draw their points, per Ungerboeck's design rules), and the two sequences differ in at least $d_{\rm free}^{(H)}$ time steps by the Hamming-distance property of the convolutional code. The free Euclidean distance is then the minimum over the two mechanisms. A good Ungerboeck design balances them: pick $\tilde{m}$ and the convolutional code so that $\Delta_{\tilde{m}+1}^2 \approx \Delta_1^2 \cdot d_{\rm free}^{(H)}$.


BER of Ungerboeck 8-PSK TCM vs. Uncoded QPSK

Bit-error rate of Ungerboeck's 8-PSK TCM family at various numbers of trellis states ($N_s \in \{4, 8, 16, 32, 64\}$) compared to uncoded QPSK (same 2 bit/symbol spectral efficiency) on AWGN. Observe that adding states (longer memory) improves coding gain, but with diminishing returns: 4-state gives ~3 dB, 8-state ~3.6 dB, 16-state ~4.1 dB, 64-state ~4.6 dB. The gap to the ultimate coding-gain ceiling ($\sim 6$ dB, set by the partition geometry) is never closed by trellis memory alone.


Example: Free Euclidean Distance of the 4-State 8-PSK TCM

Consider Ungerboeck's canonical 4-state 8-PSK TCM: $m = 2$ information bits per symbol, $\tilde{m} = 1$ coded bit, rate-$1/2$ convolutional code with generator polynomials $(g_0, g_1) = (5, 2)_{\rm oct}$ (in systematic feedback form — the encoder has 2 memory cells, so $\nu = 2$ and $N_s = 4$). The code output selects a level-2 coset of the 8-PSK partition (4 cosets of 2 antipodal points each, with $\Delta_1^2 = 2$ at level 1). The 1 uncoded bit selects a point inside the coset, at squared distance $\Delta_2^2 = 4$ from its coset partner.

Compute a lower bound on $d_{\rm free}^2$ using the theorem Lower Bound on Free Euclidean Distance via Partition Levels, then state the actual value from Ungerboeck's Table I (1982).
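The computation can be sketched numerically. The value $d_{\rm free}^{(H)} = 3$ for the coset-label sequences is an assumption here, consistent with the quick check later in this section; the two MSEDs follow from unit-radius 8-PSK geometry.

```python
# Evaluate both terms of the lower bound for the 4-state 8-PSK scheme.
level1_msed = 2.0    # intra-subset MSED of the two QPSK-shaped level-1 cosets
pair_msed = 4.0      # intra-subset MSED of the four antipodal level-2 pairs
d_free_hamming = 3   # assumed min. number of differing coset positions

parallel_term = pair_msed                      # single-symbol, within-coset
sequence_term = level1_msed * d_free_hamming   # long error event
bound = min(parallel_term, sequence_term)
```

This gives $\min\{4, 6\} = 4$; Ungerboeck's Table I indeed lists $d_{\rm free}^2 = 4$ for this scheme (3 dB over uncoded QPSK), with the parallel-transition term as the active minimum.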


BER of Rate-$1/2$ Convolutional Codes over BPSK

Bit-error rate (coded) of rate-$1/2$ binary convolutional codes of constraint length $K \in \{3, 4, 5, 6, 7\}$ over BPSK/AWGN. This is the workhorse building block of TCM: the convolutional code drives the coset-selection bits. Notice how quickly adding constraint length pays off — adding one memory cell roughly doubles the decoder complexity but gains $\sim 0.3$ dB.

This plot is the binary precursor to the TCM BER plot in this section: combining one of these convolutional codes with an Ungerboeck 8-PSK partition yields the corresponding row of Ungerboeck's Table I.


Common Mistake: Hamming Distance and Euclidean Distance Are Not the Same Currency

Mistake:

Confusing the free Hamming distance $d_{\rm free}^{(H)}$ of the underlying convolutional code with the free Euclidean distance $d_{\rm free}^2$ of the TCM scheme.

Correction:

  • dfree(H)d_{\rm free}^{(H)} measures how many bit positions two distinct coset sequences must differ in β€” a property of the convolutional code alone.
  • dfree2d_{\rm free}^2 measures squared Euclidean distance in the signal space β€” a property of the convolutional code and the partition together.

These two scale differently with code memory: doubling the states of a binary convolutional code roughly adds 11 to dfree(H)d_{\rm free}^{(H)}; doubling the states of a TCM code typically adds Ξ”12\Delta_1^2 to dfree2d_{\rm free}^2 (about 3 dB worth), but only until parallel transitions become the bottleneck. Once they do, increasing state count buys nothing until m~\tilde{m} is also increased.

Common Mistake: Mixing Up Coded and Uncoded Bits in the TCM Encoder

Mistake:

Believing that all $m$ information bits must go through the convolutional encoder.

Correction:

Only $\tilde{m} \leq m$ of the bits are coded. The remaining $m - \tilde{m}$ ride uncoded through the encoder — they select a specific point within the coset chosen by the code. This is the whole point of the partition trick: high-order uncoded bits live inside a coset with large intra-subset MSED, so they do not need protection. Forcing all bits through the code wastes trellis resources on bits that are already geometrically robust.

Quick Check

In the 4-state 8-PSK TCM of the example Free Euclidean Distance of the 4-State 8-PSK TCM, the free Euclidean distance is $d_{\rm free}^2 = 4$. Which type of error event achieves this minimum?

A long error event with $d_{\rm free}^{(H)} = 3$ differing coset positions.

A single-symbol error along parallel transitions (within one coset).

An uncoded-bit-only error in the rightmost flat region of the trellis.

A catastrophic error event from a poorly chosen generator polynomial.

Quick Check

A TCM scheme transmits 3 bits/symbol at baud rate $1/T_s$. The constellation size is doubled to 16-QAM, and $\tilde{m} = 2$ of the 3 information bits are coded by a rate-$2/3$ convolutional code. What is the spectral efficiency and what is the rate of bandwidth expansion compared to uncoded 8-PSK at the same bit rate?

3 bit/symbol, bandwidth expansion factor $1$ (no expansion).

3 bit/symbol, bandwidth expansion factor $3/2$.

2 bit/symbol, bandwidth expansion factor $1$.

4 bit/symbol, bandwidth expansion factor $4/3$.

Hamming Distance (binary codes) vs. Euclidean Distance (TCM)

| Property | Binary convolutional code | TCM |
|---|---|---|
| Distance currency | Hamming ($d_H$) | Squared Euclidean ($d^2$) |
| Relevant minimum distance | $d_{\rm free}^{(H)}$ | $d_{\rm free}^2$ |
| Code operates on | Bits | Signal points (via coset indices) |
| Soft-decision gain over hard | $\sim 2$ dB | Built-in (decoder is inherently soft) |
| Spectral efficiency | Reduced by rate $R$ | Unchanged — constellation expanded instead |
| Dominant error mechanism | Always a $d_{\rm free}^{(H)}$-length event | Either parallel transitions or long events — whichever is smaller |

Key Takeaway

A good TCM design balances parallel transitions against longer error events. The free distance satisfies $d_{\rm free}^2 \geq \min\{\Delta_{\tilde{m}+1}^2,\; d_{\rm free}^{(H)} \cdot \Delta_1^2\}$, with near-equality in well-designed schemes. A design with a very strong code but small $\tilde{m}$ (few coded bits) is parallel-transition-limited — the geometry, not the code, sets the ceiling. A design with a weak code but many coded bits is Hamming-distance-limited. Ungerboeck's tables pick $(\tilde{m}, \nu, g_0, g_1)$ so that the two terms are close to equal at each state count.