Trellis Structure and the Ungerboeck Design Rules

The Trellis as the Design Blueprint

The convolutional code inside a TCM scheme is conveniently pictured as a trellis: a directed graph whose nodes are the $N_s = 2^\nu$ encoder states and whose edges are labelled by the (coset index, uncoded bits) pair of each possible transition. One "section" of the trellis corresponds to one channel use; the complete code is obtained by concatenating sections.

Ungerboeck's 1982 paper spends less time on the trellis itself than on the rules for labelling its edges so that the free Euclidean distance is maximized. Those three rules, now known as the Ungerboeck design rules, turn out to be deceptively simple. They are the single most important engineering output of the paper, and they still drive TCM table constructions today.

In this section we make the trellis structure explicit, state the three rules, and explain precisely why each one is needed: each rule eliminates one type of near-minimum-distance error event that would otherwise collapse $d_{\rm free}^2$ back to its Hamming-distance ceiling.

Definition:

Trellis (of a Convolutional Code)

The trellis of a convolutional encoder with $\nu$ memory cells is the time-indexed directed graph whose nodes at time $k$ are the $N_s = 2^\nu$ possible encoder states $s_k \in \{0, 1\}^\nu$, and whose edges from $s_k$ to $s_{k+1}$ correspond to each admissible state transition under one input symbol. Each edge is labelled with the encoder output for that transition.

One section of the trellis shows edges from time $k$ to time $k+1$. A path is a sequence of edges consistent with the state-transition rule; an error event is a pair of paths that share a starting state, diverge at some time, and first remerge to a common state at a later time.

A binary code of rate $\tilde{m}/(\tilde{m}+1)$ has $2^{\tilde{m}}$ edges leaving each state (one per input pattern) and $2^{\tilde{m}}$ edges arriving at each state. For TCM each edge's label is converted via the Ungerboeck partition into a constellation point (plus uncoded-bit choices producing parallel transitions).
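The per-state edge count is easy to see in code. A minimal sketch, assuming the illustrative feedforward generators $(g_0, g_1) = (7, 5)_8$ (a standard textbook pair, not one of Ungerboeck's table entries), builds one trellis section of a rate-$1/2$, $\nu = 2$ encoder:

```python
# One trellis section of a rate-1/2 feedforward convolutional encoder with
# nu = 2 memory cells (N_s = 4 states). The generators (7, 5) in octal are
# an illustrative textbook pair, not an entry from Ungerboeck's tables.
G0, G1 = 0o7, 0o5      # generator polynomials as bitmasks over F_2
NU = 2                 # encoder memory -> N_s = 2**NU states

def parity(x: int) -> int:
    return bin(x).count("1") & 1

# edges[s] = [(input_bit, next_state, (c1, c0))]: 2**m_tilde = 2 edges
# leave each state, one per input pattern, as stated in the definition.
edges = {}
for s in range(1 << NU):
    edges[s] = []
    for u in (0, 1):
        reg = (s << 1) | u                       # bit 0 = newest input bit
        out = (parity(reg & G0), parity(reg & G1))
        edges[s].append((u, reg & ((1 << NU) - 1), out))

for s, branches in edges.items():
    print(s, branches)
```

Every state also turns out to have exactly $2^{\tilde{m}} = 2$ incoming edges, as the definition requires.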


Definition:

Ungerboeck TCM Code

An Ungerboeck TCM code is the pair $(\mathcal{C}, \mathcal{P})$ where

  • $\mathcal{C}$ is a rate-$\tilde{m}/(\tilde{m}+1)$ convolutional code (specified by generator polynomials or a state-transition table), and
  • $\mathcal{P}$ is an Ungerboeck partition of a $2^{m+1}$-point constellation.

The encoder:

  1. Reads $m$ information bits $u_{m-1} \cdots u_0$ per symbol.
  2. Feeds the lowest $\tilde{m}$ bits $u_{\tilde{m}-1} \cdots u_0$ into $\mathcal{C}$; the output is $\tilde{m}+1$ bits labelling a level-$(\tilde{m}+1)$ coset of $\mathcal{P}$.
  3. Uses the remaining $m - \tilde{m}$ uncoded bits to select a point inside the coset.
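The three encoder steps can be sketched directly for 8-PSK with $m = 2$, $\tilde{m} = 1$. The 4-state transition table below is an illustrative labelling consistent with Ungerboeck's 4-state figure, not copied from Table I, and `encode` is a hypothetical helper name:

```python
# Sketch of the Ungerboeck TCM encoder loop for 8-PSK, m = 2, m_tilde = 1:
# one coded bit selects a level-2 coset {i, i+4}; one uncoded bit picks
# the point inside it. The transition table is an illustrative labelling
# consistent with the 4-state figure, not copied from Ungerboeck's Table I.
TRELLIS = {  # state -> {coded_bit: (next_state, (point_u0, point_u1))}
    0: {0: (0, (0, 4)), 1: (1, (2, 6))},
    1: {0: (2, (1, 5)), 1: (3, (3, 7))},
    2: {0: (0, (2, 6)), 1: (1, (0, 4))},
    3: {0: (2, (3, 7)), 1: (3, (1, 5))},
}

def encode(bits, state=0):
    """bits: list of (coded_bit, uncoded_bit) pairs, one per symbol."""
    points = []
    for c, u in bits:
        state, coset = TRELLIS[state][c]   # coded bit picks the coset
        points.append(coset[u])            # uncoded bit picks the point
    return points

print(encode([(0, 0), (1, 1), (0, 0)]))  # -> [0, 6, 1]
```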

The tables in Ungerboeck 1982 (Table I for 8-PSK, Table II for 16-QAM, Table III for 16-PSK) list the optimal $(\mathcal{C}, \mathcal{P})$ pairs for each state count $N_s \in \{4, 8, 16, 32, 64, 128, 256\}$, together with the achieved $d_{\rm free}^2$ and the asymptotic coding gain over the uncoded baseline.


Definition:

Euclidean Distance Profile

The Euclidean distance profile of a TCM code is the enumeration of error events by their squared Euclidean distance:

$\mathcal{D}^2 = \{d_1^2 < d_2^2 < d_3^2 < \cdots\}$

with associated multiplicities $\{N_1, N_2, \ldots\}$. The smallest entry $d_1^2 = d_{\rm free}^2$ dominates the high-SNR BER via the union bound; the next few entries matter at moderate SNR, where the "knee" of the BER curve sits.

In practice, Ungerboeck only reports $(d_{\rm free}^2, N_{\rm free})$ in his tables. Refined analysis (Biglieri 2005, Ch. 10) computes the full distance spectrum via the transfer-function bound, which treats the trellis as a signal flow graph with distance-weighted edges.
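For a small code, the head of the distance profile can also be found by brute force: a shortest-path search over pairs of trellis states, with each step weighted by the cheapest point pairing. The sketch below assumes an illustrative 4-state 8-PSK edge labelling consistent with Ungerboeck's figure and recovers the known $d_{\rm free}^2 = 4$ for that scheme:

```python
import heapq
import math

# Shortest-path search over pairs of trellis states to find d_free^2 of a
# 4-state 8-PSK TCM. The edge labelling is an illustrative assignment
# consistent with Ungerboeck's 4-state figure (an assumption of this sketch).
EDGES = {  # state -> [(next_state, level-2 coset carried by the edge)]
    0: [(0, (0, 4)), (1, (2, 6))],
    1: [(2, (1, 5)), (3, (3, 7))],
    2: [(0, (2, 6)), (1, (0, 4))],
    3: [(2, (3, 7)), (3, (1, 5))],
}

def d2(a, b):
    """Squared Euclidean distance between unit-energy 8-PSK points a, b."""
    return 2 - 2 * math.cos(math.pi * (a - b) / 4)

def min_cross(P, Q):
    """Cheapest pairing of one point from P with one from Q."""
    return min(d2(p, q) for p in P for q in Q)

# Length-1 events: parallel transitions, two distinct points on one edge.
best = min(d2(P[0], P[1]) for br in EDGES.values() for _, P in br)

# Longer events: diverge at a common state, then Dijkstra over state pairs.
# (Here the two edges out of each state reach different states, so a
# diverging pair never remerges on the very first step.)
heap, dist = [], {}
for br in EDGES.values():
    for i, (ta, Pa) in enumerate(br):
        for tb, Pb in br[i + 1:]:
            heapq.heappush(heap, (min_cross(Pa, Pb), ta, tb))
while heap:
    cost, a, b = heapq.heappop(heap)
    if cost >= best or dist.get((a, b), float("inf")) <= cost:
        continue
    dist[(a, b)] = cost
    for ta, Pa in EDGES[a]:
        for tb, Pb in EDGES[b]:
            nc = cost + min_cross(Pa, Pb)
            if ta == tb:                  # paths remerge: event complete
                best = min(best, nc)
            elif nc < best:
                heapq.heappush(heap, (nc, ta, tb))

print("d_free^2 =", round(best, 3))  # -> d_free^2 = 4.0
```

The minimum comes from the antipodal parallel transition ($\Delta_2^2 = 4$); the shortest non-parallel event costs $2 + 0.586 + 2 \approx 4.586$.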


Error Event

A pair of trellis paths that share a common state, diverge at some time, and first remerge to a common state at a later time. Error events are the atomic units of error analysis: the union bound on BER is a sum over all error events weighted by their squared Euclidean distance.

Related: Trellis (of a Convolutional Code), Free Euclidean Distance

Ungerboeck Design Rules

Three heuristic rules, introduced in Ungerboeck (1982), for assigning signal points to trellis branches so that the free Euclidean distance is maximized: (R1) Parallel transitions are labelled with points of the deepest non-trivial partition level; (R2) Transitions originating from or merging into any single state are labelled with points of the next-deepest level; (R3) All signal points are used with equal frequency (uniform input).

Related: Ungerboeck TCM Code, Parallel Transitions

Theorem: Ungerboeck Design Rules

To maximize the free Euclidean distance $d_{\rm free}^2$ of a TCM code built from an Ungerboeck partition of a $2^{m+1}$-point constellation and a rate-$\tilde{m}/(\tilde{m}+1)$ convolutional code, assign signal points to trellis transitions so that the following three rules are satisfied.

(R1) Parallel transitions (transitions between the same pair of states) carry signal points from the same level-$(\tilde{m}+1)$ coset, i.e., a maximally separated antipodal pair (or the equivalent for larger $m - \tilde{m}$).

(R2) Transitions originating from a single state, or merging into a single state, carry signal points from the same level-$\tilde{m}$ coset, i.e., all such edges are labelled with points whose pairwise intra-subset distance is at least $\Delta_{\tilde{m}}$.

(R3) Uniform input. All signal points are used with equal frequency when the input bits are uniform i.i.d.; equivalently, the mapping is a "regular" labeling (a group homomorphism for the constellation's symmetry group).

Any TCM code satisfying (R1)–(R3) achieves a $d_{\rm free}^2$ at least as large as the bound in the theorem Lower Bound on Free Euclidean Distance via Partition Levels.

Each rule eliminates a particular class of "weak" error event.

(R1) Without it, a parallel-transition error could use two points from different level-$(\tilde{m}+1)$ cosets (if the trellis allowed it), which would cost only $\Delta_{\tilde{m}}^2 < \Delta_{\tilde{m}+1}^2$. (R1) forces parallel transitions to cost at least $\Delta_{\tilde{m}+1}^2$.

(R2) Without it, a length-2 error event (diverge, remerge after one step) could use two pairs of points, each with small inter-point distance, giving a length-2 contribution smaller than $2 \Delta_{\tilde{m}}^2$. (R2) forces this to cost at least $2 \Delta_{\tilde{m}}^2$.

(R3) Without it, some signal points would be used more frequently than others, producing a non-uniform average signal energy and no longer saturating the AWGN capacity at the chosen SNR.
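The distance hierarchy these arguments rely on can be checked numerically for unit-energy 8-PSK (a small sketch; point $k$ sits at angle $2\pi k/8$, and the variable names are ours):

```python
import math

# Squared intra-subset distances of the unit-energy 8-PSK partition chain:
# level 0 = all 8 points, level 1 = the QPSK subsets, level 2 = antipodal pairs.
def d2(a, b):
    return 2 - 2 * math.cos(math.pi * (a - b) / 4)

delta0_sq = d2(0, 1)   # nearest neighbours in the full set:    ~0.586
delta1_sq = d2(0, 2)   # nearest neighbours within a QPSK set:   2.0
delta2_sq = d2(0, 4)   # antipodal pair within a level-2 coset:  4.0

# (R1): a parallel-transition error costs at least delta2_sq = 4.
# (R2): with m_tilde = 1, a length-2 event costs at least 2*delta1_sq = 4.
print(delta0_sq, delta1_sq, delta2_sq)
```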


One Section of the Canonical 4-State 8-PSK Ungerboeck Trellis

One time-slice of Ungerboeck's 4-state 8-PSK TCM trellis. The four states on the left connect to the four states on the right via labelled edges: each edge displays the constellation point transmitted on that transition. Because the $\tilde{m} = 1$ coding bit leaves $m - \tilde{m} = 1$ uncoded bit, each transition splits into a pair of parallel edges carrying antipodal 8-PSK points (distance $\Delta_2^2 = 4$). The "nice" structure visible here, transitions into each state forming a rotated-QPSK set, is exactly rule (R2) in action.


Example: Verifying the Design Rules on the 4-State 8-PSK TCM

For the canonical 4-state 8-PSK TCM of the example Free Euclidean Distance of the 4-State 8-PSK TCM (rate-$1/2$ convolutional code, $\tilde{m} = 1$, $m = 2$, $\nu = 2$, $N_s = 4$), verify the three Ungerboeck design rules.
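A sketch of such a verification, assuming an edge labelling consistent with Ungerboeck's 4-state figure (the table below is illustrative, not copied from the paper), reduces rules (R1) and (R2) to set-membership checks:

```python
# Check rules (R1) and (R2) on a 4-state 8-PSK trellis. The labelling is
# an illustrative assignment consistent with Ungerboeck's 4-state figure
# (an assumption of this sketch, not copied from his Table I).
EDGES = {  # state -> [(next_state, coset carried by the edge)]
    0: [(0, {0, 4}), (1, {2, 6})],
    1: [(2, {1, 5}), (3, {3, 7})],
    2: [(0, {2, 6}), (1, {0, 4})],
    3: [(2, {3, 7}), (3, {1, 5})],
}
LEVEL2 = [{i, i + 4} for i in range(4)]   # antipodal pairs
LEVEL1 = [{0, 2, 4, 6}, {1, 3, 5, 7}]     # the two QPSK subsets

# (R1): every edge (hence every parallel pair) carries one level-2 coset.
r1 = all(P in LEVEL2 for br in EDGES.values() for _, P in br)

# (R2): edges leaving a state, and edges entering a state, each stay
# inside a single level-1 coset.
leaving = all(
    any(set().union(*(P for _, P in br)) <= B for B in LEVEL1)
    for br in EDGES.values()
)
entering_sets = {t: set() for t in EDGES}
for br in EDGES.values():
    for t, P in br:
        entering_sets[t] |= P
entering = all(any(S <= B for B in LEVEL1) for S in entering_sets.values())

print(r1, leaving, entering)  # -> True True True
```

(R3) holds by construction here, since every 8-PSK point appears on exactly one edge subset per state and the input bits are taken uniform.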

Historical Note: Viterbi and Forney: From Algorithm to Trellis Concept

1967–1988

Andrew Viterbi introduced what is now called the Viterbi algorithm in a 1967 IEEE Transactions on Information Theory paper as a maximum-likelihood decoder for convolutional codes; the 1971 paper "Convolutional codes and their performance in communication systems" (IEEE Trans. Commun. Tech.) popularized the bit-error analysis via error events. At the time, neither the algorithm nor its complexity was widely appreciated outside a small coding-theory circle.

G. David Forney Jr.'s 1973 Proceedings of the IEEE paper "The Viterbi algorithm" reframed the decoder as a shortest-path search on a trellis, the graph-theoretic picture we take for granted today. This reframing turned a specialized coding technique into a general tool: once the trellis picture was available, it became clear that any finite-state code, including Ungerboeck's TCM nine years later, could be decoded by the same dynamic-programming recursion.

The chain Viterbi $\to$ Forney $\to$ Ungerboeck is the clearest arc in the history of coded modulation. Forney's lattice-theoretic reinterpretation of TCM in the 1988 paper "Coset codes I/II" (cited in our Chapter 4) closed the loop by showing that TCM is a special case of lattice coset coding.

⚠️ Engineering Note

Trellis Depth and Decoder Latency

A Viterbi decoder for a TCM code must store the survivor-path history deep enough that the traceback is reliable: the rule of thumb is $5\nu$ to $7\nu$ trellis sections, where $\nu$ is the convolutional encoder memory. At the V.34 top symbol rate of 3429 symbols/s, a 64-state TCM ($\nu = 6$) needs survivor paths of length 30–40 symbols, giving roughly 10 ms of decoding latency. This was manageable in 1994 with dedicated DSP silicon; scaling to gigasymbol/s modern systems with $\nu = 8$ or more requires pipelined architectures and is why 5G NR adopted LDPC over trellis codes (LDPC decoders parallelize more naturally than Viterbi).

Practical Constraints
  • β€’

    Survivor memory scales linearly in NsN_s and linearly in traceback depth

  • β€’

    Branch-metric computation scales linearly in NsN_s and exponentially in m~\tilde{m}

  • β€’

    Traceback latency is the dominant delay at multi-gigasymbol rates
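The rule-of-thumb arithmetic is worth making explicit. A minimal sketch, with an assumed 1 Gsym/s rate for the modern end of the scale:

```python
# Rule-of-thumb traceback arithmetic: survivor depth of 5*nu to 7*nu
# trellis sections. The 1 Gsym/s symbol rate is an assumed example.
def traceback_latency(nu: int, symbol_rate: float):
    """Return (min, max) traceback latency in seconds."""
    return 5 * nu / symbol_rate, 7 * nu / symbol_rate

lo_s, hi_s = traceback_latency(6, 1e9)   # 64 states (nu = 6) at 1 Gsym/s
print(f"depth {5*6}-{7*6} symbols -> {lo_s*1e9:.0f}-{hi_s*1e9:.0f} ns")
```

At these rates the traceback window itself is only tens of nanoseconds; the serial add-compare-select recursion, not the window length, becomes the bottleneck.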

📋 Ref: ITU-T V.34 Annex C (Viterbi traceback guidelines)

Common Mistake: Catastrophic Convolutional Codes in TCM

Mistake:

Choosing a convolutional code with $(g_0, g_1)$ that produces a catastrophic encoder: one where a finite number of input errors can cause an infinite number of output errors.

Correction:

A catastrophic code has an error event of infinite length with finite Euclidean distance, which completely breaks the free-distance analysis. For binary rate-$1/2$ codes, the encoder is catastrophic iff $\gcd(g_0, g_1) \neq 1$ (as polynomials in $\mathbb{F}_2[D]$).

Always check the $\gcd$ before adopting a generator pair from a table. Ungerboeck's 1982 tables are all non-catastrophic, but if you modify a generator polynomial to simplify hardware, rerun the $\gcd$ test.
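The $\gcd$ test is a few lines of arithmetic in $\mathbb{F}_2[D]$ with polynomials stored as bitmasks. A sketch, using the illustrative pairs $(7,5)_8$ (coprime) and $(6,3)_8$ (common factor $D+1$):

```python
# Catastrophicity test for a binary rate-1/2 code: the encoder is
# catastrophic iff gcd(g0, g1) != 1 in F_2[D]. Polynomials are ints
# (bit i = coefficient of D^i); the generator pairs are illustrative.
def gf2_mod(a: int, b: int) -> int:
    """Remainder of a divided by b in F_2[D]."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gf2_gcd(a: int, b: int) -> int:
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def is_catastrophic(g0: int, g1: int) -> bool:
    return gf2_gcd(g0, g1) != 1

print(is_catastrophic(0o7, 0o5))  # (7,5): gcd = 1      -> False (safe)
print(is_catastrophic(0o6, 0o3))  # (6,3): share D + 1  -> True
```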

Quick Check

Which of the three Ungerboeck design rules governs the labelling of parallel transitions (multiple edges between the same pair of states)?

Rule (R1): parallel transitions carry points of the deepest non-trivial partition level.

Rule (R2): transitions from or into a single state carry same-coset points at the next level.

Rule (R3): uniform input distribution.

None of the above β€” parallel transitions are not addressed by Ungerboeck's rules.

Quick Check

In Ungerboeck's Table I (1982), the asymptotic coding gain of 8-PSK TCM over uncoded QPSK as a function of trellis states $N_s$ is approximately: 4 states → 3 dB; 8 → 3.6 dB; 16 → 4.1 dB; 32 → 4.6 dB; 64 → 4.8 dB; 128 → 5.0 dB; 256 → 5.4 dB. What is the ultimate ceiling, and why can it not be exceeded by adding states to this family?

$\infty$ – adding states always helps.

$10\log_{10}(\Delta_2^2 / d_{\rm uncoded}^2) = 10\log_{10}(4/2) \approx 3$ dB.

About 6 dB – beyond that, the partition geometry (parallel-transition bound) saturates and no more gain is possible from the $\tilde{m}=1$ coding bit alone.

Exactly $\gamma_c = 10\log_{10}(\pi e/6) \approx 1.53$ dB.

Key Takeaway

Three rules, one optimal code family. Ungerboeck's design rules – (R1) parallel transitions carry deepest-coset points, (R2) outgoing and incoming edges of each state carry next-deepest-coset points, (R3) uniform input – are individually necessary and jointly sufficient to saturate the free-distance lower bound. Every entry in Ungerboeck's 1982 Tables I–III was produced by a computer search that enforced (R1)–(R3) and then maximized $d_{\rm free}^2$ over the remaining freedom. The rules are the bridge between the combinatorial design of the convolutional code and the geometric design of the partition.