Turbo Codes and Iterative Decoding

The Turbo Principle

In 1993, Berrou, Glavieux, and Thitimajshima demonstrated a coding scheme that operated within $0.7$ dB of the Shannon limit at a bit-error rate of $10^{-5}$ — a gap roughly three decibels better than the best practical codes known at the time. The construction is deceptively simple: two convolutional encoders separated by a pseudo-random interleaver. The decoder is equally simple in outline: two maximum-a-posteriori (MAP) decoders that exchange soft extrinsic information iteratively.

The principle behind this success transcends turbo codes and defines an entire class of receivers. Whenever a system can be decomposed into subsystems coupled through interleaving or permutation, one can often derive efficient SISO (soft-input soft-output) decoders for each subsystem and let them exchange extrinsic information iteratively. This is the turbo principle, and it subsumes turbo decoding, turbo equalization, iterative MIMO detection, iterative demapping, and joint source-channel decoding.

Historical Note: Berrou's Pragmatic Leap

1993–1998

Claude Berrou, an engineer at ENST Bretagne, developed turbo codes while searching for practical decoders near capacity. The 1993 ICC paper "Near Shannon limit error-correcting coding and decoding: Turbo-codes" reported simulation results so close to the Shannon bound that most information theorists initially refused to believe them. Robert McEliece later described the community's reaction as disbelief giving way to astonishment. The iterative decoder was not derived from first principles but was discovered experimentally; its rigorous interpretation as the sum-product algorithm on a loopy factor graph was given by McEliece, MacKay, and Cheng in 1998. This was the event that rekindled interest in Gallager's 1962 LDPC codes, launched iterative receiver design as a discipline, and ultimately placed message passing at the centre of modern physical-layer design.

Definition:

Parallel Concatenated Convolutional Code (PCCC)

A parallel concatenated convolutional code with interleaver $\boldsymbol{\Pi}$ of length $K$ encodes an information sequence $\mathbf{u} \in \{0,1\}^K$ as follows. Let $\mathcal{C}_1$ and $\mathcal{C}_2$ denote two recursive systematic convolutional (RSC) component encoders.

  • Pass $\mathbf{u}$ through $\mathcal{C}_1$ to obtain parity sequence $\mathbf{p}^{(1)}$.
  • Pass $\boldsymbol{\Pi}\mathbf{u}$ through $\mathcal{C}_2$ to obtain parity sequence $\mathbf{p}^{(2)}$.

The codeword is $\mathbf{c} = (\mathbf{u}, \mathbf{p}^{(1)}, \mathbf{p}^{(2)})$, giving a rate-$1/3$ code. Higher rates are obtained by puncturing parity bits. The encoder is systematic because the information sequence $\mathbf{u}$ appears verbatim in the codeword.

The RSC structure is essential: it ensures that isolated weight-one input patterns generate infinite-weight parity sequences, which is what couples the two encoders through the interleaver and gives turbo codes their excellent distance spectrum.
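To make the definition concrete, here is a minimal Python sketch of the rate-$1/3$ encoding with two RSC $(1, 5/7)_8$ components (an illustration only: trellis termination is omitted, and the interleaver is a plain random permutation rather than a standardized one):

```python
import random

def rsc_parity(u):
    """Parity sequence of the RSC (1, 5/7)_8 encoder:
    feedback polynomial 1 + D + D^2, forward polynomial 1 + D^2."""
    s1 = s2 = 0                  # two-bit shift-register state
    parity = []
    for bit in u:
        a = bit ^ s1 ^ s2        # recursive feedback sum
        parity.append(a ^ s2)    # forward taps 1 and D^2
        s1, s2 = a, s1
    return parity

def pccc_encode(u, perm):
    """Rate-1/3 PCCC codeword c = (u, p1, p2); perm is the interleaver."""
    p1 = rsc_parity(u)
    p2 = rsc_parity([u[i] for i in perm])
    return u + p1 + p2

random.seed(1)
K = 8
u = [random.randint(0, 1) for _ in range(K)]
perm = list(range(K))
random.shuffle(perm)
c = pccc_encode(u, perm)
assert len(c) == 3 * K and c[:K] == u    # systematic, rate 1/3
```

Because the feedback polynomial divides the parity response, a single isolated 1 at the input keeps the register cycling indefinitely, which is exactly the infinite-parity-weight property the paragraph above describes.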

Turbo Encoder and Iterative Decoder

Top: parallel concatenation of two RSC encoders separated by an interleaver $\boldsymbol{\Pi}$. Bottom: the iterative decoder. Each SISO MAP decoder consumes a-priori LLRs from its peer (via $\boldsymbol{\Pi}$ or $\boldsymbol{\Pi}^{-1}$) and produces extrinsic LLRs back to it.

Definition:

A-priori, A-posteriori, and Extrinsic LLRs

Fix a binary random variable $X \in \{+1, -1\}$ representing a coded bit. A SISO decoder receives channel observations $\mathbf{y}$ and a-priori beliefs about $X$ supplied by another decoder. Define the three LLRs:

$$L_{ch} = \log\frac{p(\mathbf{y} \mid X=+1)}{p(\mathbf{y} \mid X=-1)}, \qquad L_A = \log\frac{\Pr(X=+1)}{\Pr(X=-1)},$$

$$L_D = \log\frac{\Pr(X=+1 \mid \mathbf{y})}{\Pr(X=-1 \mid \mathbf{y})}.$$

The extrinsic LLR is what the decoder has learned beyond the channel observation of $X$ itself and beyond the a-priori input:

$$L_E = L_D - L_A - L_{ch}.$$

Only $L_E$ may be passed as a-priori input to a peer decoder; reusing $L_D$ would double-count information and destroy the iterative dynamics.

This separation is the algebraic heart of the turbo principle. When two decoders share extrinsic information through an interleaver, the interleaver decorrelates the messages so that each decoder receives information it could not have computed from its own observations.
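In code the subtraction is trivial, yet it is the single most important bookkeeping step in the decoder; a minimal sketch:

```python
def extrinsic_llr(L_D, L_A, L_ch):
    """Extrinsic information: the a-posteriori LLR minus everything the
    peer decoder already knows (its own a-priori and the channel LLR)."""
    return L_D - L_A - L_ch

# With L_D = 3.5, L_A = 0.8, L_ch = 1.2, the extrinsic LLR is 1.5.
print(round(extrinsic_llr(3.5, 0.8, 1.2), 6))  # 1.5
```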

Iterative Turbo Decoder

Complexity: $O(T_{\max} \cdot K \cdot |\mathcal{S}|)$, where $|\mathcal{S}|$ is the trellis state count
Input: Channel LLRs for the systematic bits $\mathbf{L}_{ch}^{\mathbf{u}}$ and for the two parity sequences $\mathbf{L}_{ch}^{(1)}, \mathbf{L}_{ch}^{(2)}$; interleaver $\boldsymbol{\Pi}$; maximum iterations $T_{\max}$.
Output: Hard decisions $\hat{\mathbf{u}}$.
1. Initialize a-priori LLRs $\mathbf{L}_A^{(1)} \leftarrow \mathbf{0}$.
2. for $t = 1, \ldots, T_{\max}$ do
3. \quad Run BCJR on $\mathcal{C}_1$ with inputs $(\mathbf{L}_{ch}^{\mathbf{u}} + \mathbf{L}_A^{(1)}, \mathbf{L}_{ch}^{(1)})$.
4. \quad Extract extrinsic $\mathbf{L}_E^{(1)}$ on the info bits.
5. \quad Set $\mathbf{L}_A^{(2)} \leftarrow \boldsymbol{\Pi}\,\mathbf{L}_E^{(1)}$.
6. \quad Run BCJR on $\mathcal{C}_2$ with inputs $(\boldsymbol{\Pi}\mathbf{L}_{ch}^{\mathbf{u}} + \mathbf{L}_A^{(2)}, \mathbf{L}_{ch}^{(2)})$.
7. \quad Extract extrinsic $\mathbf{L}_E^{(2)}$.
8. \quad Set $\mathbf{L}_A^{(1)} \leftarrow \boldsymbol{\Pi}^{-1}\,\mathbf{L}_E^{(2)}$.
9. \quad if stopping criterion met then break.
10. end for
11. Form $\mathbf{L}_D \leftarrow \mathbf{L}_{ch}^{\mathbf{u}} + \mathbf{L}_A^{(1)} + \mathbf{L}_E^{(1)}$.
12. return $\hat{\mathbf{u}} = \operatorname{sign}(\mathbf{L}_D)$.

Each BCJR call is a two-sweep forward-backward recursion on the component trellis. The extrinsic extraction in steps 4 and 7 subtracts the inputs that came from outside (a-priori plus channel LLR on the information bit) from the a-posteriori LLR.
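A compact illustration of one such call, using the max-log approximation on the 4-state RSC $(1, 5/7)_8$ trellis (a sketch under simplifying assumptions: no trellis termination, LLR convention $L = \log P(\text{bit}=0)/P(\text{bit}=1)$, and the helper names are our own):

```python
def trellis_step(state, b):
    """One RSC (1, 5/7)_8 transition: returns (next_state, parity_bit)."""
    s1, s2 = (state >> 1) & 1, state & 1
    a = b ^ s1 ^ s2                   # feedback 1 + D + D^2
    return ((a << 1) | s1, a ^ s2)    # forward 1 + D^2

def maxlog_bcjr(L_in, L_par):
    """Max-log BCJR: L_in combines systematic-channel and a-priori LLRs
    of the info bits; L_par holds channel LLRs of the parity bits.
    Returns a-posteriori LLRs of the info bits."""
    K, S, NEG = len(L_in), 4, float("-inf")

    def gamma(k, s, b):
        ns, p = trellis_step(s, b)
        return 0.5 * (1 - 2 * b) * L_in[k] + 0.5 * (1 - 2 * p) * L_par[k], ns

    # Forward sweep from the all-zero starting state.
    alpha = [[NEG] * S for _ in range(K + 1)]
    alpha[0][0] = 0.0
    for k in range(K):
        for s in range(S):
            if alpha[k][s] == NEG:
                continue
            for b in (0, 1):
                g, ns = gamma(k, s, b)
                alpha[k + 1][ns] = max(alpha[k + 1][ns], alpha[k][s] + g)

    # Backward sweep; unterminated trellis, so the final metric is flat.
    beta = [[NEG] * S for _ in range(K + 1)]
    beta[K] = [0.0] * S
    for k in range(K - 1, -1, -1):
        for s in range(S):
            for b in (0, 1):
                g, ns = gamma(k, s, b)
                beta[k][s] = max(beta[k][s], g + beta[k + 1][ns])

    # A-posteriori LLR: best bit-0 path metric minus best bit-1 path metric.
    L_app = []
    for k in range(K):
        best = {0: NEG, 1: NEG}
        for s in range(S):
            if alpha[k][s] == NEG:
                continue
            for b in (0, 1):
                g, ns = gamma(k, s, b)
                best[b] = max(best[b], alpha[k][s] + g + beta[k + 1][ns])
        L_app.append(best[0] - best[1])
    return L_app

# Sanity check: with clean, confident channel LLRs the decisions match.
u = [1, 0, 1, 1, 0, 0, 1, 0]
par, s = [], 0
for b in u:
    s, p = trellis_step(s, b)
    par.append(p)
L_app = maxlog_bcjr([(1 - 2 * b) * 4.0 for b in u],
                    [(1 - 2 * p) * 4.0 for p in par])
assert [0 if L > 0 else 1 for L in L_app] == u
```

Replacing each `max` with a Jacobian-logarithm `max*` would turn this into the exact log-MAP recursion; the extrinsic output would then be `L_app[k] - L_in[k]`, as in steps 4 and 7.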

Theorem: Turbo Decoding as Loopy Belief Propagation

The iterative turbo decoder of the algorithm above is exactly the sum-product (belief propagation) algorithm applied to the factor graph of the PCCC, with a flooding schedule and messages represented as log-likelihood ratios. The factor graph has cycles of length at least equal to the interleaver girth; turbo decoding is therefore an instance of loopy belief propagation.

Each BCJR call computes exact bitwise marginals on a tree (the trellis of one component code). The interleaver couples the trees into a single loopy graph. Passing extrinsic LLRs is precisely passing the messages that sum-product would pass between the two trees along the edges they share.

Key Takeaway

Turbo decoding is loopy belief propagation on the PCCC factor graph with messages expressed as LLRs and updated by BCJR within each component trellis. The only information the component decoders exchange is extrinsic, so the interleaver can decorrelate successive iterations and the decoder behaves as if the graph were locally a tree.

Why EXIT Charts?

Loopy BP is not guaranteed to converge to the true posterior, and even when it does the rate of convergence can be hard to predict. Ten Brink (2001) observed that for long interleavers the distribution of extrinsic LLRs at each iteration is, to a good approximation, symmetric Gaussian. Under this assumption each SISO block is characterized by a one-dimensional transfer function mapping input mutual information to output mutual information. Composing the transfer functions of the two decoders produces the extrinsic information transfer (EXIT) chart, a graphical tool that predicts convergence thresholds within fractions of a dB.

Definition:

The JJ-Function

Let $L$ be a real-valued LLR with distribution consistent with the symmetric Gaussian model: conditional on $X = +1$, $L \sim \mathcal{N}(\sigma^2/2, \sigma^2)$, and conditional on $X = -1$, $L \sim \mathcal{N}(-\sigma^2/2, \sigma^2)$, with $X$ equiprobable. Define

$$J(\sigma) \;\triangleq\; I(X; L) = 1 - \int_{-\infty}^{\infty} \frac{e^{-(\ell - \sigma^2/2)^2/(2\sigma^2)}}{\sqrt{2\pi\sigma^2}} \log_2\!\big(1 + e^{-\ell}\big)\,d\ell.$$

The function $J$ is continuous, strictly increasing from $J(0) = 0$ to $\lim_{\sigma\to\infty} J(\sigma) = 1$, and hence invertible. It translates the standard deviation of a Gaussian LLR into the mutual information the LLR carries.

A widely used numerical approximation of $J$ and its inverse was given by ten Brink and by Ashikhmin, Kramer, and ten Brink; the symmetry of binary-input output-symmetric channels makes the consistency condition $\mathbb{E}[L \mid X=+1] = \operatorname{Var}(L)/2$ natural.
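Since the integral has no closed form, $J$ and $J^{-1}$ are evaluated numerically in practice; a minimal sketch using midpoint-rule quadrature and bisection (the truncation limits and tolerances are our own choices, not a standard approximation):

```python
import math

def J(sigma, n=4000):
    """J(sigma) = 1 - E[log2(1 + e^{-L})] for L ~ N(sigma^2/2, sigma^2),
    by midpoint-rule quadrature over +-10 standard deviations."""
    if sigma < 1e-10:
        return 0.0
    mu = sigma**2 / 2
    lo, hi = mu - 10 * sigma, mu + 10 * sigma
    h = (hi - lo) / n
    acc = 0.0
    for i in range(n):
        l = lo + (i + 0.5) * h
        pdf = math.exp(-((l - mu) ** 2) / (2 * sigma**2)) \
              / math.sqrt(2 * math.pi * sigma**2)
        acc += pdf * math.log2(1 + math.exp(-l)) * h
    return 1 - acc

def J_inv(I, tol=1e-6):
    """Invert the monotone J by bisection on sigma in [0, 40]."""
    lo, hi = 0.0, 40.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if J(mid) < I:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Round trip: J(J_inv(I)) should recover I.
assert abs(J(J_inv(0.5)) - 0.5) < 1e-3
```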

Definition:

EXIT Function of a SISO Block

Consider a SISO block (a decoder or equalizer) that receives a-priori LLRs $L_A$, modelled as symmetric Gaussian with input mutual information $I_A$, and observations $\mathbf{y}$ from a channel parameterized by SNR $\rho$. The EXIT transfer function is

$$T(I_A; \rho) \;\triangleq\; I(X; L_E \mid I_A, \rho),$$

where $L_E$ is the extrinsic LLR produced by the block and the mutual information is computed assuming $L_A$ follows the symmetric Gaussian model with standard deviation $\sigma = J^{-1}(I_A)$. For a convolutional decoder, $T_{\text{dec}}(I_A)$ is typically estimated by Monte Carlo simulation over a long random codeword.

The symmetric-Gaussian assumption on LAL_A is a consistency condition that is preserved under interleaving and BCJR for long blocks. It is not exact — it is an idealization whose accuracy improves with interleaver length.
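The two ingredients behind such Monte Carlo runs — generating consistent a-priori LLRs for a target quality and estimating mutual information from LLR samples — can be sketched as follows (helper names are hypothetical):

```python
import math
import random

def sample_apriori_llrs(bits, sigma, rng=random):
    """Consistent Gaussian LLRs: mean +-sigma^2/2 set by the bit,
    variance sigma^2 (the symmetric Gaussian model)."""
    return [(1 - 2 * b) * sigma**2 / 2 + rng.gauss(0.0, sigma)
            for b in bits]

def mutual_info_estimate(bits, llrs):
    """Sample estimate of I(X; L) = 1 - E[log2(1 + e^{-(1-2x)L})],
    valid for symmetric (consistent) LLR distributions."""
    acc = sum(math.log2(1 + math.exp(-(1 - 2 * b) * l))
              for b, l in zip(bits, llrs))
    return 1 - acc / len(llrs)

random.seed(0)
bits = [random.randint(0, 1) for _ in range(100000)]
I_weak = mutual_info_estimate(bits, sample_apriori_llrs(bits, sigma=1.0))
I_strong = mutual_info_estimate(bits, sample_apriori_llrs(bits, sigma=3.0))
assert 0.0 < I_weak < I_strong < 1.0   # larger sigma carries more information
```

To trace one point of $T_{\text{dec}}$, one would feed `sample_apriori_llrs` output (plus channel LLRs) into a SISO decoder and apply `mutual_info_estimate` to its extrinsic output.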

Theorem: EXIT Chart Prediction of Decoding Threshold

Let $T_1$ and $T_2$ be the EXIT transfer functions of the two component decoders at channel SNR $\rho$. Under the symmetric-Gaussian message assumption, the trajectory of the turbo decoder starting from $I_A = 0$ is

$$I_A^{(t+1)} = T_2\!\big(T_1(I_A^{(t)}; \rho); \rho\big), \qquad t = 0, 1, \ldots$$

The decoder converges to $I_A = 1$ (zero BER) if and only if $T_1(I; \rho) > T_2^{-1}(I; \rho)$ for all $I \in [0, 1)$, i.e., the curve $T_1$ lies strictly above the mirrored curve $T_2^{-1}$ on the EXIT diagram. The turbo cliff (waterfall SNR) is the smallest $\rho$ at which the two curves just separate.

The staircase trajectory visualizes the composition T2T1T_2 \circ T_1 iterated from zero. If the EXIT curves cross (intersect) the trajectory gets stuck at the intersection — this is the pinch point that determines the threshold.
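The fixed-point dynamics can be exercised with toy transfer curves (hypothetical monotone maps, not measured EXIT functions) to see both regimes:

```python
def exit_trajectory(T1, T2, iters=100):
    """Iterate I <- T2(T1(I)) from I = 0 and return the final value."""
    I = 0.0
    for _ in range(iters):
        I = T2(T1(I))
    return I

# Open tunnel: T1 stays above the mirrored T2 everywhere on [0, 1),
# so the staircase climbs to (1, 1).
open_T1 = lambda I: min(1.0, 0.30 + 0.80 * I)
open_T2 = lambda I: min(1.0, 0.25 + 0.80 * I)
assert exit_trajectory(open_T1, open_T2) == 1.0

# Pinched tunnel: identical lines cross at I = 0.4 and trap the trajectory.
pinch_T = lambda I: 0.20 + 0.50 * I
assert abs(exit_trajectory(pinch_T, pinch_T) - 0.4) < 1e-6
```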

EXIT Chart for Rate-$1/2$ Turbo Code

Explore how the EXIT curves of the two RSC component decoders depend on the channel SNR. The "tunnel" between the curves determines whether the iterative decoder can reach $I_E = 1$. When the tunnel closes, the decoder stalls (this is the turbo cliff).

Interactive parameters: channel SNR; component code constraint length $K = m+1$.

Example: Reading the Threshold from an EXIT Chart

A turbo code uses two identical RSC $(1, 5/7)_8$ encoders (memory $m=2$) at rate $R = 1/2$ after puncturing. The EXIT functions at channel $E_b/N_0$ are denoted $T(I_A; E_b/N_0)$. At $E_b/N_0 = 0.5$ dB the curves $T$ and $T^{-1}$ cross at $I = 0.62$; at $0.7$ dB they just touch at $I = 0.78$; at $0.9$ dB there is an open tunnel. Estimate the decoding threshold and predict the qualitative BER behaviour at these three SNRs. Solution: the threshold is approximately $0.7$ dB, where the curves first separate. At $0.5$ dB the trajectory is trapped at $I \approx 0.62$ and the BER stalls; at $0.7$ dB convergence is marginal, requiring many iterations to creep through the narrow tunnel; at $0.9$ dB the open tunnel lets the decoder reach $I = 1$ and the BER falls off the cliff.

Turbo Decoder BER vs Iteration Count

Simulated BER of a rate-$1/2$ turbo code versus the number of decoding iterations, at several $E_b/N_0$ values. Above the turbo cliff, BER drops by several decades; below it, BER stalls.


EXIT Chart Staircase Trajectory

Animation of the iterative decoder trajectory climbing the staircase between the two EXIT curves. When the tunnel is open, the staircase reaches $(1,1)$; when it closes, the staircase hits the wall.
The horizontal and vertical moves alternate between decoder 1's map $T_1$ and the mirrored map $T_2^{-1}$. Convergence corresponds to the staircase squeezing through the tunnel between these two curves.

Common Mistake: Do Not Feed Back the A-Posteriori LLR

Mistake:

A common implementation bug is to feed the a-posteriori LLR $L_D = L_A + L_E + L_{ch}$ back to the peer decoder instead of the extrinsic LLR $L_E$.

Correction:

The peer decoder already has $L_{ch}$ for the systematic bits and itself supplied $L_A$ in the previous iteration. Re-including these causes positive feedback that drives LLRs to unbounded magnitudes, produces overconfident wrong decisions, and breaks the EXIT analysis. Always subtract the inputs before passing to the peer: $L_E = L_D - L_A - L_{ch}$.

Common Mistake: Gaussian LLR Assumption Fails for Short Blocks

Mistake:

EXIT charts can mislead when the interleaver is short: the symmetric-Gaussian model for $L_A$ is violated, and the predicted threshold may be 1 dB or more optimistic.

Correction:

For interleavers shorter than roughly $K = 1000$, validate threshold predictions with Monte Carlo simulation. For very short blocks (e.g., $K \leq 128$ in 5G control channels), use the actual LLR histograms, PEXIT charts for punctured codes, or direct finite-length simulation.

🔧Engineering Note

Turbo Codes in LTE and the Transition to LDPC/Polar in 5G

LTE (3GPP Release 8, 2008) adopted rate-$1/3$ parallel turbo codes with memory-$3$ RSC components and a quadratic permutation polynomial (QPP) interleaver. Block lengths range from $40$ to $6144$ bits. Standard decoders use max-log-MAP with a scaling factor of $0.7$ to approximate MAP while keeping hardware simple; typically $6$–$8$ iterations.

5G NR (Release 15, 2018) replaced turbo codes with LDPC for the data channel (PDSCH/PUSCH) and polar codes for control. The reasons were primarily practical: LDPC decoders parallelize better for multi-Gbps throughput, have lower error floors at high rates, and support flexible rate-matching via HARQ. Turbo codes retain a role in earlier 3GPP releases, DVB-RCS2, and space communications (CCSDS 131.0-B).

Practical Constraints
  • LTE turbo: $K \in \{40, \ldots, 6144\}$, 188 distinct block sizes with QPP interleavers

  • Max-log-MAP with scaling factor $\alpha \approx 0.7$ loses $\approx 0.1$ dB vs true MAP

  • Typical LTE decoders: 6–8 iterations, early stopping via CRC

📋 Ref: 3GPP TS 36.212 §5.1.3
🎓 CommIT Contribution (1998)

Bit-Interleaved Coded Modulation

G. Caire, G. Taricco, E. Biglieri, IEEE Trans. Information Theory, vol. 44, no. 3

The bit-interleaved coded modulation (BICM) analysis by Caire, Taricco and Biglieri was a forerunner of EXIT-based iterative receiver design. It demonstrated that treating the demapper and decoder as two SISO blocks separated by a bit interleaver yields a parallel-channel model whose capacity can be approached by iterative BICM-ID. The mutual-information decomposition exploited there — channel MI split into a SISO transfer function on each edge — is exactly the tool EXIT charts formalize. This work directly informs modern EXIT-chart design of concatenated codes with higher-order modulation.

Tags: bicm, exit-chart, iterative-demapping

Turbo Cliff

The steep drop in bit error rate that occurs near a critical channel SNR when a turbo or iterative receiver transitions from stalled iterations to convergent ones. The threshold SNR is the one at which the EXIT tunnel first opens.

Related: EXIT Chart, Extrinsic LLR

EXIT Chart

A two-dimensional plot of the extrinsic mutual information output by a SISO block against its a-priori mutual information input. Composing the EXIT curves of two SISO blocks predicts the convergence of their iterative composition.

Related: Turbo Cliff

Extrinsic LLR

The log-likelihood ratio of a coded bit obtained by subtracting a-priori and direct channel contributions from the a-posteriori LLR. Passing only extrinsic information between SISO blocks prevents double-counting and is the algebraic basis of the turbo principle.

Related: Turbo Cliff

Why This Matters: Turbo Codes and Hybrid ARQ

The rate-compatible puncturing of turbo codes is ideally matched to incremental-redundancy HARQ: the transmitter sends a high-rate punctured codeword first, and on NACK appends additional parity bits. The receiver concatenates the soft channel LLRs from all transmissions and runs one turbo decoding. This is the basis of 3GPP HSPA and LTE HARQ-IR.
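The soft-combining step can be sketched as follows (a minimal illustration; the position-indexed representation of the mother codeword is our own):

```python
def combine_harq_llrs(transmissions):
    """Incremental-redundancy combining: channel LLRs that refer to the
    same coded-bit position add, since they are independent observations
    of the same bit. Each transmission is a list of (position, llr) pairs."""
    total = {}
    for tx in transmissions:
        for pos, llr in tx:
            total[pos] = total.get(pos, 0.0) + llr
    return total

# First transmission punctures position 2; the retransmission repeats
# position 0 (Chase-style gain) and supplies the missing parity bit.
tx1 = [(0, 1.1), (1, -0.4)]
tx2 = [(0, 0.7), (2, 2.0)]
combined = combine_harq_llrs([tx1, tx2])
assert combined == {0: 1.1 + 0.7, 1: -0.4, 2: 2.0}
```

The combined LLR vector is then fed to a single turbo decoding run, exactly as the paragraph above describes.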

See full treatment in Chapter 22

Quick Check

Suppose two EXIT curves $T_1(I)$ and $T_2^{-1}(I)$ cross at $I = 0.4$ at a given SNR. What is the predicted outcome of iterative turbo decoding?

BER drops to near-zero within a few iterations.

The trajectory is trapped at $I \approx 0.4$; BER stalls well above zero.

The decoder diverges: LLR magnitudes grow without bound.

Convergence depends on the interleaver but not on the SNR.

Quick Check

Given channel LLR $L_{ch} = 1.2$, a-priori LLR $L_A = 0.8$, and a-posteriori LLR $L_D = 3.5$ for a systematic bit, what is the extrinsic LLR that should be passed to the peer decoder?

$L_E = 3.5$

$L_E = 1.5$

$L_E = 2.7$

$L_E = 5.5$

Quick Check

The rigorous interpretation of iterative turbo decoding as loopy belief propagation on the PCCC factor graph was given by:

Berrou, Glavieux, and Thitimajshima (1993)

Gallager (1962)

McEliece, MacKay, and Cheng (1998)

ten Brink (2001)