Ferkans — Interactive Telecom Tutor

From AWGN to Fading: What Changes

Everything in s01–s02 was AWGN: a channel with a fixed gain, where the PEP decays as $\exp(-\alpha E_s/N_0)$ with a constant $\alpha$, and where "bigger $\alpha$" is the only design knob. On a fading channel the physics is different: the received signal is $y = h \cdot s + w$ with random amplitude $|h|$, and when $|h|$ is small the symbol is effectively lost however low the noise level. The PEP is then an expectation over the fading realisation, and its high-SNR behaviour is not exponential but power-law: $P_e \approx \gamma_c^{-1} \cdot \text{SNR}^{-d}$. The exponent $d$ is the diversity order: the number of independent fading samples that must all be bad for a decoding error to occur.
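
To make the exponential-versus-power-law contrast concrete, here is a minimal numerical sketch. It assumes a single fading sample with unit-mean exponential $|h|^2$, for which $\mathbb{E}[\exp(-\alpha |h|^2 \text{SNR})] = 1/(1 + \alpha\,\text{SNR})$. The constant `ALPHA` and the function names are illustrative choices, not values taken from s01–s02.

```python
import math

ALPHA = 0.25  # hypothetical exponent constant, chosen only for illustration


def pep_awgn(snr):
    """AWGN-style bound: exponential decay in SNR."""
    return math.exp(-ALPHA * snr)


def pep_rayleigh(snr):
    """The same bound averaged over a unit-mean exponential gain |h|^2."""
    return 1.0 / (1.0 + ALPHA * snr)


for snr_db in (0, 10, 20, 30):
    snr = 10 ** (snr_db / 10)
    print(f"{snr_db:2d} dB   AWGN {pep_awgn(snr):.3e}   Rayleigh {pep_rayleigh(snr):.3e}")
```

Each extra 10 dB divides the Rayleigh value by roughly ten at high SNR, while the AWGN value collapses far faster. That is the diversity-1 behaviour the rest of the section sets out to improve.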

The central result of this section — and arguably the central result of the entire Caire–Taricco–Biglieri 1998 paper — is an exact formula for the diversity order of BICM: $d_{\rm BICM}(\mu) = d_H \cdot L_{\min}(\mu).$ This formula replaces $d^2_{\rm avg}(\mu)$ as the primary design criterion on fading. It tells the designer: pick a binary code with large $d_H$ and a labelling with large $L_{\min}$ . The "diversity product" is what you want to maximise, not the AWGN exponent.

The point is that on fading, capacity alone does not tell the full story. Even if BICM's capacity is close to CM capacity, the operating SNR at a target BER depends on the DIVERSITY order achievable. The formula above is the design criterion, and it factors cleanly into "binary code" and "labelling" contributions — which is exactly what makes BICM modular.

Definition:
Fully-Interleaved Rayleigh Fading Channel

The fully-interleaved Rayleigh fading channel has, at symbol time $i$, the input-output relation $y_i = h_i \cdot s_i + w_i$, where $s_i \in \mathcal{X}$ is the transmitted constellation symbol, $h_i \in \mathbb{C}$ is the channel gain, and $w_i \sim \mathcal{CN}(0, N_0)$ is complex Gaussian noise. The fading gains satisfy $|h_i|^2 \sim \text{Exp}(1)$ (unit-mean exponential, the squared magnitude of a unit-variance $\mathcal{CN}$ random variable), and the sequence $\{h_i\}$ is i.i.d.; that is, the interleaver is long enough to fully de-correlate consecutive symbol fadings. The receiver has perfect channel state information (CSI) $\{h_i\}$.
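
The channel model can be sketched in a few lines of standard-library Python. This is an illustrative simulation only: the helper name `rayleigh_channel`, the seed, the value $N_0 = 0.1$ and the constant test symbol are all assumptions made for the example.

```python
import random


def rayleigh_channel(symbols, n0, rng):
    """Return (h_i, y_i) pairs for y_i = h_i * s_i + w_i with i.i.d. h_i ~ CN(0, 1)."""
    out = []
    for s in symbols:
        # CN(0, 1): real and imaginary parts each have variance 1/2.
        h = complex(rng.gauss(0.0, 0.5 ** 0.5), rng.gauss(0.0, 0.5 ** 0.5))
        # CN(0, N0): real and imaginary parts each have variance N0/2.
        w = complex(rng.gauss(0.0, (n0 / 2) ** 0.5), rng.gauss(0.0, (n0 / 2) ** 0.5))
        out.append((h, h * s + w))
    return out


rng = random.Random(0)
symbol = 1 + 0j  # hypothetical unit-energy test symbol
pairs = rayleigh_channel([symbol] * 50_000, n0=0.1, rng=rng)

# Empirical checks: E[|h|^2] should be close to 1, the noise power close to N0.
mean_gain = sum(abs(h) ** 2 for h, _ in pairs) / len(pairs)
noise_power = sum(abs(y - h * symbol) ** 2 for h, y in pairs) / len(pairs)
print(f"E[|h|^2] ~ {mean_gain:.3f}, noise power ~ {noise_power:.3f}")
```

Returning $h_i$ alongside $y_i$ mirrors the perfect-CSI assumption: the receiver is handed the gain it needs for coherent detection.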

"Fully interleaved" is an idealisation: it assumes the interleaver length $N$ is infinite relative to the coherence time $T_c$ . Section 5 quantifies what happens when $N$ is finite.


Definition:
Diversity Order

The diversity order of a scheme over a fading channel is $d \triangleq \lim_{\text{SNR} \to \infty} \frac{-\log P_e(\text{SNR})}{\log \text{SNR}},$ provided the limit exists. Operationally, $d$ is the magnitude of the high-SNR slope of the log–log BER curve: a plot of $\log_{10} \text{BER}$ against $\text{SNR}$ in dB becomes a straight line at high SNR that drops $d$ decades per 10 dB, i.e. has slope $-d$. The coding gain $\gamma_c$ is the horizontal (SNR) offset of that line relative to a reference curve with the same slope.

On AWGN the diversity order is infinite (the BER decays exponentially, not polynomially) — $d$ is a fading-specific concept. For an uncoded system with a single antenna, $d = 1$ (one Rayleigh fade can wipe out the symbol). To exceed $d = 1$ one must spread each information bit across multiple independent fading realisations — which is exactly what a binary code with $d_H$ bit-disagreements does when bits are fully interleaved.
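
The definition can be checked numerically with a finite-difference slope on log–log axes. The sketch below uses two stand-in error-probability models, $(1 + \alpha\,\text{SNR})^{-d}$ for fading and $\exp(-\alpha\,\text{SNR})$ for AWGN. The value of $\alpha$ and the SNR windows are arbitrary choices for illustration.

```python
import math


def slope_estimate(pe, snr_db_lo, snr_db_hi):
    """Finite-difference estimate of -d(log Pe)/d(log SNR) between two SNRs in dB."""
    snr_lo = 10 ** (snr_db_lo / 10)
    snr_hi = 10 ** (snr_db_hi / 10)
    return -(math.log(pe(snr_hi)) - math.log(pe(snr_lo))) / (
        math.log(snr_hi) - math.log(snr_lo)
    )


def fading_model(d, alpha=0.25):
    """Hypothetical fading error model with diversity order d."""
    return lambda snr: (1.0 + alpha * snr) ** (-d)


def awgn_model(alpha=0.25):
    """Hypothetical AWGN error model with exponential decay."""
    return lambda snr: math.exp(-alpha * snr)


for d in (1, 2, 4):
    print("fading, d =", d, "-> slope", round(slope_estimate(fading_model(d), 40, 50), 3))

for lo, hi in ((10, 15), (15, 20), (20, 25)):
    print(f"AWGN, {lo}-{hi} dB -> slope", round(slope_estimate(awgn_model(), lo, hi), 2))
```

The fading slopes settle at $d$, whereas the AWGN slope keeps growing with the SNR window. This is the sense in which the diversity order is infinite on AWGN.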

Definition:
Minimum Number of Distinct Bit Positions $L_{\min}(\mu)$

For a labelling $\mu : \{0,1\}^L \to \mathcal{X}$, a bit position $j \in \{0, \ldots, L-1\}$ and bit values $b \ne \hat b$, define $\ell(\mu, j, b, \hat b) \triangleq \#\{\text{distinct constellation pairs } (s, \hat s) : \mu^{-1}(s) \text{ has bit value } b \text{ and } \mu^{-1}(\hat s) \text{ has bit value } \hat b \text{ at position } j\}$ and the minimum number of distinct bit positions $L_{\min}(\mu) \triangleq \min_{j \in \{0, \ldots, L-1\}} \min_{b \ne \hat b} \ell(\mu, j, b, \hat b).$ Informally, $L_{\min}(\mu)$ counts the smallest number of distinct constellation positions that differ in a single label bit across label pairs, minimised over bit positions and bit-value flips.

For most "reasonable" labellings on symmetric QAM/PSK constellations, $L_{\min}(\mu) = 1$ — a single bit flip often corresponds to a single nearest-neighbour move. Both Gray and Set-Partition labellings on square QAM achieve $L_{\min} = 1$ . A labelling with $L_{\min} = 2$ would need every single bit flip to move across at least two distinct constellation positions simultaneously, a rare property.

Theorem: BICM Diversity Order on Fully-Interleaved Rayleigh Fading

Consider BICM with labelling $\mu$ and binary code of minimum Hamming distance $d_H$ on a fully-interleaved Rayleigh fading channel with receiver CSI. For any two codewords $\mathbf{c}, \hat{\mathbf{c}}$ at Hamming distance $d$ , the PEP is bounded at high SNR by $P(\mathbf{c} \to \hat{\mathbf{c}} \mid \text{fading}) \le \left(\frac{1}{1 + \frac{\text{SNR}}{4} \ell(\mu) d^2_{\rm avg,fad}(\mu)}\right)^{d \cdot L_{\min}(\mu)}$ up to constants, where $\ell(\mu)$ and $d^2_{\rm avg, fad}(\mu)$ are fading-specific averaging quantities. Consequently the diversity order of the BICM codeword error probability is $\boxed{\; d_{\rm BICM}(\mu) = d_H \cdot L_{\min}(\mu). \;}$ For Gray labelling on square QAM, $L_{\min}(\mu_G) = 1$ , so $d_{\rm BICM}(\mu_G) = d_H$ : the binary code's minimum Hamming distance converts directly into diversity order.

Here is the mental picture. The fully-interleaved channel turns the single fading channel into $d_H$ independent parallel sub-channels — one per differing coded bit. Each differing bit position is mapped to a label position with $L_{\min}(\mu)$ distinct constellation choices; in the Gray case, $L_{\min} = 1$ means each such bit goes through a single fading realisation. The diversity order is the number of independent fadings the error event must jointly survive, which is $d_H \cdot L_{\min}(\mu)$ .

Compare this with the CM (no-BICM) case: on fading, TCM with SP labelling achieves diversity $\min(d_H, L_{\rm free})$ where $L_{\rm free}$ is the trellis free distance in symbols. BICM with Gray achieves exactly $d_H$ — it trades the SP geometric structure for a simpler, more interleaver-friendly one.

Hint

Start from the conditional AWGN PEP (Thm. 1 of s01) applied to a single fading realisation, with effective SNR $|h|^2 \text{SNR}$: the fading gain scales the received signal power, and $N_0$ is already contained in $\text{SNR}$.

Average over the Rayleigh fading: $\mathbb{E}_{|h|^2}[\exp(-\alpha |h|^2 \text{SNR})] = 1/(1 + \alpha \text{SNR})$ for exponential $|h|^2$ with unit mean.

Under the fully-interleaved assumption, the $d_H$ differing coded bits see INDEPENDENT $|h_i|^2$ realisations; the joint PEP becomes a product of $d_H$ such factors.

Within each bit channel $\ell$ , the expectation further factorises over label-pair alternatives. Apply Jensen to get a lower bound on the exponent involving $L_{\min}(\mu)$ .

The high-SNR behaviour of $(1 + \alpha \text{SNR})^{-k}$ is $\text{SNR}^{-k}$ , so the diversity exponent is the total count of factors: $d_H \cdot L_{\min}(\mu)$ .

Proof

Step 1: Conditional PEP given fading

Condition on the fading realisation $\mathbf{h} = (h_1, \ldots, h_N)$ . By Thm. 1 of s01 applied to the equivalent AWGN channel with per-symbol SNR $|h_i|^2 \text{SNR}$ , the conditional PEP for $\mathbf{c} \to \hat{\mathbf{c}}$ at Hamming distance $d$ is bounded by $P(\mathbf{c} \to \hat{\mathbf{c}} \mid \mathbf{h}) \le \prod_{i : c_i \ne \hat c_i} \exp\!\left(-\frac{|h_i|^2}{4 N_0} \cdot d^2_\ell(\mu, i)\right),$ where $d^2_\ell(\mu, i)$ is the conditional squared distance for the label pair visited by the interleaver at time $i$ .

Step 2: Averaging over Rayleigh fading

Take expectation over the i.i.d. unit-mean exponential $|h_i|^2$ . Using the MGF $\mathbb{E}[\exp(-\alpha |h|^2)] = 1/(1 + \alpha)$ for $\alpha > 0$ , each factor contributes $\mathbb{E}\!\left[\exp\!\left(-\frac{|h_i|^2 d^2_\ell(\mu, i)}{4N_0}\right)\right] = \frac{1}{1 + \frac{d^2_\ell(\mu, i)}{4 N_0}}.$ At high SNR (small $N_0$ ) this is approximately $4 N_0 / d^2_\ell(\mu, i)$ , i.e., decays as $\text{SNR}^{-1}$ per bit position.
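
The MGF identity used in this step is easy to sanity-check by Monte Carlo. The following sketch draws unit-mean exponential gains with `random.expovariate` and compares the sample mean of $\exp(-\alpha |h|^2)$ with $1/(1 + \alpha)$. The sample size, seed and values of $\alpha$ are arbitrary.

```python
import math
import random


def mgf_monte_carlo(alpha, n_trials, rng):
    """Estimate E[exp(-alpha * |h|^2)] for |h|^2 ~ Exp(1)."""
    total = 0.0
    for _ in range(n_trials):
        gain = rng.expovariate(1.0)  # unit-mean exponential sample of |h|^2
        total += math.exp(-alpha * gain)
    return total / n_trials


rng = random.Random(1)
for alpha in (0.5, 2.0, 10.0):
    estimate = mgf_monte_carlo(alpha, 200_000, rng)
    print(f"alpha={alpha:5.1f}   Monte Carlo {estimate:.4f}   closed form {1 / (1 + alpha):.4f}")
```

In the proof, $\alpha$ plays the role of $d^2_\ell(\mu, i)/(4N_0)$, so large $\alpha$ corresponds to high SNR and the closed form approaches $1/\alpha$.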

Step 3: Subset averaging inside a bit position

For a given bit position $\ell$ and a given bit flip $b \to \hat b$ , the "distance" $d^2_\ell(\mu, i)$ depends on which specific label pair $(s, \hat s) \in \mathcal{X}_\ell^{(b)} \times \mathcal{X}_\ell^{(\hat b)}$ was visited. A second averaging over these possibilities gives $\mathbb{E}_s\!\left[\frac{1}{1 + \frac{d^2_\ell(\mu, i)}{4 N_0}}\right] = \frac{1}{|\mathcal{X}_\ell^{(b)}|} \sum_{s \in \mathcal{X}_\ell^{(b)}} \prod_{\hat s \ne s} \left(1 + \frac{\|s - \hat s\|^2}{4 N_0}\right)^{-1}.$ At high SNR, the TOP $L_{\min}(\mu)$ factors of this product decay as $\text{SNR}^{-1}$ each, contributing $\text{SNR}^{-L_{\min}(\mu)}$ per bit position. Positions with fewer distinct alternatives (i.e., $\ell(\mu, b, \hat b) < L_{\min}$ ) do not exist by definition of $L_{\min}$ .

Step 4: Combine over $d$ bit disagreements

There are $d$ bit positions at which $\mathbf{c}$ and $\hat{\mathbf{c}}$ disagree, and by the fully-interleaved assumption each sees INDEPENDENT fading and INDEPENDENT label-pair averaging. Multiplying: $P(\mathbf{c} \to \hat{\mathbf{c}}) \le C(\mu) \cdot \text{SNR}^{-d \cdot L_{\min}(\mu)},$ where $C(\mu)$ is a labelling-dependent constant (the coding gain).

Step 5: Minimise over $d$ for the codeword error probability

The codeword error probability is dominated at high SNR by the pairs at the minimum Hamming distance $d = d_H$ . Therefore $P_e \approx \text{const} \cdot \text{SNR}^{-d_H \cdot L_{\min}(\mu)},$ which is the stated formula. $\blacksquare$
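
As a numerical companion to Steps 2 to 5, the sketch below multiplies one factor $1/(1 + \text{SNR}\, d_i^2/4)$ per differing position and compares the product with its high-SNR asymptote $C \cdot \text{SNR}^{-d}$, where $C = \prod_i 4/d_i^2$. It assumes $\text{SNR} = 1/N_0$ (unit symbol energy) and $L_{\min} = 1$. The squared distances in `SQ_DISTANCES` are made-up values, not taken from any particular labelling.

```python
def pep_bound(snr, sq_distances):
    """Product over differing positions of 1 / (1 + SNR * d_i^2 / 4)."""
    bound = 1.0
    for d2 in sq_distances:
        bound /= 1.0 + snr * d2 / 4.0
    return bound


def asymptote(snr, sq_distances):
    """High-SNR form C * SNR^(-d), with C the product of 4 / d_i^2."""
    const = 1.0
    for d2 in sq_distances:
        const *= 4.0 / d2
    return const * snr ** (-len(sq_distances))


SQ_DISTANCES = [0.4, 0.4, 1.6, 0.4, 1.6]  # hypothetical: d = 5 differing positions

for snr_db in (10, 30, 50):
    snr = 10 ** (snr_db / 10)
    ratio = pep_bound(snr, SQ_DISTANCES) / asymptote(snr, SQ_DISTANCES)
    print(f"{snr_db} dB   bound/asymptote = {ratio:.4f}")
```

The ratio tends to 1 from below. The exponent is the number of factors, and the individual distances enter only through the constant, which is the split between diversity and coding gain described in the proof.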


CommIT Contribution (1998)

BICM Diversity Order on Fading Channels

G. Caire, G. Taricco, E. Biglieri — IEEE Trans. Inf. Theory, vol. 44, no. 3, pp. 927–946

Sections V and VI of the Caire–Taricco–Biglieri 1998 paper establish the diversity formula $d_{\rm BICM} = d_H \cdot L_{\min}(\mu)$ for BICM on fully-interleaved Rayleigh fading. Here $d_H$ is the minimum Hamming distance of the binary code and $L_{\min}(\mu)$ is the minimum, over bit positions $\ell$ , of the minimum number of DISTINCT constellation positions whose labels differ in bit $\ell$ between opposite bit values $b, \hat b$ .

For square QAM with Gray labelling, $L_{\min}(\mu_G) = 1$ , and the formula collapses to $d_{\rm BICM} = d_H$ : the binary code's Hamming distance IS the diversity order, directly. This is the clean operational conclusion that allows modern wireless standards to use a single LDPC or convolutional code across QAM orders without redesigning the fading performance: the code contributes $d_H$ , the Gray mapper contributes $L_{\min} = 1$ , and the product is the fading slope.

The theorem tells the code designer: on fully-interleaved fading, BICM with Gray labelling has exactly the same diversity as a binary code running over a parallel-channel model. No geometric labelling trickery is needed. This clarity is what made the BICM framework — rather than multilevel coding or trellis-coded modulation — the de facto standard for wireless communications after 2000. The result is among the most operationally important theorems in the theory of coded modulation, and the paper that introduced it remains one of the most-cited papers in IEEE Trans. Information Theory.


Key Takeaway

The BICM diversity order is $d_{\rm BICM}(\mu) = d_H \cdot L_{\min}(\mu)$. On a fully-interleaved Rayleigh fading channel, the high-SNR slope of the BICM BER curve is determined by the product of the binary code's minimum Hamming distance and the labelling's minimum-distinct-bit-position count. For Gray labelling on square QAM, $L_{\min}(\mu_G) = 1$ and the slope equals $d_H$. Design rule on fading: pick a code with large $d_H$. Among labellings with the same $L_{\min}$ (Gray and Set-Partition on square QAM both have $L_{\min} = 1$), the labelling choice affects only the coding gain, not the diversity slope.

BICM BER on Rayleigh Fading: Diversity Slopes $d_H = 1, 2, \ldots, 10$

Average BER of BICM on fully-interleaved Rayleigh fading, computed from the MGF-based PEP bound, for a family of binary codes with minimum Hamming distances $d_H \in \{1, \ldots, d_{H,\max}\}$ under Gray labelling ( $L_{\min} = 1$ ). Each curve is asymptotically a straight line of slope $-d_H$ on the log–log plot — this is the diversity theorem made visible. Compare the curves to see that doubling $d_H$ doubles the diversity slope; increasing $M$ ( $\text{QAM}$ size) hurts the coding gain but does NOT change the slope.

Parameters: maximum $d_H$ shown (10); QAM size.

Gray vs Set-Partition on Rayleigh: Diversity Slope Is the Same

At fixed binary code $d_H$, plot the BICM BER on fully-interleaved Rayleigh fading for Gray and Set-Partition labellings on 16-QAM. Both curves have slope $-d_H$ at high SNR (since $L_{\min}(\mu_G) = L_{\min}(\mu_{\rm SP}) = 1$ for 16-QAM), but Gray is a constant number of dB better because its coding gain is higher. This is a different story from AWGN, where there is no finite slope to compare and Gray wins on the exponent instead, and different from BICM-ID (Ch. 8), where SP wins at low BER once iterative feedback is added.

Parameters: code Hamming distance $d_H$ (5).
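
The "same slope, constant dB offset" behaviour can be illustrated with a toy model. Assume two schemes whose BER bounds are $(1 + \alpha\,\text{SNR})^{-d_H}$ with the same $d_H$ but different $\alpha$. The two $\alpha$ values below are placeholders and have not been computed for Gray or Set-Partition 16-QAM; only the qualitative conclusion carries over.

```python
import math


def required_snr_db(target_ber, alpha, d):
    """SNR in dB at which (1 + alpha * SNR)^(-d) equals target_ber."""
    snr = (target_ber ** (-1.0 / d) - 1.0) / alpha
    return 10 * math.log10(snr)


ALPHA_A, ALPHA_B = 0.10, 0.04  # hypothetical coding-gain constants (A is the better scheme)
D_H = 5

for target in (1e-3, 1e-5, 1e-7):
    gap_db = required_snr_db(target, ALPHA_B, D_H) - required_snr_db(target, ALPHA_A, D_H)
    print(f"target BER {target:.0e}: scheme B needs {gap_db:.2f} dB more than scheme A")
```

In this model the gap equals $10 \log_{10}(\alpha_A/\alpha_B)$ at every target BER, so the two curves are horizontal translates of each other.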

Example: Computing $L_{\min}(\mu_G)$ for Gray 16-QAM

Show directly from the definition that $L_{\min}(\mu_G) = 1$ for square Gray-labelled 16-QAM, and compute $d_{\rm BICM}(\mu_G)$ for a rate- $1/2$ , constraint-length-5 convolutional code with free distance $d_{H,{\rm free}} = 7$ .

Solution

Gray 16-QAM labelling structure

Arrange the 16 constellation points on a $4 \times 4$ grid with the standard Gray-coded PAM labelling on each axis. Bit 0 (I-axis LSB) flips between "left outer" and "left inner" — specifically, the constellation points $(-3, Q)$ and $(-1, Q)$ have bit 0 opposite, for every $Q \in \{-3, -1, 1, 3\}$ .

Find a bit position with a single-distinct-pair flip

Look at bit 0 of the I-axis label. Under the standard Gray-PAM labelling the two outer levels share one value of this bit and the two inner levels share the other: for $b = 0$ we have the set $\{(-3, Q), (3, Q) : Q\}$; for $\hat b = 1$, the set $\{(-1, Q), (1, Q) : Q\}$. For the specific points $s = (-3, Q), \hat s = (-1, Q)$ (same $Q$, adjacent I-axis levels whose labels differ in bit 0) we have only one pair at the nearest-neighbour distance $d_{\min} = 2$. Therefore $\ell(\mu_G, 0, b=0, \hat b = 1) = 1$ for this specific bit-flip direction involving nearest neighbours.

Verify for other bit positions

By the symmetry of the Gray-PAM labelling, the same single-pair condition holds at bits 1, 2, 3 (with appropriate relabelling for the Q-axis). Hence $L_{\min}(\mu_G) = \min_\ell \ell(\mu_G, \ell, b, \hat b) = 1$ .
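
The labelling structure can be enumerated directly. The sketch below assumes the common Gray assignment 00, 01, 11, 10 for the PAM levels $-3, -1, 1, 3$ and a label order (I-MSB, I-LSB, Q-MSB, Q-LSB). Both are conventions chosen for the example, and the bit indices in the code follow that tuple order. It lists the subsets for the I-axis LSB and the smallest distance between opposite-bit subsets at every position. It does not compute $L_{\min}$ itself.

```python
from itertools import product

# Assumed 2-bit Gray labelling of the 4-PAM levels, written as (MSB, LSB).
GRAY_PAM = {-3: (0, 0), -1: (0, 1), 1: (1, 1), 3: (1, 0)}


def gray_16qam():
    """Map each point (I, Q) to a 4-bit label (I-MSB, I-LSB, Q-MSB, Q-LSB)."""
    return {(i, q): GRAY_PAM[i] + GRAY_PAM[q] for i, q in product(GRAY_PAM, repeat=2)}


def subset(labelling, position, value):
    """Constellation points whose label has the given bit value at the given position."""
    return sorted(p for p, bits in labelling.items() if bits[position] == value)


def min_cross_distance(labelling, position):
    """Smallest Euclidean distance between the bit=0 and bit=1 subsets."""
    zeros = subset(labelling, position, 0)
    ones = subset(labelling, position, 1)
    return min(
        ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5 for a in zeros for b in ones
    )


labelling = gray_16qam()
I_LSB = 1  # index of the I-axis LSB in the assumed label order
print("I-LSB = 0 at I levels:", sorted({p[0] for p in subset(labelling, I_LSB, 0)}))
print("I-LSB = 1 at I levels:", sorted({p[0] for p in subset(labelling, I_LSB, 1)}))
for position in range(4):
    print("bit", position, "min cross-subset distance:", min_cross_distance(labelling, position))
```

Every bit position has an opposite-bit pair at the nearest-neighbour distance 2, which is the geometric fact the symmetry argument relies on.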

Diversity order of the convolutional BICM

By Thm. 3, on fully-interleaved Rayleigh the convolutional-code + Gray-16-QAM BICM has diversity order $d_{\rm BICM} = d_{H,{\rm free}} \cdot L_{\min}(\mu_G) = 7 \cdot 1 = 7.$ This is the same diversity order as the binary code running over seven independent Rayleigh fades — the mapper contributes no additional geometric diversity, only coding gain.

Example: Diversity of a Rate-1/2, K=5 Convolutional Code with 64-QAM BICM

The rate-$1/2$, constraint-length-$5$ convolutional code with generators $(23, 35)_{\rm oct}$ has free distance $d_{H,{\rm free}} = 7$. It is paired with 64-QAM and Gray labelling in a BICM system. Compute the BICM diversity order on fully-interleaved Rayleigh fading, and estimate the SNR needed for $P_b \approx 10^{-5}$.

Solution

Diversity order

For Gray-labelled 64-QAM, $L_{\min}(\mu_G) = 1$ (same argument as the 16-QAM example: each bit position has nearest-neighbour pairs at the I- or Q-minimum-distance). Hence $d_{\rm BICM} = d_{H, {\rm free}} \cdot L_{\min}(\mu_G) = 7 \cdot 1 = 7.$

High-SNR BER estimate

The dominant-pair approximation gives $P_b \approx \frac{c_{d_{\rm free}}}{k} \cdot (4 \text{SNR})^{-d_{\rm free} L_{\min}},$ where $c_{d_{\rm free}}$ is the input-weight-at-free-distance coefficient (for this code, $c_7 = 4$, from the standard transfer function) and $k = 1$ (rate $1/2$, binary input). Setting $P_b = 10^{-5}$, $d_{\rm BICM} = 7$: $10^{-5} \approx 4 (4 \text{SNR})^{-7} \implies \text{SNR} \approx (4 \cdot 10^5)^{1/7} / 4 \approx 1.58 \text{ (linear)} \approx 2.0 \text{ dB},$ before the hidden constants are accounted for. In practice the hidden constants and 64-QAM spectral-efficiency penalty push the required $E_s/N_0$ to about $25$ dB at $10^{-5}$ BER, consistent with LTE 64-QAM field measurements in $\sim 100 \text{ km/h}$ mobility.

Comparison: what would diversity $d = 1$ cost?

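An uncoded 64-QAM would have diversity $d = 1$, so $P_b \approx C / (4 \text{SNR})$. For $P_b = 10^{-5}$ (taking $C \approx 1$) this requires $\text{SNR} \approx 10^{5}/4 \approx 44 \text{ dB}$, roughly $19$ dB worse than the $\sim 25$ dB of the coded system on the same constellation. The comparison is not at equal spectral efficiency: the rate-$1/2$ code halves the information bits per symbol. Even so, the convolutional code's $d_{H,{\rm free}} = 7$ buys a $\sim 19$ dB advantage at this BER, and the gap keeps widening as the target BER drops because the slopes differ. This is the entire value proposition of BICM on fading.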
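
The bare arithmetic behind both estimates can be reproduced by solving $P_b = c \cdot (4\,\text{SNR})^{-d}$ for $\text{SNR}$. This is a sketch of the dominant-term formula only: it ignores the hidden constants mentioned in the text, and it takes the uncoded multiplicity as 1, which is an assumption.

```python
import math


def required_snr_db(target_pb, multiplicity, diversity):
    """Solve target_pb = multiplicity * (4 * SNR)^(-diversity) for SNR, in dB."""
    four_snr = (multiplicity / target_pb) ** (1.0 / diversity)
    return 10 * math.log10(four_snr / 4.0)


coded_db = required_snr_db(1e-5, multiplicity=4, diversity=7)    # c_7 = 4, d = 7
uncoded_db = required_snr_db(1e-5, multiplicity=1, diversity=1)  # C = 1 assumed, d = 1

print(f"dominant-term estimate, d = 7: {coded_db:.2f} dB")
print(f"dominant-term estimate, d = 1: {uncoded_db:.2f} dB")
```

The $d = 7$ figure is only the asymptotic-formula value. The practical operating point quoted in the text is much higher because of the constants this formula leaves out.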

Common Mistake: The $\text{SNR}^{-d_H}$ Slope Only Shows at High SNR

Mistake:

A common trap in reading simulation curves is to look at BER $\sim 10^{-2}$ and try to read off the slope — then conclude "the code has diversity order 3" because that matches the slope locally.

Correction:

Thm. 3 predicts the ASYMPTOTIC slope: $d_{\rm BICM} = d_H \cdot L_{\min}(\mu)$ as $\text{SNR} \to \infty$ . At BERs around $10^{-2}$ the curve is still in the "waterfall" region where many error patterns contribute non-negligibly and the union bound is loose. The asymptotic slope becomes visible only in the "error floor" region, typically below $10^{-5}$ for practical codes. Simulation curves plotted in $\text{BER}\text{ vs }\text{SNR}_{\rm dB}$ log–dB coordinates should be checked at several SNR values to confirm the slope has settled.
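
A single-term model already shows how misleading a locally read slope can be. For $\text{BER} = (1 + \alpha\,\text{SNR})^{-d}$ the exact log–log slope magnitude is $d \cdot \alpha\,\text{SNR}/(1 + \alpha\,\text{SNR})$, which reaches $d$ only when $\alpha\,\text{SNR} \gg 1$. The sketch below evaluates it for $d = 7$. The model is a deliberate simplification that ignores the multiple error patterns a real code contributes.

```python
def ber(d, alpha_snr):
    """Single-term model BER = (1 + alpha*SNR)^(-d)."""
    return (1.0 + alpha_snr) ** (-d)


def local_slope(d, alpha_snr):
    """Exact magnitude of d(log BER)/d(log SNR) for the single-term model."""
    return d * alpha_snr / (1.0 + alpha_snr)


for x in (0.75, 3.0, 10.0, 100.0, 1000.0):
    print(f"alpha*SNR = {x:7.2f}   BER = {ber(7, x):.2e}   local slope = {local_slope(7, x):.2f}")
```

At $\alpha\,\text{SNR} = 0.75$ the BER is about $2 \cdot 10^{-2}$ and the local slope is exactly 3, even though the true diversity order of the model is 7.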

Why This Matters: From Convolutional Codes to Modern LDPC: Diversity Still Rules

The diversity theorem in this section uses minimum Hamming distance $d_H$, which was the right invariant for convolutional codes in the 1990s, where $d_{H,{\rm free}} = 7$ or $10$ was the design target. Modern wireless standards use turbo codes (LTE data channels), LDPC codes (5G NR data channels, Wi-Fi) and polar codes (5G NR control channels), where attention shifts from a single minimum distance to the low-weight codewords counted by the weight enumerator $W_d$. The same formula still applies operationally: the asymptotic diversity slope is governed by the smallest $d$ with $W_d > 0$, which for a well-designed LDPC code is typically $15$–$50$. The product $d_H \cdot L_{\min}$ stays the right invariant, and the fact that $L_{\min} = 1$ for Gray QAM is why LDPC-BICM achieves essentially the LDPC's own diversity regardless of QAM order. This modularity ("plug any good binary code into the Gray-QAM BICM frontend") is the engineering payoff of the diversity theorem.

Quick Check

For Gray-labelled 64-QAM BICM with a rate- $1/3$ , constraint-length- $7$ convolutional code ( $d_{H, {\rm free}} = 15$ ) on fully-interleaved Rayleigh fading, what is the diversity order $d_{\rm BICM}$ ?

$3$

$6$

$15$

$90$

Correction:

15

By Thm. 3, $d_{\rm BICM} = d_{H, {\rm free}} \cdot L_{\min}(\mu_G) = 15 \cdot 1 = 15$ on Gray-labelled 64-QAM (the fact that $L_{\min}(\mu_G) = 1$ for any square QAM is the reason QAM order doesn't matter for diversity).

Diversity Order

The high-SNR slope of a fading-channel error-probability curve: $d = \lim_{\text{SNR} \to \infty} -\log P_e/\log \text{SNR}$ . For BICM with labelling $\mu$ and binary code of minimum Hamming distance $d_H$ on fully-interleaved Rayleigh fading, $d = d_H \cdot L_{\min}(\mu)$ .

Coding Gain

At fixed diversity order $d$ , the horizontal (SNR-axis) offset between the BER curve of a coded scheme and an uncoded reference with the same diversity $d$ , both plotted in dB–dB coordinates. Measured in dB. For BICM, coding gain depends on labelling through $d^2_{\rm avg}(\mu)$ and through the multiplicities $W_d$ of the binary code's weight spectrum.
