BICM on Fading Channels: The Diversity Theorem

From AWGN to Fading: What Changes

Everything in s01–s02 was AWGN: a deterministic channel where the PEP decays as $\exp(-\alpha E_s/N_0)$ with a constant $\alpha$, and where "bigger $\alpha$" is the only design knob. On a fading channel, the physics is different: the received signal is $y = h \cdot s + w$ with random amplitude $|h|$, and when $|h|$ is small the entire transmission is lost. The PEP is then an EXPECTATION over the fading realisation, and its high-SNR behaviour is not exponential but power-law: $P_e \approx \gamma_c^{-1} \cdot \text{SNR}^{-d}$. The exponent $d$ is the diversity order: the number of independent fading samples that must all be bad for a decoding error to occur.

The central result of this section, and arguably the central result of the entire Caire–Taricco–Biglieri 1998 paper, is an exact formula for the diversity order of BICM:
$$d_{\rm BICM}(\mu) = d_H \cdot L_{\min}(\mu).$$
This formula replaces $d^2_{\rm avg}(\mu)$ as the primary design criterion on fading. It tells the designer: pick a binary code with large $d_H$ and a labelling with large $L_{\min}$. The "diversity product" is what you want to maximise, not the AWGN exponent.

The point is that on fading, capacity alone does not tell the full story. Even if BICM's capacity is close to CM capacity, the operating SNR at a target BER depends on the DIVERSITY order achievable. The formula above is the design criterion, and it factors cleanly into "binary code" and "labelling" contributions — which is exactly what makes BICM modular.

Definition:

Fully-Interleaved Rayleigh Fading Channel

The fully-interleaved Rayleigh fading channel has input-output relation at symbol time $i$
$$y_i = h_i \cdot s_i + w_i,$$
where $s_i \in \mathcal{X}$ is the transmitted constellation symbol, $h_i \in \mathbb{C}$ is the channel gain, and $w_i \sim \mathcal{CN}(0, N_0)$ is complex Gaussian noise. The fading amplitudes satisfy $|h_i|^2 \sim \text{Exp}(1)$ (unit-mean exponential, the squared magnitude of a unit-variance $\mathcal{CN}$ random variable), and the sequence $\{h_i\}$ is i.i.d., i.e., the interleaver is long enough to fully de-correlate consecutive symbol fadings. The receiver has perfect channel state information (CSI) $\{h_i\}$.

"Fully interleaved" is an idealisation: it assumes the interleaver length $N$ is infinite relative to the coherence time $T_c$. Section 5 quantifies what happens when $N$ is finite.


Definition:

Diversity Order

The diversity order of a scheme over a fading channel is
$$d \triangleq \lim_{\text{SNR} \to \infty} \frac{-\log P_e(\text{SNR})}{\log \text{SNR}},$$
provided the limit exists. Operationally, $d$ is the high-SNR slope of the log–log BER curve. A BER versus $\text{SNR}$ plot in dB–dB coordinates becomes a straight line at high SNR with slope $-d$. The coding gain $\gamma_c$ is the horizontal (SNR) offset of that line relative to a reference curve with the same slope.

On AWGN the diversity order is infinite (the BER decays exponentially, not polynomially), so $d$ is a fading-specific concept. For an uncoded system with a single antenna, $d = 1$ (one Rayleigh fade can wipe out the symbol). To exceed $d = 1$ one must spread each information bit across multiple independent fading realisations, which is exactly what a binary code with $d_H$ bit-disagreements does when bits are fully interleaved.
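The contrast between exponential and power-law decay can be checked numerically. The sketch below uses the standard closed-form BER expressions for uncoded BPSK with coherent detection, on AWGN and on Rayleigh fading with receiver CSI. BPSK is chosen here only as a convenient illustration, and the function names are illustrative rather than taken from the text.

```python
import math

def ber_bpsk_awgn(snr):
    """Uncoded BPSK on AWGN: Q(sqrt(2*SNR)) = 0.5*erfc(sqrt(SNR)), SNR linear."""
    return 0.5 * math.erfc(math.sqrt(snr))

def ber_bpsk_rayleigh(snr):
    """Uncoded BPSK on Rayleigh fading with perfect CSI (closed form)."""
    return 0.5 * (1.0 - math.sqrt(snr / (1.0 + snr)))

def local_slope(ber_fn, snr_db_lo, snr_db_hi):
    """Decades of BER lost per decade of SNR between two SNR points (in dB)."""
    lo = 10.0 ** (snr_db_lo / 10.0)
    hi = 10.0 ** (snr_db_hi / 10.0)
    decades = (snr_db_hi - snr_db_lo) / 10.0
    return (math.log10(ber_fn(lo)) - math.log10(ber_fn(hi))) / decades

# Rayleigh: the slope settles at 1, i.e. diversity order d = 1.
print("Rayleigh 30-40 dB:", local_slope(ber_bpsk_rayleigh, 30, 40))
# AWGN: the local slope keeps growing with SNR, so no finite d exists.
print("AWGN 0-5 dB:", local_slope(ber_bpsk_awgn, 0, 5))
print("AWGN 8-13 dB:", local_slope(ber_bpsk_awgn, 8, 13))
```

The AWGN slope is evaluated only at moderate SNR because the BER underflows double precision long before 30 dB, which is itself a symptom of the exponential decay.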

Definition:

Minimum Number of Distinct Bit Positions $L_{\min}(\mu)$

For a labelling $\mu : \{0,1\}^L \to \mathcal{X}$, define
$$\ell(\mu, b, \hat b) \triangleq \#\{\text{DISTINCT constellation pairs } (s, \hat s) : \mu^{-1}(s) \text{ has bit value } b, \; \mu^{-1}(\hat s) \text{ has bit value } \hat b, \text{ at the same position}\}$$
and the minimum number of distinct bit positions
$$L_{\min}(\mu) \triangleq \min_{\ell \in \{0, \ldots, L-1\}} \min_{b \ne \hat b} \ell(\mu, b, \hat b).$$
Informally, $L_{\min}(\mu)$ counts the smallest number of distinct constellation positions that differ in a single label bit across label pairs, minimised over bit positions and bit-value flips.

For most "reasonable" labellings on symmetric QAM/PSK constellations, $L_{\min}(\mu) = 1$: a single bit flip often corresponds to a single nearest-neighbour move. Both Gray and Set-Partition labellings on square QAM achieve $L_{\min} = 1$. A labelling with $L_{\min} = 2$ would need every single bit flip to move across at least two distinct constellation positions simultaneously, a rare property.

Theorem: BICM Diversity Order on Fully-Interleaved Rayleigh Fading

Consider BICM with labelling $\mu$ and binary code of minimum Hamming distance $d_H$ on a fully-interleaved Rayleigh fading channel with receiver CSI. For any two codewords $\mathbf{c}, \hat{\mathbf{c}}$ at Hamming distance $d$, the PEP is bounded at high SNR by
$$P(\mathbf{c} \to \hat{\mathbf{c}} \mid \text{fading}) \le \left(\frac{1}{1 + \frac{\text{SNR}}{4} \ell(\mu) d^2_{\rm avg,fad}(\mu)}\right)^{d \cdot L_{\min}(\mu)}$$
up to constants, where $\ell(\mu)$ and $d^2_{\rm avg,fad}(\mu)$ are fading-specific averaging quantities. Consequently the diversity order of the BICM codeword error probability is
$$\boxed{\; d_{\rm BICM}(\mu) = d_H \cdot L_{\min}(\mu). \;}$$
For Gray labelling on square QAM, $L_{\min}(\mu_G) = 1$, so $d_{\rm BICM}(\mu_G) = d_H$: the binary code's minimum Hamming distance converts directly into diversity order.

Here is the mental picture. The fully-interleaved channel turns the single fading channel into $d_H$ independent parallel sub-channels, one per differing coded bit. Each differing bit position is mapped to a label position with $L_{\min}(\mu)$ distinct constellation choices; in the Gray case, $L_{\min} = 1$ means each such bit goes through a single fading realisation. The diversity order is the number of independent fadings the error event must jointly survive, which is $d_H \cdot L_{\min}(\mu)$.
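The averaging step behind this picture can be sketched in a few lines. This is a simplified outline rather than the paper's full proof: it assumes a Chernoff-type conditional bound and, purely for illustration, the same normalised squared distance $\delta^2$ (a placeholder symbol not defined in the text) at each of the $d$ differing positions, with $L_{\min} = 1$.

```latex
% Conditional on the fades: Chernoff-type bound over the d differing positions
% (\delta^2 is an illustrative per-position squared distance, normalised to E_s)
P(\mathbf{c} \to \hat{\mathbf{c}} \mid h_1, \ldots, h_d)
  \le \prod_{k=1}^{d} \exp\!\left(-\frac{\text{SNR}}{4}\, |h_k|^2\, \delta^2\right)

% Average one factor over |h_k|^2 \sim \text{Exp}(1)
\mathbb{E}\!\left[e^{-a |h_k|^2}\right]
  = \int_0^\infty e^{-x}\, e^{-a x}\, dx = \frac{1}{1 + a},
  \qquad a = \frac{\text{SNR}}{4}\, \delta^2

% Independence across the d fades gives a product, then take SNR large
P(\mathbf{c} \to \hat{\mathbf{c}})
  \le \left(\frac{1}{1 + \frac{\text{SNR}}{4}\, \delta^2}\right)^{d}
  \approx \left(\frac{\delta^2}{4}\right)^{-d} \text{SNR}^{-d}
```

Each independent fade contributes one factor of $\text{SNR}^{-1}$, so the exponent counts the fades the error event spans, and the constant in front plays the role of the coding gain.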

Compare this with the CM (no-BICM) case: on fading, TCM with SP labelling achieves diversity $\min(d_H, L_{\rm free})$ where $L_{\rm free}$ is the trellis free distance in symbols. BICM with Gray achieves exactly $d_H$: it trades the SP geometric structure for a simpler, more interleaver-friendly one.

CommIT Contribution (1998)

BICM Diversity Order on Fading Channels

G. Caire, G. Taricco, E. Biglieri, IEEE Trans. Inf. Theory, vol. 44, no. 3, pp. 927–946

Sections V and VI of the Caire–Taricco–Biglieri 1998 paper establish the diversity formula $d_{\rm BICM} = d_H \cdot L_{\min}(\mu)$ for BICM on fully-interleaved Rayleigh fading. Here $d_H$ is the minimum Hamming distance of the binary code and $L_{\min}(\mu)$ is the minimum, over bit positions $\ell$, of the minimum number of DISTINCT constellation positions whose labels differ in bit $\ell$ between opposite bit values $b, \hat b$.

For square QAM with Gray labelling, $L_{\min}(\mu_G) = 1$, and the formula collapses to $d_{\rm BICM} = d_H$: the binary code's Hamming distance IS the diversity order, directly. This is the clean operational conclusion that allows modern wireless standards to use a single LDPC or convolutional code across QAM orders without redesigning the fading performance: the code contributes $d_H$, the Gray mapper contributes $L_{\min} = 1$, and the product is the fading slope.

The theorem tells the code designer: on fully-interleaved fading, BICM with Gray labelling has exactly the same diversity as a binary code running over a parallel-channel model. No geometric labelling trickery is needed. This clarity is what made the BICM framework — rather than multilevel coding or trellis-coded modulation — the de facto standard for wireless communications after 2000. The result is among the most operationally important theorems in the theory of coded modulation, and the paper that introduced it remains one of the most-cited papers in IEEE Trans. Information Theory.


Key Takeaway

The BICM diversity order is $d_{\rm BICM}(\mu) = d_H \cdot L_{\min}(\mu)$. On a fully-interleaved Rayleigh fading channel, the high-SNR slope of the BICM BER curve is determined by the product of the binary code's minimum Hamming distance and the labelling's minimum-distinct-bit-position count. For Gray labelling on square QAM, $L_{\min}(\mu_G) = 1$ and the slope equals $d_H$. Design rule on fading: pick a code with large $d_H$. The labelling choice affects only the coding gain, not the diversity slope.

BICM BER on Rayleigh Fading: Diversity Slopes $d_H = 1, 2, \ldots, 10$

Average BER of BICM on fully-interleaved Rayleigh fading, computed from the MGF-based PEP bound, for a family of binary codes with minimum Hamming distances $d_H \in \{1, \ldots, d_{H,\max}\}$ under Gray labelling ($L_{\min} = 1$). Each curve is asymptotically a straight line of slope $-d_H$ on the log–log plot; this is the diversity theorem made visible. Compare the curves to see that doubling $d_H$ doubles the diversity slope; increasing $M$ (the QAM size) hurts the coding gain but does NOT change the slope.
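A rough stand-in for these curves can be produced from the product-form bound of the theorem. The sketch below is not the plotted MGF computation: it evaluates a single pairwise term and lumps the labelling-dependent factor into one constant `gain`, a hypothetical placeholder whose true value depends on the constellation and is not given in the text.

```python
import math

def log10_pep_bound(snr_db, d_h, gain=1.0, l_min=1):
    """log10 of (1 / (1 + SNR/4 * gain)) ** (d_h * l_min).

    `gain` is a hypothetical stand-in for l(mu) * d^2_avg,fad(mu).
    Working in log10 avoids underflow at large d_h and high SNR.
    """
    snr = 10.0 ** (snr_db / 10.0)
    return -d_h * l_min * math.log10(1.0 + 0.25 * snr * gain)

def slope(snr_db_lo, snr_db_hi, **kw):
    """Decades of error probability lost per decade of SNR."""
    decades = (snr_db_hi - snr_db_lo) / 10.0
    drop = log10_pep_bound(snr_db_lo, **kw) - log10_pep_bound(snr_db_hi, **kw)
    return drop / decades

for d_h in range(1, 11):
    # Same slope for both gains; only the horizontal position of the line moves.
    print(d_h,
          round(slope(50, 60, d_h=d_h, gain=1.0), 3),
          round(slope(50, 60, d_h=d_h, gain=0.1), 3))
```

Lowering `gain` mimics a denser constellation: the bound gets worse at every SNR, yet the high-SNR slope still equals $d_H$.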


Gray vs Set-Partition on Rayleigh: Diversity Slope Is the Same

At fixed binary code $d_H$, plot the BICM BER on fully-interleaved Rayleigh fading for Gray and Set-Partition labellings on 16-QAM. Both curves have slope $-d_H$ at high SNR (since $L_{\min}(\mu_G) = L_{\min}(\mu_{\rm SP}) = 1$ for 16-QAM), but Gray is a constant number of dB better because its coding gain is higher. This is a different story from AWGN, where there is no finite slope and Gray wins on the error exponent, and different from BICM-ID (Ch. 8), where SP wins at low BER once iterative feedback is added.


Example: Computing $L_{\min}(\mu_G)$ for Gray 16-QAM

Show directly from the definition that $L_{\min}(\mu_G) = 1$ for square Gray-labelled 16-QAM, and compute $d_{\rm BICM}(\mu_G)$ for a rate-$1/2$, constraint-length-5 convolutional code with free distance $d_{H,{\rm free}} = 7$.

Example: Diversity of a Rate-1/2, K=5 Convolutional Code with 64-QAM BICM

The well-known rate-$1/2$, constraint-length-$5$ convolutional code with generators $(23, 35)_{\rm oct}$ has free distance $d_{H,{\rm free}} = 7$. It is paired with 64-QAM and Gray labelling in a BICM system. Compute the BICM diversity order on fully-interleaved Rayleigh fading, and estimate the SNR needed for $P_b \approx 10^{-5}$.
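The free distance quoted here can be checked by a shortest-path search over the encoder's state graph, after which the diversity order follows from the formula $d_{\rm BICM} = d_H \cdot L_{\min}$. The sketch below is one way to do this, assuming a feedforward encoder with octal generators in the usual convention; `conv_free_distance` is an illustrative helper, not something from the source. It covers only the diversity part of the example, not the SNR estimate.

```python
import heapq

def conv_free_distance(generators_octal, constraint_length):
    """Free distance of a rate-1/n feedforward convolutional code.

    Runs Dijkstra over the encoder state graph, from the state reached by
    leaving the all-zero state with input 1 back to the all-zero state.
    """
    m = constraint_length - 1                      # encoder memory
    gens = [int(str(g), 8) for g in generators_octal]

    def step(state, bit):
        reg = (bit << m) | state                   # newest bit in the MSB
        weight = sum(bin(reg & g).count("1") % 2 for g in gens)
        return reg >> 1, weight                    # next state, output weight

    start, w0 = step(0, 1)
    best = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, s = heapq.heappop(heap)
        if s == 0:
            return w                               # first remerge is the lightest path
        if w > best[s]:
            continue                               # stale heap entry
        for bit in (0, 1):
            ns, dw = step(s, bit)
            if w + dw < best.get(ns, float("inf")):
                best[ns] = w + dw
                heapq.heappush(heap, (w + dw, ns))
    raise ValueError("no path returns to the all-zero state")

d_free = conv_free_distance((23, 35), 5)
l_min_gray = 1          # value stated in the text for Gray-labelled square QAM
print("d_free =", d_free, " d_BICM =", d_free * l_min_gray)
```

Because $L_{\min} = 1$ for Gray labelling, the constellation size (64-QAM here) does not enter the diversity computation at all; it affects only the coding gain.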

Common Mistake: The $\text{SNR}^{-d_H}$ Slope Only Shows at High SNR

Mistake:

A common trap in reading simulation curves is to look at BER $\sim 10^{-2}$ and try to read off the slope, then conclude "the code has diversity order 3" because that matches the slope locally.

Correction:

Thm. 3 predicts the ASYMPTOTIC slope: $d_{\rm BICM} = d_H \cdot L_{\min}(\mu)$ as $\text{SNR} \to \infty$. At BERs around $10^{-2}$ the curve is still in the "waterfall" region where many error patterns contribute non-negligibly and the union bound is loose. The asymptotic slope becomes visible only in the "error floor" region, typically below $10^{-5}$ for practical codes. Simulation curves plotted in BER vs $\text{SNR}_{\rm dB}$ log–dB coordinates should be checked at several SNR values to confirm the slope has settled.
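Even a single pairwise term shows why the slope settles late. The short calculation below is an illustration only: it takes one term of the product-form bound, with $a > 0$ an assumed constant lumping the labelling-dependent factors, and ignores the multiple error patterns mentioned above.

```latex
% Single-term model, with a > 0 an illustrative constant
P(\text{SNR}) = (1 + a\,\text{SNR})^{-d}

% Local slope on the log-log plot
-\frac{\mathrm{d} \log P}{\mathrm{d} \log \text{SNR}}
  = d \cdot \frac{a\,\text{SNR}}{1 + a\,\text{SNR}}

% Examples: a\,\text{SNR} = 1 gives slope d/2; a\,\text{SNR} = 9 gives slope 0.9\,d
```

The local slope approaches $d$ from below and reaches 90% of it only once $a\,\text{SNR} \ge 9$, so a slope read off at moderate SNR understates the true diversity order in this model.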

Why This Matters: From Convolutional Codes to Modern LDPC: Diversity Still Rules

The diversity theorem in this section uses minimum Hamming distance $d_H$, which was the right invariant for convolutional codes in the 1990s, where $d_{H,{\rm free}} = 7$ or $10$ was the design target. Modern wireless standards use turbo codes (LTE), LDPC codes (5G NR, Wi-Fi) and polar codes (5G NR control channels), where "minimum distance" is replaced by the lowest-weight codewords in the weight enumerator $W_d$. The same formula still applies operationally: the asymptotic diversity slope is governed by the smallest $d$ with $W_d > 0$, which for a well-designed LDPC code is typically $15$ to $50$. The product $d_H \cdot L_{\min}$ stays the right invariant, and the fact that $L_{\min} = 1$ for Gray QAM is why LDPC-BICM achieves essentially the LDPC's own diversity regardless of QAM order. This modularity ("plug any good binary code into the Gray-QAM BICM frontend") is the engineering payoff of the diversity theorem.

Quick Check

For Gray-labelled 64-QAM BICM with a rate-$1/3$, constraint-length-$7$ convolutional code ($d_{H,{\rm free}} = 15$) on fully-interleaved Rayleigh fading, what is the diversity order $d_{\rm BICM}$?

3

6

15

90

Diversity Order

The high-SNR slope of a fading-channel error-probability curve: $d = \lim_{\text{SNR} \to \infty} -\log P_e/\log \text{SNR}$. For BICM with labelling $\mu$ and binary code of minimum Hamming distance $d_H$ on fully-interleaved Rayleigh fading, $d = d_H \cdot L_{\min}(\mu)$.

Related: Coding Gain, BICM Codeword Pair and Hamming Distance, Rayleigh Fading

Coding Gain

At fixed diversity order $d$, the horizontal (SNR-axis) offset between the BER curve of a coded scheme and an uncoded reference with the same diversity $d$, both plotted in dB–dB coordinates. Measured in dB. For BICM, coding gain depends on labelling through $d^2_{\rm avg}(\mu)$ and through the multiplicities $W_d$ of the binary code's weight spectrum.

Related: Diversity Order, Average Squared Intra-Subset Euclidean Distance $d^2_{\rm avg}(\mu, \ell)$, Weight Enumerator and Input-Weight Multiplicities