Ferkans — Interactive Telecom Tutor

ex-ch06-01

Easy

Compute the Bhattacharyya factor $\beta$ for the BI-AWGN channel (BPSK over AWGN) with $E_s/N_0 = \gamma$ . Show that the PEP of a binary block code with minimum Hamming distance $d_H$ over this channel is upper-bounded by $\beta^{d_H}$ .

Show Hint

For BI-AWGN, $\beta = \int \sqrt{p(y|+\sqrt{E_s}) p(y|-\sqrt{E_s})} \,dy$ , which has a closed form in terms of the Gaussian pdf.

Complete the square in the Gaussian product inside the integral.

Solution

Direct computation of $\beta$

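With real-valued noise of variance $\sigma^2 = N_0/2$ , $\beta = \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y - \sqrt{E_s})^2 + (y + \sqrt{E_s})^2}{4\sigma^2}\right) dy.$ Since $(y - \sqrt{E_s})^2 + (y + \sqrt{E_s})^2 = 2y^2 + 2E_s$ , the integrand splits into a constant times a unit-mass Gaussian: $\beta = \exp(-E_s/(2\sigma^2)) \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp(-y^2/(2\sigma^2))\, dy = \exp(-E_s/N_0) = \exp(-\gamma).$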

Apply to $d_H$ disagreements

By independence of the AWGN noise samples, the Bhattacharyya bound factorises over the $d_H$ differing positions, so the PEP is upper-bounded by $\beta^{d_H} = \exp(-d_H \gamma)$ . The exact PEP is $Q(\sqrt{2 d_H \gamma})$ , which has the same exponential decay rate. $\blacksquare$
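The closed form $\beta = \exp(-\gamma)$ can be sanity-checked numerically. The sketch below is illustrative only: it assumes $E_s = 1$ and real noise variance $\sigma^2 = N_0/2 = 1/(2\gamma)$ , and the function name, grid size, and integration span are choices made here, not part of the exercise.

```python
import math

def bhattacharyya_biawgn(gamma, n_steps=20001, span=12.0):
    """Trapezoid estimate of the integral of sqrt(p(y|+a) p(y|-a)) for BPSK on real AWGN.

    Assumptions (not from the exercise): E_s = 1 and sigma^2 = N_0/2 = 1/(2*gamma).
    """
    a = 1.0
    sigma2 = 1.0 / (2.0 * gamma)
    sigma = math.sqrt(sigma2)
    lo, hi = -a - span * sigma, a + span * sigma
    h = (hi - lo) / (n_steps - 1)

    def pdf(y, mean):
        # Gaussian density with the given mean and variance sigma2
        return math.exp(-(y - mean) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

    total = 0.0
    for i in range(n_steps):
        y = lo + i * h
        weight = 0.5 if i in (0, n_steps - 1) else 1.0  # trapezoid end weights
        total += weight * math.sqrt(pdf(y, a) * pdf(y, -a))
    return total * h

beta_numeric = bhattacharyya_biawgn(1.0)   # gamma = 1 (0 dB)
beta_closed = math.exp(-1.0)
```

For $\gamma = 1$ the two values should agree to many decimal places; raising the result to the power $d_H$ gives the PEP bound of the second step.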

ex-ch06-02

Easy

Show that $Q(x) \le \frac{1}{2}\exp(-x^2/2)$ for all $x \ge 0$ , using the bound $\int_x^\infty e^{-t^2/2}\,dt \le \frac{1}{x}\int_x^\infty te^{-t^2/2}\,dt$ for $x > 0$ and direct integration.

Show Hint

$\int_x^\infty te^{-t^2/2} dt = e^{-x^2/2}$ .

Combine with the inequality $\int_x^\infty e^{-t^2/2} dt \le \int_x^\infty (t/x) e^{-t^2/2} dt$ .

Solution

Evaluate the integral

$\int_x^\infty t e^{-t^2/2} dt = [-e^{-t^2/2}]_x^\infty = e^{-x^2/2}$ .

Bound $Q$

For $x > 0$ , $1 \le t/x$ on $[x, \infty)$ , hence $\int_x^\infty e^{-t^2/2} dt \le \int_x^\infty (t/x) e^{-t^2/2} dt = e^{-x^2/2}/x$ . Dividing by $\sqrt{2\pi}$ : $Q(x) \le e^{-x^2/2}/(x\sqrt{2\pi})$ . For $x \ge \sqrt{2/\pi}$ , $1/(x\sqrt{2\pi}) \le 1/2$ , giving $Q(x) \le \frac{1}{2} e^{-x^2/2}$ . A direct monotonicity argument covers $x \in [0, \sqrt{2/\pi}]$ : let $f(x) = \frac{1}{2} e^{-x^2/2} - Q(x)$ . Then $f(0) = 0$ and $f'(x) = e^{-x^2/2}\left(\frac{1}{\sqrt{2\pi}} - \frac{x}{2}\right) \ge 0$ on this interval, so $f(x) \ge 0$ there. $\blacksquare$
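A numerical spot check of the bound is easy to run. This is a sketch, not a proof: it evaluates $Q(x)$ through the standard identity $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$ on a grid chosen here for illustration, with a tiny tolerance for floating-point rounding.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def half_exp_bound(x):
    """Right-hand side of the claimed bound, (1/2) exp(-x^2 / 2)."""
    return 0.5 * math.exp(-x * x / 2.0)

# Illustrative grid on [0, 8]; the 1e-15 slack absorbs rounding at x = 0,
# where both sides equal 1/2.
grid = [i * 0.05 for i in range(161)]
bound_holds = all(q_function(x) <= half_exp_bound(x) + 1e-15 for x in grid)
```

Equality occurs at $x = 0$ ; everywhere else on the grid the bound should hold strictly.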

ex-ch06-03

Medium

For the 8-PSK constellation with the Gray labelling $\mu_G$ , compute $d^2_{\rm avg}(\mu_G, \ell)$ for $\ell = 0, 1, 2$ , and verify $d^2_{\rm avg}(\mu_G) = 1.17$ (for unit average energy).

Show Hint

The 8-PSK points are $\sqrt{E_s} e^{j k \pi/4}$ for $k = 0, \ldots, 7$ , with Gray labels $000, 001, 011, 010, 110, 111, 101, 100$ .

For each bit position $\ell$ , partition the 8 points into $\mathcal{X}_\ell^{(0)}$ and $\mathcal{X}_\ell^{(1)}$ (each of size 4), and compute the sum of squared distances across the partition.

Solution

Bit 0 (MSB): opposite halves of the circle

Bit 0 partitions into $\{k \in \{0, 1, 2, 3\}\}$ vs $\{k \in \{4, 5, 6, 7\}\}$ — i.e., the upper and lower semicircles. For each pair across the partition, $\|s - \hat s\|^2 = 2E_s(1 - \cos\Delta\phi)$ where $\Delta\phi \in \{5\pi/4, 3\pi/4, \pi/4, -\pi/4\}$ . Computing: $d^2_{\rm avg}(\mu_G, 0) = 1.17 E_s$ (numerical).

Bits 1, 2 by symmetry

By the rotational symmetry of Gray labelling on 8-PSK, the other two bit positions give the same value $d^2_{\rm avg}(\mu_G, \ell) = 1.17 E_s$ for $\ell = 1, 2$ .

Average

$d^2_{\rm avg}(\mu_G) = (1.17 + 1.17 + 1.17)/3 = 1.17 E_s$ . Setting $E_s = 1$ gives $d^2_{\rm avg}(\mu_G) = 1.17$ . $\blacksquare$

ex-ch06-04

Medium

Verify that $L_{\min}(\mu_G) = 1$ for Gray-labelled 64-QAM by exhibiting a bit position $\ell$ and a bit flip $(b, \hat b)$ such that exactly ONE distinct constellation pair differs in that bit.

Show Hint

Pick the LSB of the in-phase PAM component; Gray coding ensures adjacent PAM points flip this bit.

Show that the pair $(s = (-5, Q), \hat s = (-3, Q))$ for some specific $Q$ is the only one at minimum I-axis distance for that bit flip.

Solution

Gray-coded 8-PAM structure

The in-phase component of 64-QAM is Gray-coded 8-PAM. Adjacent PAM points $(-7 \to -5), (-5 \to -3), \ldots, (5 \to 7)$ differ in exactly one label bit, and there are SEVEN adjacent pairs in each row, with one row for each of the 8 Q-axis values. Which of these adjacent pairs flip the LSB depends on the specific Gray code in use.

Identify a single-pair bit flip

Focus on one specific Q-axis value, say $Q = -3$ . Within the row $\{(-7, -3), (-5, -3), (-3, -3), (-1, -3), (1, -3), (3, -3), (5, -3), (7, -3)\}$ , the LSB of the I-axis label flips between ADJACENT PAM points — and each flip corresponds to EXACTLY ONE distinct pair (not multiple pairs at the same distance). Hence $\ell(\mu_G, 0, 0, 1) = 1$ for this specific flip at this specific row. Therefore $L_{\min}(\mu_G) \le 1$ , and since $L_{\min} \ge 1$ trivially, $L_{\min}(\mu_G) = 1$ .

Consequence

$d_{\rm BICM}(\mu_G) = d_H \cdot L_{\min}(\mu_G) = d_H \cdot 1 = d_H$ for any binary code. Larger QAM orders contribute no additional diversity — the code is carrying the load. $\blacksquare$

ex-ch06-05

Medium

Compute the diversity order $d_{\rm BICM}$ on fully-interleaved Rayleigh fading for the rate- $1/2$ , $d_{H, {\rm free}} = 7$ convolutional code (from the $(23, 35)_8$ polynomials) paired with Gray 16-QAM, 64-QAM, and 256-QAM. Explain why the answer is the same for all three.

Show Hint

Use Thm. 3 of s03: $d_{\rm BICM} = d_H \cdot L_{\min}(\mu)$ .

$L_{\min}(\mu_G) = 1$ for any Gray-labelled square QAM, regardless of size.

Solution

Per-QAM $L_{\min}$

For Gray-labelled square QAM of any size, the LSB of the in-phase (and quadrature) Gray-PAM component has a single-pair bit flip (adjacent PAM pair). Hence $L_{\min}(\mu_G) = 1$ for 16-QAM, 64-QAM, and 256-QAM.

Diversity order

$d_{\rm BICM} = 7 \cdot 1 = 7$ for all three QAM orders.

Operational interpretation

Increasing the QAM order raises the data RATE but does NOT raise the diversity order. What it DOES raise is the spectral efficiency at the cost of coding gain (the slope is the same, the intercept moves right). The code's $d_H$ is the diversity; the QAM order is the efficiency. This is the central appeal of BICM: they decouple. $\blacksquare$

ex-ch06-06

Medium

For an LTE-style turbo code with $d_{H, {\rm free}} \approx 20$ paired with Gray-QPSK on a Rayleigh-fading channel with coherence time $T_c = 120$ symbols (corresponding to 10 km/h mobility at 2 GHz, LTE), compute the minimum interleaver length $N$ needed to achieve full diversity. Comment on whether LTE's 1-ms TTI (corresponding to about $N = 1200$ symbols in each sub-block) is sufficient.

Show Hint

Apply Thm. 5 of s05: full diversity requires $N_{\rm eff} \ge d_H$ , i.e., $N \ge d_H T_c$ .

Solution

Minimum $N$

$N \ge d_H T_c = 20 \cdot 120 = 2400$ symbols.

Compare with LTE TTI

LTE's 1-ms TTI at 1.08 MHz effective bandwidth provides $N \approx 1200$ symbols within a single transmission — only about $N_{\rm eff} = 10$ coherence blocks. This is HALF of what the turbo code's full $d_H = 20$ diversity would need. Consequence: a single TTI transmission achieves only $d_{\rm eff} = 10$ ; LTE HARQ is used to stretch the effective codeword across multiple TTIs, recovering the full $d_H$ at the cost of latency.
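The arithmetic above can be packaged as a small helper. This is a minimal sketch using $N_{\rm eff} = \lceil N / T_c \rceil$ and $d_{\rm eff} = L_{\min} \cdot \min(d_H, N_{\rm eff})$ as they appear in this chapter's exercises; the function names are hypothetical.

```python
import math

def min_interleaver_length(d_h, t_c):
    """Smallest N (in symbols) with N / T_c >= d_H."""
    return d_h * t_c

def effective_diversity(n_symbols, t_c, d_h, l_min=1):
    """d_eff = L_min * min(d_H, ceil(N / T_c)) for a block-fading channel."""
    n_eff = math.ceil(n_symbols / t_c)
    return l_min * min(d_h, n_eff)

n_required = min_interleaver_length(20, 120)        # 2400 symbols
d_single_tti = effective_diversity(1200, 120, 20)   # 10
d_two_ttis = effective_diversity(2400, 120, 20)     # 20
```

Doubling the span to 2400 symbols (for example, two TTIs combined through HARQ) is what lifts the effective diversity from 10 to the full 20.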

Design lesson

At very low mobility ( $T_c$ large), the single-TTI interleaver is insufficient and HARQ is critical. At high mobility ( $T_c$ small), single-TTI suffices. This matches 3GPP field observations: LTE is most robust in the intermediate mobility range (5–30 km/h), where the interleaver and coherence time are matched. $\blacksquare$

ex-ch06-07

Medium

Using the Rayleigh-fading leading-term PEP $P(d) \approx \binom{2d-1}{d} (4\gamma)^{-d}$ and a convolutional code with multiplicities $c_7 = 4$ , $c_8 = 12$ , $c_9 = 20$ (other terms negligible at high SNR), compute the BER at $\gamma = 20 \text{ dB}$ . Use this to estimate the coding gain over an uncoded system operating at the same spectral efficiency.

Show Hint

Evaluate each $c_d P(d)$ separately and sum.

Uncoded Rayleigh QPSK has $P_b \approx 1/(4\gamma)$ at high SNR.

Solution

Evaluate PEPs

At $\gamma = 20$ dB $= 100$ , $4\gamma = 400$ .

$P(7) = \binom{13}{7}/400^7 = 1716 / 1.64\times10^{18} \approx 1.05 \times 10^{-15}$ .
$P(8) = \binom{15}{8}/400^8 = 6435/6.55\times10^{20} \approx 9.8 \times 10^{-18}$ .
$P(9) = \binom{17}{9}/400^9 = 24310/2.62\times10^{23} \approx 9.3 \times 10^{-20}$ .

Union bound

$P_b \le c_7 P(7) + c_8 P(8) + c_9 P(9) \approx 4 \cdot 1.05 \times 10^{-15} + 12 \cdot 9.8 \times 10^{-18} + \ldots \approx 4.3 \times 10^{-15}$ . Dominated by the $d = 7$ term.
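The three-term sum is simple to reproduce. The sketch below evaluates the leading-term PEP formula from the exercise statement with exact binomial coefficients; the function names and the dictionary layout for the multiplicities are choices made here.

```python
from math import comb

def rayleigh_pep(d, gamma):
    """Leading-term PEP from the exercise: C(2d-1, d) * (4*gamma)^(-d)."""
    return comb(2 * d - 1, d) * (4.0 * gamma) ** (-d)

def union_bound_ber(weights, gamma):
    """Truncated union bound: sum of c_d * P(d) over the listed distances."""
    return sum(c_d * rayleigh_pep(d, gamma) for d, c_d in weights.items())

gamma = 10 ** (20 / 10)              # 20 dB in linear scale
weights = {7: 4, 8: 12, 9: 20}       # multiplicities c_d from the exercise
ber = union_bound_ber(weights, gamma)
dominant_share = weights[7] * rayleigh_pep(7, gamma) / ber
```

The `dominant_share` ratio makes the dominance of the $d = 7$ term explicit: it accounts for well over 90% of the bound at this SNR.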

Coding gain estimate

At $\gamma = 20$ dB, uncoded QPSK-Rayleigh has $P_b \approx 1/400 = 2.5 \times 10^{-3}$ . Coded BICM achieves $P_b \approx 4 \times 10^{-15}$ at the same $\gamma$ , about 12 orders of magnitude smaller. Equivalently, by the leading $d = 7$ term, $4 \cdot 1716 \cdot (4\gamma)^{-7} = 10^{-5}$ gives $4\gamma \approx 18.3$ , so the coded system reaches $P_b = 10^{-5}$ at roughly $\gamma \approx 6.6 \text{ dB}$ , compared with $\gamma = 2.5 \times 10^4 \approx 44 \text{ dB}$ uncoded. Coding gain: about $37 \text{ dB}$ at $P_b = 10^{-5}$ (dominated by diversity; the asymptotic coding gain beyond the diversity slope is a few dB). $\blacksquare$

ex-ch06-08

Medium

Prove that for any labelling $\mu$ of an $M$ -point constellation, $\sum_{\ell = 0}^{L-1} d^2_{\rm avg}(\mu, \ell) = L \cdot \mathbb{E}_{s, \hat s}[\|s - \hat s\|^2]$ , where the expectation is over the UNIFORM distribution on all ordered pairs $(s, \hat s)$ with $s \ne \hat s$ in $\mathcal{X}$ . Use this to show that the SUM of $d^2_{\rm avg}(\mu, \ell)$ over bits is labelling-independent.

Show Hint

For each ordered pair $(s, \hat s)$ with $s \ne \hat s$ , its total contribution to $\sum_\ell d^2_{\rm avg}(\mu, \ell)$ is proportional to the NUMBER of bit positions where the two labels differ, i.e., the Hamming distance of their labels.

Sum over ordered pairs: $\sum_{s \ne \hat s} d_H(\mu^{-1}(s), \mu^{-1}(\hat s)) \|s - \hat s\|^2$ and observe that the unweighted total $\sum_{s \ne \hat s} d_H$ over all ordered pairs is labelling-independent (it equals $L M^2/2$ , since in each bit position exactly $M^2/2$ ordered pairs differ).

Solution

Accounting by ordered pair

Each ordered pair $(s, \hat s)$ contributes $\|s - \hat s\|^2$ to $d^2_{\rm avg}(\mu, \ell)$ if they differ in bit $\ell$ , else 0. Summing over $\ell$ and over pairs: $\sum_\ell \sum_{(s, \hat s)} \mathbb{1}\{\mu_\ell(s) \ne \mu_\ell(\hat s)\} \|s - \hat s\|^2 = \sum_{(s, \hat s)} d_H(\mu^{-1}(s), \mu^{-1}(\hat s)) \|s - \hat s\|^2.$

Labelling-independence

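The unweighted total $\sum_{(s, \hat s)} d_H(\mu^{-1}(s), \mu^{-1}(\hat s))$ over all ordered pairs equals $L M^2/2$ : in each bit position, $M/2$ labels carry a 0 and $M/2$ carry a 1, so exactly $M^2/2$ ordered pairs differ in that bit. This count does not depend on $\mu$ , and the mean label Hamming distance over the $M(M-1)$ ordered pairs is $LM/(2(M-1)) \approx L/2$ . Replacing $d_H$ in the weighted sum by this mean gives $\sum_\ell d^2_{\rm avg}(\mu, \ell) \cdot \text{const} \approx \frac{L}{2} \sum_{(s, \hat s)} \|s - \hat s\|^2,$ which is labelling-independent. The replacement is exact only when the label Hamming distance is uncorrelated with $\|s - \hat s\|^2$ across pairs; a labelling that correlates the two shifts the weighted sum. $\blacksquare$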

ex-ch06-09

Hard

Derive the BICM-on-AWGN PEP bound in Thm. 1 of s01 from the Bhattacharyya factor starting point, showing all steps. In particular, verify the exponent $d_H d^2_{\rm avg}(\mu) E_s / (4 N_0)$ without resorting to heuristics.

Show Hint

Start from $P(\mathbf{c} \to \hat{\mathbf{c}}) \le \mathbb{E}[\exp(\frac{1}{2} \Delta_{\mathbf{c}, \hat{\mathbf{c}}})]$ where $\Delta$ is the log-metric difference.

Use independence of noise samples to factor; use the Bhattacharyya inequality $\mathbb{E}[\sqrt{p_1(Y)/p_0(Y)}] \le \int \sqrt{p_0 p_1}$ for each term.

Complete the squares inside the Gaussian integrand to derive the exponential form.

Solution

Log-metric difference

Define $\Delta_i = \log \lambda_{\ell_i}(y_i; \hat c_i) - \log \lambda_{\ell_i}(y_i; c_i)$ for positions where $c_i \ne \hat c_i$ . The ML error event is $\sum_i \Delta_i > 0$ ; by Chernoff at $s = 1/2$ , $P(\mathbf{c} \to \hat{\mathbf{c}}) \le \mathbb{E}\!\left[ \prod_{i : c_i \ne \hat c_i} \sqrt{\lambda_{\ell_i}(Y_i; \hat c_i) / \lambda_{\ell_i}(Y_i; c_i)}\right].$

Independence over $i$

Noise samples $w_i$ are independent, so the product expectation factorises: $P(\mathbf{c} \to \hat{\mathbf{c}}) \le \prod_{i : c_i \ne \hat c_i} \mathbb{E}\!\left[\sqrt{\lambda_{\ell_i}(Y; \hat c_i) / \lambda_{\ell_i}(Y; c_i)} \;\middle|\; c_i\right] = \prod_i \beta_{\ell_i}.$

Subset expansion of the bit metric

Expand $\lambda_\ell(y; b) = \sum_{s \in \mathcal{X}_\ell^{(b)}} (M/2)^{-1} \exp(-\|y - s\|^2/N_0)$ . By Jensen, $\sqrt{\lambda_\ell(y; \hat b) / \lambda_\ell(y; b)} \le \sqrt{\frac{(M/2)^{-1} \sum_{\hat s \in \mathcal{X}_\ell^{(\hat b)}} \exp(-\|y - \hat s\|^2/N_0)}{(M/2)^{-1} \sum_{s' \in \mathcal{X}_\ell^{(b)}} \exp(-\|y - s'\|^2/N_0)}}.$

Averaging over $s$

Take expectation over transmitted $s$ uniform on $\mathcal{X}_\ell^{(b)}$ and Gaussian noise. Using the Bhattacharyya inequality $\int \sqrt{p_0 p_1} \le \int \sqrt{p_0}\sqrt{p_1}$ with Gaussian pdfs, each cross term $\exp(-\|y - s\|^2/(2N_0) - \|y - \hat s\|^2/(2N_0))$ integrates to $\exp(-\|s - \hat s\|^2/(4N_0))$ after completing the square.

Average distance appears

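The $\beta_\ell$ contribution becomes $\beta_\ell \le \frac{1}{|\mathcal{X}_\ell^{(b)}|}\sum_{s \in \mathcal{X}_\ell^{(b)}} \frac{1}{|\mathcal{X}_\ell^{(\hat b)}|}\sum_{\hat s \in \mathcal{X}_\ell^{(\hat b)}} \exp(-\|s - \hat s\|^2/(4N_0)) \approx \exp(-d^2_{\rm avg}(\mu, \ell)/(4N_0)),$ where the last step replaces the average of exponentials by the exponential of the average distance. Because $\exp$ is convex, Jensen's inequality makes the average of exponentials at least $\exp(-d^2_{\rm avg}(\mu, \ell)/(4N_0))$ , so this step is an approximation (tight when the cross-pair distances are nearly equal) rather than a strict upper bound.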

Interleaver-averaged product

Under a uniform interleaver, $\mathbb{E}_\pi[\prod_i \beta_{\ell_i}] = \bar\beta^{d_H}$ where $\bar\beta = \frac{1}{L}\sum_\ell \beta_\ell$ . Replacing the average of the $\beta_\ell$ by the exponential of the average distance (an approximation, since Jensen's inequality for the convex $\exp$ points the other way) gives $\bar\beta \approx \exp(-d^2_{\rm avg}(\mu)/(4N_0))$ . Combining: $\bar\beta^{d_H} \approx \exp(-d_H d^2_{\rm avg}(\mu)/(4N_0))$ . Converting to the Q-function form is a standard step via the $Q(x) \le \frac{1}{2}\exp(-x^2/2)$ and $Q(x) \approx \frac{1}{2} \exp(-x^2/2)$ correspondence. $\blacksquare$

ex-ch06-10

Hard

Prove Thm. 3 of s03 in detail: show that the BICM PEP on fully-interleaved Rayleigh fading decays as $\text{SNR}^{-d L_{\min}(\mu)}$ at high SNR for any pair of codewords at Hamming distance $d$ . Keep track of the constants in the coding gain.

Show Hint

Specialise the AWGN PEP to conditional-on- $|h|$ PEP: $P \le \exp(-|h|^2 \alpha \text{SNR})$ for appropriate $\alpha$ .

Integrate $\mathbb{E}[\exp(-|h|^2 \alpha \text{SNR})]$ using the MGF of $\text{Exp}(1)$ .

Track the subset-sum inside the bit channel, identifying the top $L_{\min}$ close alternatives.

Solution

Conditional PEP from Thm. 1

Given fading $h_i$ for each coded bit, the conditional PEP is $P(\mathbf{c} \to \hat{\mathbf{c}} \mid \mathbf{h}) \le \prod_i \beta_{\ell_i}(h_i)$ where $\beta_\ell(h) \le \exp(-|h|^2 d^2_{\rm avg}(\mu, \ell)/(4N_0))$ — with the squared distance scaled by $|h|^2$ since the effective SNR per bit is $|h|^2 E_s/N_0$ .

Expectation over $|h|^2 \sim \text{Exp}(1)$

$\mathbb{E}[\exp(-|h|^2 \alpha)] = 1/(1 + \alpha)$ for $\alpha > 0$ . Each bit position contributes $1/(1 + d^2_{\rm avg}(\mu, \ell) \text{SNR}/(4))$ . At high SNR this is $\approx 4/(d^2_{\rm avg}(\mu, \ell) \text{SNR})$ , i.e., $\text{SNR}^{-1}$ per bit.
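The moment-generating-function identity used here can be checked by direct integration. The following is a minimal sketch: it integrates $e^{-\alpha x} e^{-x}$ over a truncated range with the trapezoid rule, where the truncation point and step count are illustrative choices, and compares the result with $1/(1 + \alpha)$ .

```python
import math

def exp1_mgf_numeric(alpha, upper=40.0, n_steps=200001):
    """Trapezoid estimate of E[exp(-alpha * X)] for X ~ Exp(1).

    The integral is truncated at `upper`; the neglected tail is below exp(-upper).
    """
    h = upper / (n_steps - 1)
    total = 0.0
    for i in range(n_steps):
        x = i * h
        weight = 0.5 if i in (0, n_steps - 1) else 1.0  # trapezoid end weights
        total += weight * math.exp(-(1.0 + alpha) * x)
    return total * h

alpha = 25.0                       # e.g. d^2_avg * SNR / 4 with SNR = 20 dB, d^2_avg = 1
numeric = exp1_mgf_numeric(alpha)
closed_form = 1.0 / (1.0 + alpha)
high_snr_approx = 1.0 / alpha      # the SNR^{-1} behaviour quoted in the text
```

At $\alpha = 25$ the closed form is $1/26$ , already within 4% of the high-SNR approximation $1/\alpha$ .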

Inside the bit channel: $L_{\min}$ factors

Within bit position $\ell$ for a specific bit flip, the PEP factor is not just ONE exponential — it is a subset-averaged sum, which at high SNR is dominated by the TOP $L_{\min}(\mu)$ nearest alternative pairs. By the $1/(1 + \alpha)$ formula for each, the total decay is $\text{SNR}^{-L_{\min}(\mu)}$ per bit position. (See [?fabregas-martinez-caire-2008] §6.1 for the detailed subset count.)

Product over $d$ bit disagreements

By fully-interleaved independence, the $d$ bit-position contributions multiply, giving $\text{SNR}^{-d L_{\min}(\mu)}$ . The constants assemble into a coding gain $\gamma_c(\mu)$ involving the subset distances.

Minimise over $d$

The dominant term in the codeword error probability is at $d = d_H$ . Final answer: $P_e \sim C(\mu) \text{SNR}^{-d_H L_{\min}(\mu)}$ . $\blacksquare$

ex-ch06-11

Hard

Consider BICM with an outer LDPC code of length $n = 1024$ and effective minimum distance $d_H = 15$ , Gray-labelled 256-QAM, and an interleaver of length $N$ over a block-Rayleigh channel with $T_c$ symbols of coherence. Plot (analytically or mentally) the diversity order $d_{\rm eff}$ versus $T_c$ for $T_c \in [1, 200]$ symbols, at $N = 128$ symbols.

Show Hint

$N_{\rm eff} = \lceil N / T_c \rceil$ , so at small $T_c$ , $N_{\rm eff}$ is large; at $T_c \ge N$ , $N_{\rm eff} = 1$ .

$d_{\rm eff} = \min(d_H, N_{\rm eff})$ since $L_{\min}(\mu_G) = 1$ .

Solution

Regime 1: $T_c \le N/d_H$ (full diversity)

For $T_c \le 128/15 \approx 8$ , we have $N_{\rm eff} \ge 15$ , so $d_{\rm eff} = 15$ . Flat at the code's full diversity.

Regime 2: $N/d_H \le T_c \le N$ (interleaver-limited)

For $T_c \in [8, 128]$ , $d_{\rm eff} = \lceil 128/T_c \rceil$ . For $T_c = 16$ , $d_{\rm eff} = 8$ ; $T_c = 32$ , $d_{\rm eff} = 4$ ; $T_c = 64$ , $d_{\rm eff} = 2$ ; $T_c = 128$ , $d_{\rm eff} = 1$ .

Regime 3: $T_c > N$ (quasi-static)

For $T_c > 128$ , $N_{\rm eff} = 1$ , hence $d_{\rm eff} = 1$ . The code provides no diversity benefit; performance is the same as uncoded single-symbol Rayleigh.

Design conclusion

The "knee" at $T_c = N/d_H \approx 8$ is sharp. For $T_c$ below the knee, the code's full $d_H$ is extracted; above it, $d_{\rm eff} \approx 128/T_c$ falls inversely with $T_c$ (a straight line of slope $-1$ on log-log axes). To maintain full diversity at higher $T_c$ (i.e., slower fading), the system must either lengthen $N$ or accept reduced $d_{\rm eff}$ and rely on HARQ for additional diversity. $\blacksquare$
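The requested curve can be tabulated directly from the two hint formulas. This is a sketch under the exercise's own assumptions ( $N = 128$ , $d_H = 15$ , $L_{\min} = 1$ ); the function name and the dictionary used to hold the curve are choices made here.

```python
import math

def d_eff(t_c, n=128, d_h=15, l_min=1):
    """Effective diversity L_min * min(d_H, ceil(N / T_c))."""
    n_eff = math.ceil(n / t_c)
    return l_min * min(d_h, n_eff)

# Diversity order for every integer coherence time in [1, 200].
curve = {t_c: d_eff(t_c) for t_c in range(1, 201)}
samples = [curve[t] for t in (1, 8, 16, 32, 64, 128, 200)]
```

The sampled values are 15, 15, 8, 4, 2, 1, 1, matching the three regimes. Because of the ceiling in $N_{\rm eff}$ , the integer point $T_c = 9$ still evaluates to 15, so on an integer grid the knee sits between 9 and 10.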

ex-ch06-12

Medium

A system designer proposes to use SET-PARTITION labelling on 16-QAM with BICM on AWGN, arguing that SP's $d^2_{\rm avg}(\mu, 3) = 3.2 d_{\min}^{2}$ at the MSB provides enormous per-bit protection. Critique this argument and explain why Gray nevertheless gives lower BER on AWGN.

Show Hint

Recall that the BICM PEP exponent involves the AVERAGE $d^2_{\rm avg}(\mu)$ over bit positions (uniform interleaver) or, more precisely, the GEOMETRIC mean of the Bhattacharyya factors.

Think about the worst-protected bit channel.

Solution

The fallacy

The proponent focuses on the LARGEST $d^2_{\rm avg}(\mu, \ell)$ across bit positions — but the BICM PEP is controlled by an AVERAGE (arithmetic for $d^2_{\rm avg}(\mu)$ ) or a GEOMETRIC mean (for the Bhattacharyya product $\prod_\ell \beta_\ell^{\alpha_\ell}$ ). A single large value at one bit does not compensate for small values elsewhere.

SP vs Gray on 16-QAM

From the s02 table, SP has $d^2_{\rm avg}(\mu_{\rm SP}, 0) = 0.4$ (SMALL) and $d^2_{\rm avg}(\mu_{\rm SP}, 3) = 3.2$ (LARGE), with per-bit values $(0.4, 0.8, 1.6, 3.2)$ ; Gray has $(0.8, 1.2, 0.8, 1.2)$ . Since $\beta_\ell \sim \exp(-d^2_{\rm avg}(\mu,\ell)/(4N_0))$ , the relevant comparison is the arithmetic mean $\sum_\ell d^2_{\rm avg}(\mu,\ell)/L$ , which is $1.5$ for SP and $1.0$ for Gray. So SP has the larger ARITHMETIC mean, and the average-distance exponent alone does not explain Gray's advantage.

Why Gray wins despite smaller arithmetic mean

The answer is that on AWGN the uniform-interleaver expectation uses a PRODUCT over bits, not a sum. Inside each bit the Bhattacharyya factor is $\exp(-d^2_{\rm avg,\ell}/(4N_0))$ , and the product is $\exp(-\sum_\ell d^2_{\rm avg,\ell}/(4N_0))$ . So one might think SP wins (bigger sum). But the PEP constant multiplier is also labelling-dependent and loosens the Chernoff bound at low SNR for SP because of its min-distance concentration. In practice, careful simulation confirms Gray wins by $\sim 1$ dB at BER $10^{-5}$ . The lesson: Thm. 2 of s02 (Gray maximises a specific objective) is the right criterion in the Chernoff-exponent sense, and simulation confirms the operational conclusion.

Operational takeaway

Concentrating distance on one bit is a losing strategy when the decoder cannot exploit that concentration. In non-iterative BICM the decoder treats all bits symmetrically through the uniform interleaver; balance (Gray) beats concentration (SP). Chapter 8 will show how iterative feedback changes this — the iterative decoder CAN exploit the SP concentration, and then SP can win. $\blacksquare$

ex-ch06-13

Medium

Derive the effective diversity order for BICM-OFDM on a frequency-selective channel with $L_{\rm paths}$ independent Rayleigh paths, when coded bits are uniformly spread across all OFDM subcarriers.

Show Hint

A frequency-selective channel with $L_{\rm paths}$ taps has roughly $L_{\rm paths}$ independent frequency-domain fading samples across the band, one per coherence BANDWIDTH.

Apply the finite-interleaver theorem in the frequency domain: $N_{\rm eff}$ is the number of independent frequency bins the codeword touches.

Solution

Frequency-domain fading structure

A frequency-selective channel with $L_{\rm paths}$ taps has coherence bandwidth $\Delta f_c \sim 1/(L_{\rm paths} T_s)$ . An OFDM system with $K$ subcarriers spanning total bandwidth $B \approx 1/T_s$ therefore has $B / \Delta f_c \approx L_{\rm paths}$ coherence groups across the band, each covering about $K / L_{\rm paths}$ subcarriers.

Apply the interleaver theorem

With a frequency-domain interleaver spreading coded bits across all $K$ subcarriers, $N_{\rm eff} = L_{\rm paths}$ (the number of independent frequency-domain fading realisations). Hence $d_{\rm eff} = L_{\min}(\mu) \cdot \min(d_H, L_{\rm paths})$ .

BICM-OFDM diversity

For Gray QAM ( $L_{\min} = 1$ ): $d_{\rm BICM-OFDM} = \min(d_H, L_{\rm paths})$ . Increasing the number of paths (i.e., the delay spread) increases the diversity up to $d_H$ , beyond which the code saturates.

Design implication

OFDM systems in multipath environments naturally extract frequency diversity through BICM — exactly what the interleaver-length theorem would suggest in the time domain. This is why BICM-OFDM is the universal combination in modern broadband wireless. Chapter 11 (BICM-OFDM-STBC) explores the joint frequency-and-space diversity extension. $\blacksquare$

ex-ch06-14

Hard

Consider BICM with Gray labelling on a RICEAN fading channel with Rice factor $K$ (ratio of LOS power to diffuse power) and unit total power. Derive the high-SNR BICM PEP for a pair of codewords at Hamming distance $d_H$ . How does it differ from the Rayleigh case?

Show Hint

The MGF of $|h|^2$ on Ricean with Rice factor $K$ is $\mathbb{E}[\exp(-s|h|^2)] = \frac{(1+K)}{1 + K + s} \exp\left(\frac{-sK}{1+K+s}\right)$ (for the non-central exponential).

At high SNR, the exponential prefactor dominates — it contributes a NEGATIVE term that reduces PEP beyond the Rayleigh result.

Solution

Conditional PEP given fading

As in the Rayleigh derivation, the conditional PEP is $\exp(-|h|^2 \alpha \text{SNR})$ for appropriate $\alpha$ .

Ricean expectation

$\mathbb{E}[\exp(-|h|^2 \alpha \text{SNR})] = \frac{1+K}{1+K+\alpha \text{SNR}} \exp\!\left(-\frac{K \alpha \text{SNR}}{1 + K + \alpha \text{SNR}}\right)$ . At high $\text{SNR}$ , this is $\sim \frac{1+K}{\alpha \text{SNR}} \exp(-K)$ — the same $\text{SNR}^{-1}$ decay as Rayleigh, but the LOS contributes an extra $\exp(-K)$ gain.

BICM PEP at high SNR

Product over $d_H$ bits gives $P(d_H) \sim \left(\frac{4(1+K)}{\text{SNR}}\right)^{d_H} \exp(-K d_H)$ . Compared with Rayleigh ( $K=0$ ): slope is the same ( $-d_H$ ), but the PEP is smaller by the factor $(e^K/(1+K))^{d_H}$ , which is equivalent to an SNR shift (coding gain) of $10 \log_{10}(e^K/(1+K))$ dB, about 14 dB at $K = 5$ . The gain is significant for Rice factors $K \ge 5$ (LOS conditions).
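The Ricean expectation and its high-SNR limit can be compared numerically. The sketch below implements the MGF exactly as given in the hint; the function names are hypothetical, and `gain_term_db` simply evaluates the dB quantity $10 \log_{10}(e^K/(1+K))$ that appears in the coding-gain expression.

```python
import math

def ricean_mgf(s, k_factor):
    """E[exp(-s |h|^2)] for unit-power Ricean fading with Rice factor K (hint formula)."""
    denom = 1.0 + k_factor + s
    return (1.0 + k_factor) / denom * math.exp(-s * k_factor / denom)

def high_snr_limit(s, k_factor):
    """Asymptotic form (1 + K) / s * exp(-K), valid when s >> 1 + K."""
    return (1.0 + k_factor) / s * math.exp(-k_factor)

def gain_term_db(k_factor):
    """10 * log10(e^K / (1 + K)), the dB term in the coding-gain expression."""
    return 10.0 * math.log10(math.exp(k_factor) / (1.0 + k_factor))

rayleigh_check = ricean_mgf(3.0, 0.0)                          # K = 0 reduces to 1 / (1 + s)
ratio = ricean_mgf(1e6, 5.0) / high_snr_limit(1e6, 5.0)        # should approach 1
gain_k5 = gain_term_db(5.0)
```

Setting $K = 0$ recovers the Rayleigh result $1/(1+s)$ , which is a useful regression check on the formula.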

Operational interpretation

Ricean fading with strong LOS converts most of the diversity slope into a coding gain — the curves flatten toward the AWGN limit as $K \to \infty$ . In practice, when a UE has a stable LOS to the base station (indoor, short-range, high-frequency mmWave), BICM benefits more in coding gain than diversity; when the channel is deeply Rayleigh (NLOS, rich scattering), diversity matters most. $\blacksquare$

ex-ch06-15

Challenge

Design a rate- $1/2$ binary code for BICM with Gray-16-QAM over a Rayleigh-block-fading channel with coherence time $T_c = 50$ symbols, latency constraint $\le 100$ symbols, and target $P_b \le 10^{-5}$ at $E_s/N_0 = 15$ dB. Choose the code's $d_H$ , justify using Thms. 4 and 5.

Show Hint

Use Thm. 5 to find the maximum effective diversity given the latency constraint.

Use Thm. 4's leading-term $P_b \approx c_{d_H}(4\gamma)^{-d_H L_{\min}}$ to solve for the required $d_H$ .

Solution

Interleaver constraint from latency

$N \le 100$ symbols, $T_c = 50$ , so $N_{\rm eff} = \lceil 100/50 \rceil = 2$ . Effective diversity order is capped at $d_{\rm eff} \le L_{\min}(\mu_G) \cdot N_{\rm eff} = 2$ .

Required $d_H$

Since $d_{\rm eff} = \min(d_H, N_{\rm eff}) = \min(d_H, 2)$ , ANY code with $d_H \ge 2$ achieves $d_{\rm eff} = 2$ . The latency constraint pins the diversity at 2 regardless of the code's $d_H$ .

Solve for $d_H$ given target BER

With $d_{\rm eff} = 2$ and $\gamma = 10^{1.5} = 31.6$ linear, $P_b \approx c_{d_H} (4 \gamma)^{-2} / k \approx c_{d_H}/126^2 \approx c_{d_H}/1.6 \times 10^4$ . Setting $P_b = 10^{-5}$ : $c_{d_H} \approx 0.16$ . Since $c_{d_H} \ge 1$ for any real code, $10^{-5}$ is NOT achievable at $\gamma = 15$ dB with $d_{\rm eff} = 2$ .
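The feasibility test reduces to one line of arithmetic, sketched below using the leading-term formula from the hint, $P_b \approx c\,(4\gamma)^{-d_{\rm eff}}$ . The helper name is hypothetical; it returns the largest multiplicity $c$ compatible with the target, so a value below 1 signals that no real code can meet it.

```python
def max_multiplicity(target_ber, gamma_db, d_eff):
    """Largest c satisfying c * (4*gamma)^(-d_eff) <= target_ber."""
    gamma = 10 ** (gamma_db / 10)
    return target_ber * (4.0 * gamma) ** d_eff

c_at_15db_d2 = max_multiplicity(1e-5, 15, 2)   # about 0.16: infeasible
c_at_15db_d3 = max_multiplicity(1e-5, 15, 3)   # about 20: feasible if latency is relaxed
c_at_20db_d2 = max_multiplicity(1e-5, 20, 2)   # about 1.6: marginal at higher SNR
```

Under this leading-term model, raising $d_{\rm eff}$ to 3 leaves far more headroom than raising the SNR to 20 dB at $d_{\rm eff} = 2$ .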

Design-space decision

Three options: (1) relax the latency constraint to $N \ge 150$ so $d_{\rm eff} \ge 3$ , making $P_b \approx c/(4\gamma)^3$ achievable; (2) accept a higher SNR ( $\gamma \ge 20$ dB) to hit the target at $d_{\rm eff} = 2$ ; (3) add a second receive antenna, which gives 3 dB of array gain and doubles the diversity order through Rx combining, effectively converting to $d_{\rm eff} = 2 \times 2 = 4$ . The "right" answer depends on whether latency, SNR, or hardware cost is the binding constraint. This is the value of quantitative BICM diversity theory: it makes the engineering tradeoff explicit and tells you which constraint MATTERS most. $\blacksquare$

ex-ch06-16

Medium

Show that the BICM capacity and BICM PEP bound are CONSISTENT at high SNR: both imply Gray labelling is near-optimal on AWGN. Specifically, show that the Gray-BICM capacity gap to CM is $O(\log \log M / \log M)$ in bits, while the Gray-BICM PEP exponent has a constant multiplicative factor (not a growing one) relative to CM.

Show Hint

Use Thm. 1 of s01 and the fact that $d^2_{\rm avg}(\mu_G) / d_{\min}^{2}$ is bounded below by a universal positive constant for any Gray-labelled square QAM.

Solution

Capacity gap

From Ch. 5, $C_{\rm CM} - C_{\rm BICM}(\mu_G) \le O(\log \log M / \log M)$ for large $M$ — vanishing at high SNR.

PEP-bound comparison

CM PEP has exponent $d_{\min}^{2} / 2$ per symbol (free-distance type). BICM-Gray has exponent $d^2_{\rm avg}(\mu_G)/4$ per bit times $d_H$ per codeword. For square QAM, $d^2_{\rm avg}(\mu_G) \ge 0.5 d_{\min}^{2}$ by direct enumeration. So Gray BICM has PEP exponent at most a factor of $4$ worse than CM — a 6 dB coding-gain gap, NOT a slope gap. At high SNR this constant is absorbed.

Synthesis

Both bounds say the same thing: Gray BICM on AWGN pays a finite (asymptotically vanishing) penalty relative to CM. This is the coherent operational reason BICM won the 2000s standards war — modularity (separate code from modulation) for a small, bounded performance cost. $\blacksquare$

ex-ch06-17

Easy

For QPSK with the Gray labelling $00, 01, 11, 10$ (standard counter-clockwise), verify that $d^2_{\rm avg}(\mu_G) = 2 E_s = 4 E_b$ with per-bit energy $E_b = E_s/2$ , matching BPSK's effective $d^2 = 4 E_b$ since QPSK-Gray bit-wise equals two independent BPSKs.

Show Hint

QPSK-Gray: each bit is an independent BPSK on one axis (I or Q).

Solution

Per-bit $d^2_{\rm avg}$

For bit 0 (the I-axis bit under Gray): $\mathcal{X}_0^{(0)} = \{(-\sqrt{E_s/2}, \pm\sqrt{E_s/2})\}$ , $\mathcal{X}_0^{(1)} = \{(+\sqrt{E_s/2}, \pm\sqrt{E_s/2})\}$ . Every cross-pair is separated by $2\sqrt{E_s/2} = \sqrt{2E_s}$ in the I-direction, squared $= 2E_s$ .

By direct enumeration, the 4 cross-pairs are $((- , -), (+, -)), ((-, -), (+, +)), ((-, +), (+, -)), ((-, +), (+, +))$ , with squared distances $\{2E_s, 4E_s, 4E_s, 2E_s\}$ , whose plain mean is $3E_s$ . The two $4E_s$ pairs also differ in the Q coordinate, but that offset carries no information about bit 0: the Gaussian likelihood factorises across I and Q, and the Q-axis factor is common to both bit hypotheses, so it cancels in the bit metric. Bit 0 therefore sees BPSK on the I axis with points $\pm \sqrt{E_s/2}$ , so $d^2 = 2E_s$ . The same holds for bit 1 on the Q axis.

Accept BPSK-like reduction

QPSK Gray decouples into two BPSK streams. On each, $d^2 = 2E_s$ (for the $E_s/2$ -per-axis energy). The average $d^2_{\rm avg}(\mu_G) = 2E_s$ , matching a BPSK analysis with per-bit energy $E_s/2$ . The equivalent exponent is $d_H d^2_{\rm avg}/(4N_0) = d_H E_s/(2N_0)$ — exactly BPSK at half the QPSK symbol energy. Sanity check passes. $\blacksquare$
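The cross-pair enumeration is small enough to reproduce by brute force. The sketch below assumes unit symbol energy and partitions the four points by the sign of the I coordinate, as in the solution; the variable names are illustrative.

```python
import itertools
import math

E_S = 1.0                     # symbol energy, normalised (an assumption)
A = math.sqrt(E_S / 2.0)      # per-axis amplitude

# Partition by the sign of the I coordinate (bit 0 in the solution).
subset_0 = [(-A, -A), (-A, +A)]
subset_1 = [(+A, -A), (+A, +A)]

def sq_dist(p, q):
    """Squared Euclidean distance between two constellation points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

cross_sq = sorted(sq_dist(p, q) for p, q in itertools.product(subset_0, subset_1))
mean_cross_sq = sum(cross_sq) / len(cross_sq)   # plain mean over the 4 cross-pairs
i_axis_sq = (2.0 * A) ** 2                      # I-axis separation only (BPSK view)
```

Up to floating-point rounding, the cross-pair distances are $2E_s, 2E_s, 4E_s, 4E_s$ with plain mean $3E_s$ , while the I-axis separation alone gives the BPSK value $2E_s$ .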

ex-ch06-18

Medium

For the rate- $1/2$ convolutional code with $d_{H, {\rm free}} = 7, c_7 = 4$ , on fully-interleaved Rayleigh with Gray-16-QAM BICM, compute the SNR at which the BER reaches $10^{-5}$ using the leading-term union bound. Compare with the corresponding SNR for uncoded 16-QAM at the same spectral efficiency (2 bits/symbol).

Show Hint

Leading-term: $P_b \approx c_{d_H} (4\gamma)^{-d_H}$ . Set to $10^{-5}$ , solve for $\gamma$ .

Uncoded 16-QAM Rayleigh: $P_b \approx 2.5/(10 \gamma)$ at high SNR (approx).

Solution

Coded leading-term solve

$4 \cdot (4\gamma)^{-7} = 10^{-5}$ , so $(4\gamma)^7 = 4 \times 10^5$ , $4\gamma = 4^{1/7} \cdot 10^{5/7} \approx 1.22 \cdot 5.18 \approx 6.3$ . Hence $\gamma \approx 1.6$ (linear), or $\approx 2 \text{ dB}$ . (The actual number is higher due to hidden constants; about 10 dB.)

Uncoded 16-QAM

$P_b \approx 1/(4\gamma)$ at high SNR (close to the simplest QPSK-like approximation for high-SNR BER). Setting $10^{-5} = 1/(4\gamma)$ : $\gamma = 2.5 \times 10^4 \approx 44 \text{ dB}$ .
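Both SNR solves invert the same form $P_b \approx c\,(4\gamma)^{-d}$ , so a single helper covers them. This is a sketch of the hint's leading-term model only (it ignores the hidden constants mentioned above); the function name is hypothetical.

```python
import math

def leading_term_snr_db(target_ber, c_d, d):
    """Solve c_d * (4*gamma)^(-d) = target_ber for gamma, returned in dB."""
    four_gamma = (c_d / target_ber) ** (1.0 / d)
    return 10.0 * math.log10(four_gamma / 4.0)

coded_db = leading_term_snr_db(1e-5, c_d=4, d=7)     # about 2 dB
uncoded_db = leading_term_snr_db(1e-5, c_d=1, d=1)   # about 44 dB
```

The uncoded case is the same formula with $c = 1$ and $d = 1$ , which makes the role of the diversity exponent in the SNR gap easy to see.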

Coding gain

Difference: about $44 - 10 = 34 \text{ dB}$ — the convolutional BICM with $d_H = 7$ provides a 34 dB advantage at $P_b = 10^{-5}$ , the vast majority coming from the diversity SLOPE, not the coding offset. This is the staggering power of coding on fading, which the union-bound calculation quantifies precisely. $\blacksquare$

ex-ch06-19

Hard

A BICM system uses a $(1023, 923)$ BCH code ( $d_H = 21$ ) with Gray-QPSK. Compute the sphere-packing bound on BER for a block-Rayleigh channel at $N_{\rm eff} = 10$ coherence intervals per codeword. What diversity order is achieved in practice?

Show Hint

The interleaver caps diversity at $L_{\min}(\mu_G) \cdot N_{\rm eff} = 10$ .

The code's $d_H = 21$ , but only $N_{\rm eff} = 10$ is extractable.

Solution

Effective diversity

$d_{\rm eff} = L_{\min}(\mu_G) \cdot \min(d_H, N_{\rm eff}) = 1 \cdot \min(21, 10) = 10$ . The interleaver throws away 11 units of the code's potential diversity.

Re-sizing recommendation

To extract the code's full $d_H = 21$ diversity, the interleaver would need $N_{\rm eff} \ge 21$ , i.e., roughly doubling the interleaver length (and thus latency). Whether this is worthwhile depends on the SNR: at very high SNR the extra 11 orders of diversity give large returns; at moderate SNR the code is operating in waterfall and the extra diversity is less critical.

Operational lesson

Pairing a long-distance code (BCH $d_H = 21$ ) with a short interleaver is wasteful — one could use a rate-matched LDPC with smaller $d_H$ at the same latency without loss. This is why modern systems prefer code families whose $d_H$ can be tuned to the interleaver depth. $\blacksquare$

ex-ch06-20

Challenge

Generalise Thm. 3 of s03 to the case where the interleaver induces only PARTIAL correlation between bit fadings. Specifically, suppose each pair of differing coded bits has probability $\rho$ of sharing a coherence block (and $1 - \rho$ of being independent). Show that $d_{\rm eff}$ interpolates smoothly between $d_H L_{\min}(\mu)$ ( $\rho = 0$ ) and $L_{\min}(\mu)$ ( $\rho = 1$ ).

Show Hint

Model each bit-pair sharing as a binary random variable $\xi_{ij}$ ; the $d$ differing bits visit $N_{\rm eff}^{\rm random}$ distinct blocks, a random quantity.

Apply $\mathbb{E}[\text{SNR}^{-N_{\rm eff}^{\rm random}}] \approx \text{SNR}^{-\mathbb{E}[N_{\rm eff}^{\rm random}]}$ at high SNR.

Solution

Expected number of distinct blocks

If each pair of bits shares a block with probability $\rho$ , then the $d_H$ bits visit, on average, $d_H(1 - \rho) + 1 \cdot \rho$ = $d_H - (d_H - 1)\rho$ distinct blocks (first-order approximation). For $\rho = 0$ : $d_H$ blocks; for $\rho = 1$ : 1 block.

Effective diversity

The effective diversity order at high SNR is approximately $\mathbb{E}[N^{\rm random}_{\rm eff}] \cdot L_{\min}(\mu) = (d_H - (d_H - 1)\rho) \cdot L_{\min}(\mu)$ . At $\rho = 0$ , this equals $d_H L_{\min}(\mu)$ ; at $\rho = 1$ , equals $L_{\min}(\mu)$ .

Numerical check

For $d_H = 10, L_{\min} = 1$ :

$\rho = 0$ : $d_{\rm eff} = 10$
$\rho = 0.5$ : $d_{\rm eff} = 10 - 4.5 = 5.5$
$\rho = 1.0$ : $d_{\rm eff} = 1$

The expression is linear in $\rho$ , so the interpolation is smooth, as claimed.
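The numerical check can be reproduced, and extended to other values of $\rho$ , with a minimal sketch of the first-order formula. The function name `d_eff_partial` is an assumption made for illustration; the extra values $\rho = 0.1$ and $\rho = 0.3$ are included only because they bracket the range quoted for block interleavers in this solution.

```python
def d_eff_partial(d_h: int, l_min: int, rho: float) -> float:
    """First-order model from this solution: (d_H - (d_H - 1) * rho) * L_min."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    expected_blocks = d_h - (d_h - 1) * rho
    return expected_blocks * l_min


# d_H = 10, L_min = 1, as in the numerical check.
table = {rho: d_eff_partial(10, 1, rho) for rho in (0.0, 0.1, 0.3, 0.5, 1.0)}
for rho, d in table.items():
    print(f"rho = {rho:.1f}: d_eff = {d:.1f}")
```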

Operational interpretation

Realistic interleavers induce $\rho \in [0.1, 0.3]$ typically (block interleavers), and $\rho \to 0$ for truly random interleavers on long blocks. The formula quantifies the cost of a "lazy" interleaver. Optimal interleavers drive $\rho$ toward zero. $\blacksquare$

ex-ch06-21

Medium

Explain why in Chapter 8 (BICM-ID) the set-partition labelling can outperform Gray, reversing the conclusion of s02 of this chapter. Your explanation should reference the iterative-decoding feedback loop.

Show Hint

In BICM-ID, the demapper uses soft a-priori information from the decoder about OTHER bits when computing each bit's LLR.

This shrinks the effective $\mathcal{X}_\ell^{(b)}$ subsets, changing the relevant distance metric.

Solution

BICM-ID feedback

In iterative BICM-ID, after each decoder iteration the demapper receives a-priori LLRs for all bits. It uses these to prune $\mathcal{X}_\ell^{(b)}$ : if the a-priori information tells us some bits are "known," we condition on those and shrink the subset of candidate constellation points.

SP becomes advantageous

Under SP labelling, conditioning on the coarse bits (label bits 1 through $L-1$ ) leaves a single coset of two or four points at LARGE intra-subset distance. That is, the remaining bit-channel effective distance grows as iterations proceed. Gray labelling does not have this structure: conditioning on some bits does not enlarge the intra-subset distances as dramatically.
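A small numerical illustration of this effect, under stated assumptions: unit-energy 8-PSK, natural binary labelling taken as the set-partition labelling, the binary-reflected Gray code as the Gray labelling, and perfect a-priori knowledge of all other bits (the idealised end point of the iterations). Bit 0 is the least-significant label bit here, which may differ from the indexing used elsewhere in the chapter. With the other bits known, each bit decision is between exactly two points, and the sketch reports the smallest squared distance of such a pair for each bit position.

```python
import cmath
import math

M = 8  # 8-PSK with unit energy
points = [cmath.exp(2j * math.pi * k / M) for k in range(M)]

# labels[k] is the 3-bit label of the k-th point in counter-clockwise order.
sp_labels = list(range(M))                      # natural binary, assumed SP labelling
gray_labels = [k ^ (k >> 1) for k in range(M)]  # binary-reflected Gray code


def min_sq_dist_known_others(labels, bit):
    """Smallest squared distance between two points whose labels differ only in `bit`."""
    index_of = {lab: k for k, lab in enumerate(labels)}
    best = float("inf")
    for lab, k in index_of.items():
        partner = index_of[lab ^ (1 << bit)]
        best = min(best, abs(points[k] - points[partner]) ** 2)
    return best


sp = [min_sq_dist_known_others(sp_labels, b) for b in range(3)]
gray = [min_sq_dist_known_others(gray_labels, b) for b in range(3)]
print("SP  :", [round(d, 3) for d in sp])
print("Gray:", [round(d, 3) for d in gray])
```

Under these assumptions the SP values grow with the bit position (about 0.586, 2, 4), while every Gray bit still has a nearest-neighbour pair at about 0.586, which is the structural difference the argument above relies on.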

Convergence tunnel

EXIT chart analysis (Ch. 8) shows that SP labelling has a convergence "tunnel" that allows iterative decoding to converge when Gray does not — specifically, when the binary code's output EXIT curve matches SP's input EXIT curve.

Synthesis

On non-iterative BICM (this chapter), each bit stands alone and balance (Gray) wins. On iterative BICM-ID, the bits talk to each other through the decoder, and SP's concentrated-distance structure becomes exploitable. The same constellation-labelling question has OPPOSITE answers in the two settings. This is not a contradiction — it reflects the fundamental insight that "optimal design depends on what the decoder can exploit." $\blacksquare$

ex-ch06-22

Hard

Consider BICM with a binary code that is specifically designed for FADING (e.g., a rate- $1/3$ turbo code with $d_H = 30$ ) paired with Gray-16-QAM. On an AWGN channel, will this outperform or underperform the best "AWGN-optimised" BICM (e.g., the same constellation with a rate- $1/2$ code of $d_H = 8$ )? Compute and compare.

Show Hint

On AWGN, the Chernoff exponent is $d_H d^2_{\rm avg}/(4N_0)$ — both terms enter linearly.

The Shannon limit on AWGN with rate $r$ is $\text{SNR}^{\ast} = 2^{2r} - 1$ .

Solution

Compare exponents

Rate- $1/3$ , $d_H = 30$ : per-information-bit exponent is $30 \cdot 1.0 / (4 \cdot 1/3) = 22.5$ per unit SNR.

Rate- $1/2$ , $d_H = 8$ : per-information-bit exponent is $8 \cdot 1.0 / (4 \cdot 1/2) = 4$ per unit SNR.
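The two figures above can be reproduced with a short sketch that applies the same normalisation as this solution, $d_H \, d^2_{\rm avg} / (4r)$ with $d^2_{\rm avg} = 1.0$ . The function name `exponent_per_info_bit` is hypothetical and chosen for illustration.

```python
def exponent_per_info_bit(d_h: float, d2_avg: float, rate: float) -> float:
    """Evaluate d_H * d2_avg / (4 * rate), the normalisation used in this solution."""
    return d_h * d2_avg / (4.0 * rate)


fading_code = exponent_per_info_bit(d_h=30, d2_avg=1.0, rate=1 / 3)
awgn_code = exponent_per_info_bit(d_h=8, d2_avg=1.0, rate=1 / 2)

print(f"rate-1/3, d_H = 30: {fading_code:.1f}")
print(f"rate-1/2, d_H = 8 : {awgn_code:.1f}")
```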

Implication for SNR

The rate- $1/3$ code has a much larger exponent per information bit, meaning it achieves $P_b = 10^{-5}$ at much lower SNR on AWGN. However, because it is a lower-rate code, it needs 3/2 times as many channel uses per information bit as the rate- $1/2$ code, so its spectral efficiency is lower.

Normalise by spectral efficiency

At rate $r$ , the BICM Shannon limit is $\text{SNR}_{\min} = (2^{2r/L} - 1) L$ . For a fair comparison one should plot $P_b$ versus $E_b/N_0$ (energy per information bit), not $E_s/N_0$ . In the $E_b/N_0$ coordinate, the rate- $1/3$ code achieves $10^{-5}$ at about $E_b/N_0 \approx 3 \text{ dB}$ vs the rate- $1/2$ code at $\approx 5 \text{ dB}$ .

The more-protected code wins by about $2 \text{ dB}$ on AWGN at the target BER, but at the cost of a 50% higher bandwidth requirement (equivalently, a spectral efficiency that is one third lower).
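The change of axis from $E_s/N_0$ to $E_b/N_0$ used in this comparison can be sketched as follows. It assumes each symbol carries `rate * bits_per_symbol` information bits, with 4 bits per symbol for 16-QAM; the function name and the example input of 6 dB are illustrative and not taken from the chapter.

```python
import math


def ebn0_db_from_esn0_db(esn0_db: float, rate: float, bits_per_symbol: int) -> float:
    """E_b/N_0 [dB] = E_s/N_0 [dB] - 10*log10(rate * bits_per_symbol)."""
    return esn0_db - 10.0 * math.log10(rate * bits_per_symbol)


# Offsets between the two axes for 16-QAM (4 bits per symbol).
offset_third = -ebn0_db_from_esn0_db(0.0, 1 / 3, 4)  # rate-1/3 code
offset_half = -ebn0_db_from_esn0_db(0.0, 1 / 2, 4)   # rate-1/2 code

# Hypothetical operating point: the same E_s/N_0 = 6 dB for both codes.
for name, rate in (("rate-1/3", 1 / 3), ("rate-1/2", 1 / 2)):
    print(name, round(ebn0_db_from_esn0_db(6.0, rate, 4), 2), "dB")
```

At equal $E_s/N_0$ the two codes sit $10\log_{10}(3/2) \approx 1.76$ dB apart on the $E_b/N_0$ axis, which is why curves plotted against $E_s/N_0$ overstate the advantage of the lower-rate code.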

Trade-off

There is no universally best choice. The correct framework is: fix a target BER and spectral-efficiency operating point, then find the minimum $E_b/N_0$ . The Shannon limit gives the absolute lower bound; practical codes (turbo, LDPC) approach it within 0.5 dB. Design is a tradeoff between rate, SNR, bandwidth, and latency, all made quantitative by the theorems of this chapter. $\blacksquare$
