Exercises
ex-ch07-01
Easy. State the BICM product bit metric $q(x,y)$ for an arbitrary constellation $\mathcal{X}$ of size $M = 2^m$ with labelling $\mu$, and write down the decoder decision rule in terms of per-symbol log-LLRs. Why is this metric "mismatched"?
Use Definition D (The BICM Product Bit Metric).
Log the product; express the decision rule in terms of the bits that equal 1.
Compare with the true likelihood $p(y \mid x)$.
The metric
$$q(x, y) = \prod_{\ell=1}^{m} p_{W_\ell}\big(y \mid b_\ell(x)\big), \qquad p_{W_\ell}(y \mid b) = \frac{1}{2^{m-1}} \sum_{x' \in \mathcal{X}_b^\ell} p(y \mid x'),$$
where $b_\ell(x)$ is the $\ell$-th label bit of $x$ and $\mathcal{X}_b^\ell$ is the subset of symbols whose $\ell$-th label bit equals $b$.
Log-LLR form
The decoder maximises $\sum_n \log q(x_n, y_n) = \sum_n \sum_{\ell} \log p_{W_\ell}(y_n \mid b_\ell(x_n))$ across all received symbols over the codebook; equivalently, writing $L_{n,\ell} = \log \frac{p_{W_\ell}(y_n \mid 1)}{p_{W_\ell}(y_n \mid 0)}$, it maximises $\sum_{n,\ell} b_\ell(x_n)\, L_{n,\ell}$ up to codeword-independent terms.
Mismatch
The true likelihood is $p(y \mid x)$, which encodes the joint Euclidean geometry of $y$ relative to $x$. The product metric only uses the marginal per-position laws $p_{W_\ell}(y \mid b)$, which average over the other bits. For general constellation-labelling pairs this averaging destroys information; the two metrics differ on a set of positive probability.
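A minimal numerical sketch of this mismatch (assuming a unit-energy Gray-labelled 4-PAM constellation and an illustrative noise level; both are assumptions, not values from the chapter): it evaluates $\log p(y \mid x)$ and $\log q(x, y)$ at the same output and shows they differ by an $x$-dependent amount.

```python
import numpy as np

# 4-PAM with Gray labels (00, 01, 11, 10), normalised to unit average energy.
X = np.array([-3.0, -1.0, 1.0, 3.0]) / np.sqrt(5.0)
B = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
N0 = 0.5                                   # illustrative noise level (assumption)

def p_y_given_x(y, x):
    """True symbol likelihood p(y|x) for real AWGN with variance N0/2."""
    return np.exp(-(y - x) ** 2 / N0) / np.sqrt(np.pi * N0)

def p_bit(y, ell, b):
    """Marginal bit-channel law p_{W_ell}(y|b): average over symbols with b_ell = b."""
    return p_y_given_x(y, X[B[:, ell] == b]).mean()

y = 0.7                                    # an arbitrary channel output
for i, x in enumerate(X):
    q = np.prod([p_bit(y, ell, B[i, ell]) for ell in range(2)])
    print(f"x={x:+.3f}   log p(y|x)={np.log(p_y_given_x(y, x)):+.3f}   "
          f"log q(x,y)={np.log(q):+.3f}")
# The two columns differ, and not by an x-independent constant:
# the product metric is genuinely mismatched to the true likelihood.
```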
ex-ch07-02
Easy. State the definition of the generalised mutual information $I^{\mathrm{GMI}}(s)$ for a channel $p(y \mid x)$ with uniform input and mismatched metric $q(x, y)$. Specialise to the matched case $q = p$, $s = 1$, and verify you recover Shannon's mutual information $I(X; Y)$.
GMI
$$I^{\mathrm{GMI}}(s) = \mathbb{E}\left[ \log \frac{q(X, Y)^s}{\mathbb{E}_{X'}\big[ q(X', Y)^s \big]} \right], \qquad s > 0,$$
with $X$ uniform on $\mathcal{X}$, $Y$ the channel output given $X$, and $X'$ an independent copy of $X$.
Matched case
At $q = p$, $s = 1$: the numerator is $p(Y \mid X)$ and the denominator is $\mathbb{E}_{X'}[p(Y \mid X')] = p(Y)$. So $I^{\mathrm{GMI}} = \mathbb{E}\big[ \log \frac{p(Y \mid X)}{p(Y)} \big] = I(X; Y)$, the classical mutual information.
ex-ch07-03
Medium. Prove from first principles that $I^{\mathrm{GMI}}(s)$ is concave in $s$ for the BICM product metric with uniform input. (Hint: rewrite the GMI as an expectation of a log-moment-generating function, which is convex in the moment parameter.)
Write $I^{\mathrm{GMI}}(s) = \sum_\ell \big( s\, \mathbb{E}[\log p_{W_\ell}(Y \mid B)] - \mathbb{E}_Y\big[ \log \mathbb{E}_B[p_{W_\ell}(Y \mid B)^s] \big] \big)$ after using uniformity.
Differentiate twice in $s$ and use the Cauchy-Schwarz / log-convexity property of moment-generating functions.
GMI in log-MGF form
Since the label bits of a uniform $X'$ are i.i.d. uniform, the denominator factorises across levels, and per bit level $I_\ell^{\mathrm{GMI}}(s) = s\, \mathbb{E}\big[ \log p_{W_\ell}(Y \mid B) \big] - \mathbb{E}_Y\big[ \log \mathbb{E}_B\big[ p_{W_\ell}(Y \mid B)^s \big] \big]$: the second term is, for each $y$, the log-MGF in $s$ of the random variable $\log p_{W_\ell}(Y \mid B)$ under uniform $B$.
Log-MGF is convex
For any random variable $V$, $s \mapsto \log \mathbb{E}[e^{sV}]$ is convex (it is the cumulant-generating function). So $\log \mathbb{E}_B[p_{W_\ell}(Y \mid B)^s]$ is convex in $s$, and its negative is concave in $s$. The linear first term preserves concavity, and summing over $\ell$ preserves it.
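For completeness, the convexity follows from one differentiation (a standard cumulant identity): the second derivative of the log-MGF is a variance under the exponentially tilted law,
$$\frac{d^2}{ds^2} \log \mathbb{E}\big[ e^{sV} \big] = \mathrm{Var}_{\tilde{P}_s}(V) \ge 0, \qquad \frac{d\tilde{P}_s}{dP}(v) = \frac{e^{sv}}{\mathbb{E}[e^{sV}]},$$
applied here with $V = \log p_{W_\ell}(Y \mid B)$ for each fixed $y$.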
Conclusion
$I^{\mathrm{GMI}}(s)$ is the sum of concave-in-$s$ terms, hence concave in $s$. Its supremum is attained at an interior saddle-point $s^\star$ satisfying $\frac{d}{ds} I^{\mathrm{GMI}}(s) \big|_{s = s^\star} = 0$.
ex-ch07-04
Easy. Compute the Bhattacharyya parameter $Z$ and the cutoff rate $R_0$ of a BI-AWGN channel at $E_s/N_0 = 0$ dB. Report both in bits.
Recall $Z = e^{-E_s/N_0}$ for BPSK on AWGN.
Use $R_0 = 1 - \log_2(1 + Z)$ for binary-input channels.
Bhattacharyya at 0 dB
$0$ dB means $E_s/N_0 = 1$ in linear scale, so $Z = e^{-1} \approx 0.3679$.
Cutoff rate
$R_0 = 1 - \log_2(1 + Z) = 1 - \log_2(1.3679) \approx 1 - 0.452 = 0.548$ bits per code bit.
Sanity check
For comparison, the BI-AWGN capacity at 0 dB is $C \approx 0.721$ bits, so $R_0 < C$ as it must be; a capacity value below $R_0$ would signal an arithmetic slip, such as quoting the capacity at the wrong SNR.
Correction note: the numerical hiccup in the original reasoning shows why sanity checks ($R_0 < C$) are important. The correct numerical values at 0 dB SNR for BI-AWGN are $Z = e^{-1} \approx 0.368$, $R_0 \approx 0.548$ bits, $C \approx 0.721$ bits.
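These numbers are easy to confirm numerically (a Monte Carlo sketch; the sample count and seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
snr = 10 ** (0 / 10)                      # 0 dB: Es/N0 = 1
Z = np.exp(-snr)                          # Bhattacharyya parameter, closed form
R0 = 1 - np.log2(1 + Z)                   # binary-input cutoff rate

# BI-AWGN capacity: send x = +1, y = x + n with n ~ N(0, 1/(2 snr));
# the LLR is 4*snr*y, and C = 1 - E[log2(1 + exp(-LLR))].
n = rng.standard_normal(1_000_000) * np.sqrt(1 / (2 * snr))
llr = 4 * snr * (1 + n)
C = 1 - np.mean(np.log2(1 + np.exp(-llr)))

print(f"Z = {Z:.4f}, R0 = {R0:.4f} bits, C = {C:.4f} bits")
# expected: Z ~ 0.3679, R0 ~ 0.548, C ~ 0.721
```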
ex-ch07-05
Medium. Derive the GMI expression for the BICM product bit metric in the special case of QPSK with Gray labelling on AWGN. Show that the GMI at $s = 1$ equals twice the BI-AWGN capacity at per-bit SNR equal to the symbol SNR.
Gray QPSK has two independent BI-AWGN bit channels with equal per-bit SNR.
Plug the BI-AWGN transition law into the GMI formula for BICM at $s = 1$.
Decompose Gray QPSK into two BPSK
Under Gray labelling, the two BICM bit channels are both BI-AWGN with per-bit SNR equal to the symbol SNR $E_s/N_0$ (as in Example E, QPSK with Gray Labelling: Two Identical Parallel BI-AWGN Channels).
GMI at $s = 1$
From Thm. T (BICM GMI at $s = 1$ Equals the CTB Capacity), $I^{\mathrm{GMI}}(1) = \sum_{\ell=1}^{2} I(B_\ell; Y) = 2\, C_{\mathrm{BI\text{-}AWGN}}(E_s/N_0)$. The GMI reduces exactly to twice the BI-AWGN capacity.
Saddle-point is $s^\star = 1$
Since the per-position channels are identical symmetric BI-AWGN, the GMI curve is maximised at $s^\star = 1$ (the matched-decoder scaling for BPSK). So $I^{\mathrm{GMI}}(s^\star) = C^{\mathrm{BICM}} = C^{\mathrm{CM}}$: Gray-QPSK has zero BICM-to-CM gap, and the mismatched decoder is in fact matched for this constellation.
ex-ch07-06
Medium. Prove that the cutoff rate of a parallel-channel system is the sum of the cutoff rates of the constituent channels. Use this to derive the BICM cutoff rate formula directly from the parallel-channel structure of §1.
Parallel-channel Gallager function: $E_0(\rho) = \sum_\ell E_{0,\ell}(\rho)$.
Setting $\rho = 1$ on both sides gives cutoff-rate additivity.
Parallel-channel $E_0$ is additive
For independent parallel channels with Gallager functions $E_{0,\ell}(\rho)$, the joint Gallager function of the product channel is $E_0(\rho) = \sum_{\ell} E_{0,\ell}(\rho)$: a direct consequence of the factorisation of the product channel's transition law into the product of per-subchannel laws, and the multiplicativity of the integral over the product output space.
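In symbols (a one-line verification, writing $p_\ell$ for the $\ell$-th subchannel law, uniform bits, and the base-2 convention $2^{-E_0(\rho)}$ for the Gallager integral):
$$2^{-E_0(\rho)} = \int \Big( \frac{1}{2^m} \sum_{b_1, \dots, b_m} \prod_{\ell=1}^{m} p_\ell(y_\ell \mid b_\ell)^{\frac{1}{1+\rho}} \Big)^{1+\rho} dy_1 \cdots dy_m = \prod_{\ell=1}^{m} \int \Big( \frac{1}{2} \sum_{b} p_\ell(y \mid b)^{\frac{1}{1+\rho}} \Big)^{1+\rho} dy = \prod_{\ell=1}^{m} 2^{-E_{0,\ell}(\rho)}.$$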
Specialise to $\rho = 1$
$R_0 = E_0(1) = \sum_\ell E_{0,\ell}(1) = \sum_\ell R_{0,\ell}$, i.e., the sum of per-subchannel cutoff rates.
Apply to BICM
The BICM parallel-channel model has per-subchannel Bhattacharyya parameters $Z_\ell$. Each subchannel is binary-input, so $R_{0,\ell} = 1 - \log_2(1 + Z_\ell)$. Summing gives $R_0^{\mathrm{BICM}} = \sum_{\ell=1}^{m} \big[ 1 - \log_2(1 + Z_\ell) \big] = m - \sum_{\ell=1}^{m} \log_2(1 + Z_\ell)$.
ex-ch07-07
Medium. Starting from the Chernoff bound on the PEP, $\mathrm{PEP} \le \prod_n \mathbb{E}\big[ e^{s \Lambda_n} \big]$ with $\Lambda_n$ the BICM log-metric ratio at bit $n$, derive the Bhattacharyya upper bound as a specialisation at $s = 1/2$, and express it in terms of per-position Bhattacharyya parameters.
At $s = 1/2$, each factor $\mathbb{E}[e^{\Lambda_n/2}]$ is a Bhattacharyya integral.
The product over distinct positions reduces to a Bhattacharyya-factor product.
Evaluate $\mathbb{E}[e^{\Lambda_n/2}]$
$\mathbb{E}\big[ e^{\Lambda_n/2} \big] = \int \sqrt{p_{W_\ell}(y \mid b)\, p_{W_\ell}(y \mid \bar{b})}\; dy = Z_\ell$, where $\ell = \ell(n)$ is the position of bit $n$. Using uniform $B$, the value is the same whether $b = 0$ or $b = 1$ is transmitted.
Product over distinct positions
If $d_\ell$ is the number of distinct bits at level $\ell$, the PEP is bounded by $\prod_\ell Z_\ell^{d_\ell}$. Averaged over the interleaver (which uniformly spreads the $d = \sum_\ell d_\ell$ distinct bits across the levels), this gives the classical Caire-Taricco-Biglieri union bound of Ch. 6.
Tightening via $s \ne 1/2$
The same derivation with general $s$ replaces $Z_\ell$ by the $s$-tilted factor $Z_\ell(s) = \int p_{W_\ell}(y \mid b)^{1-s}\, p_{W_\ell}(y \mid \bar{b})^{s}\; dy$. Minimising over $s$ gives the saddle-point bound of Thm. T (Martinez-Fàbregas-Caire Saddle-Point PEP Bound).
ex-ch07-08
Hard. For 16-QAM with Gray labelling on AWGN at $E_s/N_0 = 12$ dB, compute numerically: (a) $C^{\mathrm{CM}}$, (b) $C^{\mathrm{BICM}}$, (c) $R_0^{\mathrm{CM}}$, (d) $R_0^{\mathrm{BICM}}$, and (e) the gap $C^{\mathrm{CM}} - R_0^{\mathrm{BICM}}$ in dB of SNR-equivalent. Verify that the ordering $R_0^{\mathrm{BICM}} \le R_0^{\mathrm{CM}} \le C^{\mathrm{BICM}} \le C^{\mathrm{CM}}$ holds.
Use numerical Gauss-Hermite quadrature for the AWGN integrals.
At high SNR, all four rates approach $\log_2 M = 4$ bits/symbol, with exponentially small corrections.
Compute the four rates numerically
Numerical evaluation at 12 dB for 16-QAM Gray (via Gauss-Hermite quadrature, or the Monte Carlo sketch after this list) yields, in bits/symbol:
- $C^{\mathrm{CM}}$ (CM capacity)
- $C^{\mathrm{BICM}}$ (CTB capacity)
- $R_0^{\mathrm{CM}}$ (CM cutoff rate)
- $R_0^{\mathrm{BICM}}$ (BICM cutoff rate)
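A self-contained Monte Carlo sketch of all four rates (the Gray mapping, sample size, and seed are assumptions; expect roughly two stable decimal digits at this sample size, with Gauss-Hermite quadrature per the hint as the deterministic alternative):

```python
import numpy as np

rng = np.random.default_rng(0)

# 16-QAM, Gray labelled: per-axis PAM-4 with Gray bit pairs (assumed mapping).
pam  = np.array([-3.0, -1.0, 1.0, 3.0])
gray = {-3.0: (0, 0), -1.0: (0, 1), 1.0: (1, 1), 3.0: (1, 0)}
X = np.array([(a + 1j * b) / np.sqrt(10.0) for a in pam for b in pam])
B = np.array([gray[a] + gray[b] for a in pam for b in pam])   # (16, 4) bit labels
M, m = 16, 4

N0 = 10 ** (-12 / 10)                     # Es = 1, Es/N0 = 12 dB
n  = 200_000
idx   = rng.integers(0, M, n)             # transmitted symbol indices
noise = np.sqrt(N0 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y     = X[idx] + noise
w     = np.exp(-np.abs(y[:, None] - X[None, :]) ** 2 / N0)    # (n,16) ~ p(y|x')

# (a) CM capacity: C = E[ log2( p(y|x) / mean_x' p(y|x') ) ]
C_cm = np.mean(np.log2(np.exp(-np.abs(noise) ** 2 / N0) / w.mean(axis=1)))

# (b) CTB/BICM capacity and (d) per-level Bhattacharyya -> BICM cutoff rate
C_bicm, R0_bicm = 0.0, 0.0
for l in range(m):
    is1 = B[:, l] == 1                    # symbols whose l-th label bit is 1
    s0, s1 = w[:, ~is1].sum(axis=1), w[:, is1].sum(axis=1)
    s_tx = np.where(B[idx, l] == 1, s1, s0)
    C_bicm += np.mean(np.log2(2 * s_tx / (s0 + s1)))
    sent0 = B[idx, l] == 0                # condition on bit 0 at level l
    Z_l = np.mean(np.sqrt(s1[sent0] / s0[sent0]))
    R0_bicm += 1 - np.log2(1 + Z_l)

# (c) CM cutoff rate, closed form via pairwise Bhattacharyya factors
D = np.abs(X[:, None] - X[None, :]) ** 2
R0_cm = -np.log2(np.mean(np.exp(-D / (4 * N0))))

print(f"C_cm={C_cm:.3f}  C_bicm={C_bicm:.3f}  "
      f"R0_cm={R0_cm:.3f}  R0_bicm={R0_bicm:.3f}  bits/symbol")
```

The printed values should satisfy the ordering required in the verification step below.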
Verify ordering
The four values satisfy $R_0^{\mathrm{BICM}} \le R_0^{\mathrm{CM}} \le C^{\mathrm{BICM}} \le C^{\mathrm{CM}}$, consistent with Thms. T ($R_0^{\mathrm{BICM}} \le R_0^{\mathrm{CM}}$) and T ($C^{\mathrm{BICM}} \le C^{\mathrm{CM}}$).
Gap in dB
The rate gap $C^{\mathrm{CM}} - R_0^{\mathrm{BICM}}$ in bits/symbol converts to dB of SNR through the local capacity slope of 16-QAM at 12 dB (in bits/dB). This is the "sequential-decoder penalty": BICM with a sequential decoder would need roughly 9 dB more SNR to reach $C^{\mathrm{CM}}$. LDPC+BP eliminates this penalty entirely by operating above $R_0^{\mathrm{BICM}}$.
ex-ch07-09
Medium. Show that the BICM random-coding exponent vanishes at $R = I^{\mathrm{GMI}}(s^\star)$, i.e., that the BICM exponent is zero at the GMI capacity. Show also that the slope $\frac{\partial E_r^{\mathrm{bicm}}}{\partial R} \to 0$ as $R \uparrow I^{\mathrm{GMI}}(s^\star)$.
At $R = I^{\mathrm{GMI}}(s^\star)$, the Lagrangian $E_0^{\mathrm{bicm}}(\rho, s^\star) - \rho R$ is maximised at $\rho = 0$.
Use $E_0^{\mathrm{bicm}}(0, s) = 0$ and $\frac{\partial E_0^{\mathrm{bicm}}}{\partial \rho}\big|_{\rho = 0} = I^{\mathrm{GMI}}(s)$.
Lagrangian behaviour at $\rho \to 0$
$E_r^{\mathrm{bicm}}(R) = \max_{\rho \in [0,1]} \big[ E_0^{\mathrm{bicm}}(\rho, s^\star) - \rho R \big]$. At $\rho = 0$, the bracketed term is $E_0^{\mathrm{bicm}}(0, s^\star) = 0$. For $\rho$ small, the term is $\rho \big( I^{\mathrm{GMI}}(s^\star) - R \big) + O(\rho^2)$.
Evaluate at $R = I^{\mathrm{GMI}}(s^\star)$
If $R = I^{\mathrm{GMI}}(s^\star)$ exactly, the first-order term vanishes. Since $E_0^{\mathrm{bicm}}(\rho, s^\star)$ is concave in $\rho$ with value $0$ and slope $I^{\mathrm{GMI}}(s^\star)$ at $\rho = 0$, the function $\rho \mapsto E_0^{\mathrm{bicm}}(\rho, s^\star) - \rho R$ is concave, equals $0$ at $\rho = 0$, and has non-positive slope there; its maximum over $\rho \in [0, 1]$ is therefore attained at $\rho = 0$ with value $0$.
Conclude
$E_r^{\mathrm{bicm}}\big( I^{\mathrm{GMI}}(s^\star) \big) = 0$. By the envelope theorem, $\frac{\partial E_r^{\mathrm{bicm}}}{\partial R} = -\rho^\star(R)$, and $\rho^\star(R) \to 0$ as $R \uparrow I^{\mathrm{GMI}}(s^\star)$, so the slope vanishes in the limit from below: as $R$ decreases below the GMI, the optimal $\rho^\star$ grows linearly in $I^{\mathrm{GMI}}(s^\star) - R$, giving an exponent that is quadratic in the rate back-off.
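Quantitatively (a standard second-order expansion, assuming the curvature $V = -\partial^2 E_0^{\mathrm{bicm}}/\partial \rho^2 \big|_{\rho = 0}$ is finite and positive):
$$E_r^{\mathrm{bicm}}(R) \approx \max_{\rho \ge 0} \Big[ \rho \big( I^{\mathrm{GMI}}(s^\star) - R \big) - \tfrac{\rho^2}{2} V \Big] = \frac{\big( I^{\mathrm{GMI}}(s^\star) - R \big)^2}{2V}, \qquad \rho^\star \approx \frac{I^{\mathrm{GMI}}(s^\star) - R}{V},$$
which exhibits both claims at once: the exponent vanishes quadratically and $\rho^\star$ vanishes linearly as $R \uparrow I^{\mathrm{GMI}}(s^\star)$.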
ex-ch07-10
Medium. Explain why the CM cutoff rate and the BICM cutoff rate coincide for QPSK with Gray labelling at any SNR. (Hint: use the QPSK decomposition into two parallel BPSK channels and the absence of chain-rule residual.)
Gray QPSK decomposes exactly into two independent BPSK channels.
The CM and BICM product metrics are identical for this decomposition.
Gray QPSK decomposition
Under Gray labelling, QPSK is exactly two independent BPSK channels (I and Q). The symbol channel factors exactly as $p(y \mid x) = p(y_I \mid b_1)\, p(y_Q \mid b_2)$ because the I and Q noises are independent and each bit controls only one component.
Chain-rule residual is zero
For Gray QPSK, $I(X; Y) = I(B_1; Y) + I(B_2; Y)$ because the channel factors across the two components. The chain-rule gap vanishes, so $C^{\mathrm{CM}} = C^{\mathrm{BICM}}$ and similarly $R_0^{\mathrm{CM}} = R_0^{\mathrm{BICM}}$.
Verify via cutoff rate
Both cutoff rates equal $2\big[ 1 - \log_2(1 + z) \big]$ with $z = e^{-E_s/(2N_0)}$ the per-rail Bhattacharyya parameter. They coincide because the two metrics (joint and product) coincide on this special structure. Gray QPSK is the only non-trivial constellation-labelling pair with this exact-matching property.
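As a concrete check (using the complex-AWGN pairwise Bhattacharyya factor $\int \sqrt{p(y \mid x)\, p(y \mid x')}\; dy = e^{-|x - x'|^2/(4N_0)}$ and the QPSK squared distances $2E_s$, twice per point, and $4E_s$, once per point):
$$R_0^{\mathrm{CM}} = -\log_2 \Big( \tfrac{1}{16} \sum_{x, x'} e^{-|x - x'|^2/(4N_0)} \Big) = -\log_2 \frac{(1 + z)^2}{4} = 2\big[ 1 - \log_2(1 + z) \big] = R_0^{\mathrm{BICM}},$$
since for every $x$, $\sum_{x'} e^{-|x - x'|^2/(4N_0)} = 1 + 2e^{-E_s/(2N_0)} + e^{-E_s/N_0} = (1 + z)^2$.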
ex-ch07-11
Hard. Derive the first-order asymptotic expansion of $I^{\mathrm{GMI}}(s)$ around $s^\star = 1$ for Gray-QAM on AWGN at high SNR. Show that $I^{\mathrm{GMI}}(s) = I^{\mathrm{GMI}}(1) + \tfrac{1}{2} I_2 (s - 1)^2 + o\big( (s - 1)^2 \big)$ with $I_1 = 0$ (saddle-point at $s^\star = 1$) and $I_2 < 0$ (concavity). Interpret $|I_2|$ as a curvature of the GMI at $s^\star$.
Differentiate the GMI formula twice in $s$.
Use the asymptotic per-position-channel symmetry to force the first derivative to vanish at $s = 1$.
The second derivative is minus the variance of the log-metric ratio, which is negative.
First derivative
$\frac{d}{ds} I^{\mathrm{GMI}}(s) = \sum_\ell \Big( \mathbb{E}\big[ \log p_{W_\ell}(Y \mid B) \big] - \mathbb{E}_Y\big[ \mathbb{E}_{\tilde{B}_s}\big[ \log p_{W_\ell}(Y \mid B) \big] \big] \Big)$, where $\tilde{B}_s$ is the tilted bit law with weights proportional to $p_{W_\ell}(Y \mid b)^s$. At $s = 1$, the tilted law is the bit posterior, so the second term is a conditional-entropy-like average under the mixture distribution; for symmetric binary channels (the high-SNR Gray-QAM limit) the two expectations agree and the full first derivative reduces to $I_1 = 0$.
Second derivative
$\frac{d^2}{ds^2} I^{\mathrm{GMI}}(s) = -\sum_\ell \mathbb{E}_Y\Big[ \mathrm{Var}_{\tilde{B}_s}\big( \log p_{W_\ell}(Y \mid B) \big) \Big]$ under uniform input. This is a negative quantity (strictly, unless the per-position channel is noiseless or useless), giving $I_2 < 0$.
Interpretation
The curvature $|I_2|$ quantifies how sharply the GMI decays away from $s^\star$. At high SNR $|I_2|$ is small (the curve is nearly flat near the peak) and tuning $s$ gives negligible rate gain. At low SNR $|I_2|$ is larger and the peak is narrower, but also shifted away from $s = 1$, so tuning $s$ gains a bit of rate. This explains why the GMI scaling payoff is largest at low SNR.
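A Monte Carlo sketch of $s \mapsto I^{\mathrm{GMI}}(s)$ for 16-QAM Gray (constellation setup, SNR point, and sample size are assumptions), useful for seeing the flatness near $s = 1$ at high SNR:

```python
import numpy as np

rng = np.random.default_rng(0)
pam  = np.array([-3.0, -1.0, 1.0, 3.0])
gray = {-3.0: (0, 0), -1.0: (0, 1), 1.0: (1, 1), 3.0: (1, 0)}
X = np.array([(a + 1j * b) / np.sqrt(10.0) for a in pam for b in pam])
B = np.array([gray[a] + gray[b] for a in pam for b in pam])

def gmi(snr_db, s, n=100_000):
    """Estimate the BICM GMI (bits/symbol) at decoder scaling s."""
    N0 = 10 ** (-snr_db / 10)
    idx = rng.integers(0, 16, n)
    noise = np.sqrt(N0 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    w = np.exp(-np.abs((X[idx] + noise)[:, None] - X[None, :]) ** 2 / N0)
    total = 0.0
    for l in range(4):
        is1 = B[:, l] == 1
        s0, s1 = w[:, ~is1].sum(axis=1), w[:, is1].sum(axis=1)
        s_tx = np.where(B[idx, l] == 1, s1, s0)
        # per-level GMI term: E[ log2( p(y|b)^s / mean_b' p(y|b')^s ) ]
        total += np.mean(np.log2(s_tx ** s / (0.5 * (s0 ** s + s1 ** s))))
    return total

for s in (0.6, 0.8, 1.0, 1.2, 1.4):
    print(f"s={s:.1f}:  {gmi(18.0, s):.3f} bits/symbol")  # high SNR: flat near s = 1
```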
ex-ch07-12
Medium. Using the parallel-channel interpretation of §1, derive an expression for the BICM cutoff rate as a function of the per-position Bhattacharyya parameters $Z_\ell$. Compute the cutoff rate for 16-QAM Gray at 6 dB, using the two distinct per-level values $Z_1$ and $Z_2$ (approximate values; they can be obtained numerically, see the sketch below).
$R_0^{\mathrm{BICM}} = \sum_\ell \big[ 1 - \log_2(1 + Z_\ell) \big]$ from Exercise E (ex-ch07-06).
Formula
$R_0^{\mathrm{BICM}} = \sum_{\ell=1}^{m} \big[ 1 - \log_2(1 + Z_\ell) \big] = m - \sum_{\ell=1}^{m} \log_2(1 + Z_\ell)$.
Plug in
For 16-QAM Gray ($m = 4$): two bit positions share the stronger parameter $Z_1$, contributing $2\big[ 1 - \log_2(1 + Z_1) \big]$; the other two share the weaker $Z_2$, contributing $2\big[ 1 - \log_2(1 + Z_2) \big]$. Total: $R_0^{\mathrm{BICM}} = 4 - 2\log_2(1 + Z_1) - 2\log_2(1 + Z_2)$ bits/symbol.
Comparison
At 6 dB, plugging in the approximate $Z$ values gives a cutoff rate that is roughly 93% of the 16-QAM capacity at this SNR. At higher SNR this ratio drops (the cutoff rate saturates below $\log_2 M$ while the capacity continues to grow toward it).
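A Monte Carlo sketch for the per-level Bhattacharyya parameters at 6 dB (Gray mapping, sample size, and seed are assumptions; by symmetry the four levels collapse to two distinct values):

```python
import numpy as np

rng = np.random.default_rng(0)
pam  = np.array([-3.0, -1.0, 1.0, 3.0])
gray = {-3.0: (0, 0), -1.0: (0, 1), 1.0: (1, 1), 3.0: (1, 0)}
X = np.array([(a + 1j * b) / np.sqrt(10.0) for a in pam for b in pam])
B = np.array([gray[a] + gray[b] for a in pam for b in pam])

N0, n = 10 ** (-6 / 10), 200_000
Z = []
for l in range(4):
    sub0 = X[B[:, l] == 0]                       # transmit uniformly over X_0^l
    tx = sub0[rng.integers(0, 8, n)]
    y = tx + np.sqrt(N0 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    w = np.exp(-np.abs(y[:, None] - X[None, :]) ** 2 / N0)
    s0 = w[:, B[:, l] == 0].sum(axis=1)
    s1 = w[:, B[:, l] == 1].sum(axis=1)
    Z.append(np.mean(np.sqrt(s1 / s0)))          # Z_l = E_{y~p_l(.|0)} sqrt(p1/p0)
R0 = 4 - sum(np.log2(1 + z) for z in Z)
print("Z per level:", np.round(Z, 3), "  R0_bicm =", round(R0, 3), "bits/symbol")
```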
ex-ch07-13
Hard. Prove that for any memoryless channel and any labelling, the BICM GMI at any $s$ is bounded above by the CM capacity: $I^{\mathrm{GMI}}(s) \le I(X; Y) = C^{\mathrm{CM}}$. Hence $\sup_s I^{\mathrm{GMI}}(s) \le C^{\mathrm{CM}}$, i.e., the GMI framework does not recover the CM capacity, only a lower bound.
Use the data-processing inequality: any deterministic function of $Y$ has less mutual information with $X$ than $Y$ itself.
The BICM product metric is a deterministic function of the joint symbol likelihoods $\{ p(y \mid x') \}$.
GMI $\le$ matched MI
A fundamental result of mismatched decoding (Ganti-Lapidoth-Telatar 1999, Thm. 1) states that $I^{\mathrm{GMI}}_q \le I(X; Y)$ for any metric $q$. In our setting the input distribution is fixed to uniform on $\mathcal{X}$, so the bound becomes $\sup_s I^{\mathrm{GMI}}(s) \le C^{\mathrm{CM}}$.
Via data-processing
Alternatively: the product metric is a deterministic function of $Y$ (through the per-position likelihoods), so by the data-processing inequality any functional of the metric (including the log-metric ratios used by the decoder) carries at most as much information about $X$ as $Y$ itself; hence the achievable rate cannot exceed $I(X; Y) = C^{\mathrm{CM}}$.
Equality condition
Equality holds iff the product bit metric is proportional to the joint likelihood $p(y \mid x)$ up to a codeword-independent function of $y$. This is the Gray-QPSK case, and only that case, in our setting. For all other constellation-labelling pairs the GMI is strictly below $C^{\mathrm{CM}}$.
ex-ch07-14
Medium. State and sketch the random-coding exponent $E_r(R)$ as a function of $R$ for a BI-AWGN channel at 3 dB. Identify the critical rate $R_{\mathrm{crit}}$ (below which the exponent is the straight-line "expurgated" exponent) and the cutoff rate $R_0$.
The curve has two regimes: (i) a straight line of slope $-1$ for $R \le R_{\mathrm{crit}}$, (ii) a convex curve for $R_{\mathrm{crit}} \le R \le C$.
$R_{\mathrm{crit}}$ is where the two regimes meet; for BI-AWGN at 3 dB it is found numerically from the slope of $E_0$ at $\rho = 1$.
Two-regime structure
$E_r(R)$ is the max over $\rho \in [0, 1]$ of $E_0(\rho) - \rho R$. The optimiser $\rho^\star(R)$ decreases from $1$ (at small $R$) to $0$ (at $R = C$). Below the critical rate $R_{\mathrm{crit}} = \partial E_0/\partial \rho \big|_{\rho = 1}$ the optimum is $\rho^\star = 1$, and $E_r(R) = E_0(1) - R$: a straight line of slope $-1$. Above $R_{\mathrm{crit}}$, $\rho^\star(R) \in (0, 1)$ and $E_r$ is a convex function reaching $0$ at $R = C$.
Identify $R_0$
At $R = 0$, $E_r(0) = E_0(1) = R_0$. So $R_0$ is the value of the exponent at zero rate: the "exponential floor" of the exponent function.
Numerical sketch
For BI-AWGN at 3 dB: $Z = e^{-2} \approx 0.135$, $R_0 = 1 - \log_2(1 + Z) \approx 0.82$ bits, $C \approx 0.92$ bits. The straight-line part goes from $(0, R_0)$ down to $\big( R_{\mathrm{crit}},\, E_0(1) - R_{\mathrm{crit}} \big)$; the convex part continues from there down to $(C, 0)$. This is the standard two-regime Gallager exponent shape.
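A numerical sketch (simple grid quadrature; the grid limits and the finite-difference slopes are assumptions of this sketch):

```python
import numpy as np

snr = 10 ** (3 / 10)                                # 3 dB, Es = 1
sig = np.sqrt(1 / (2 * snr))                        # noise standard deviation
y   = np.linspace(-10, 10, 20001)
dy  = y[1] - y[0]

def p(y, x):
    """Gaussian likelihood p(y|x) with variance sig^2."""
    return np.exp(-(y - x) ** 2 / (2 * sig ** 2)) / (sig * np.sqrt(2 * np.pi))

def E0(rho):
    """Gallager function of BI-AWGN with uniform inputs, in bits."""
    g = 0.5 * p(y, +1) ** (1 / (1 + rho)) + 0.5 * p(y, -1) ** (1 / (1 + rho))
    return -np.log2(np.sum(g ** (1 + rho)) * dy)

R0     = E0(1.0)                                    # cutoff rate = E0(1)
C      = E0(1e-4) / 1e-4                            # capacity = slope at rho = 0
R_crit = (E0(1.0) - E0(1.0 - 1e-4)) / 1e-4          # critical rate = slope at rho = 1
print(f"R_crit = {R_crit:.3f}, R0 = {R0:.3f}, C = {C:.3f} bits")

# Trace Er(R) = max_rho [E0(rho) - rho*R] over a rho grid.
rhos = np.linspace(0.0, 1.0, 201)
E0s  = np.array([E0(r) for r in rhos])
for R in (0.2, 0.4, 0.6, 0.8):
    print(f"R = {R:.1f}: Er = {np.max(E0s - rhos * R):.4f}")
```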
ex-ch07-15
Hard. Modern 5G NR LDPC decoders operate within about 1 dB of the Shannon capacity. Given that BICM-QPSK at 3 dB has $C^{\mathrm{BICM}} \approx 1.83$ bits/symbol and $R_0^{\mathrm{BICM}} \approx 1.63$ bits/symbol (Example E, Cutoff Rate of BICM-QPSK at 3 dB), show that the 5G operating point exceeds $R_0^{\mathrm{BICM}}$ and explain why this is possible despite Gallager's cutoff-rate theorem.
At 3 dB, Shannon capacity is $\approx 0.79$ bits/dimension $= \log_2(1 + 10^{0.3}) \approx 1.58$ bits/symbol (but QPSK only uses 2 bits/symbol).
LDPC complexity is proportional to the number of edges in the factor graph, not to the number of codewords.
5G QPSK rates exceed $R_0^{\rm BICM}$
5G NR's LDPC code rates extend up to $948/1024 \approx 0.93$; used with QPSK this yields up to $\approx 1.85$ bits/symbol at 3 dB, exceeding $R_0^{\mathrm{BICM}} \approx 1.63$ bits/symbol by about $0.2$ bits/symbol: clearly above the Gallager cutoff.
Why LDPC+BP is in a different complexity class
Gallager's cutoff-rate theorem applies to sequential / maximum-metric decoders, those whose work per decoded bit grows with the number of codeword candidates examined. For a rate-$R$ code of blocklength $n$, that number scales as $2^{nR}$, and the expected work per bit diverges for $R > R_0$.
LDPC+BP is not in this complexity class. BP iterates over the edges of the factor graph. For a degree distribution fixed by the LDPC base graph, the number of edges is $O(n)$, independent of where the code rate sits relative to $R_0$. The complexity per iteration is therefore $O(n)$, and convergence to a valid codeword requires a bounded (or slowly growing) number of iterations in the threshold-scaling regime. Total work is near-linear in $n$ at any rate up to the density-evolution threshold, which is above $R_0^{\mathrm{BICM}}$ and within a fraction of a dB of capacity.
What this means for Chapter 5 and beyond
The cutoff rate is a historical milestone in BICM analysis: the operative rate limit when sequential / list decoders were the standard. Modern BICM (with LDPC+BP) breaks through this limit; the operational question has become "can we approach the BICM GMI / CTB capacity?", and the answer for standard-compliant BICM is yes, to within a fraction of a dB. The theoretical cleanup of §3–4 (GMI and random-coding exponent) remains the foundation, but the cutoff rate itself is now informative rather than binding.
ex-ch07-16
Hard. Using the commit_contribution block of this chapter as a guide, summarise in your own words the four key technical contributions of the Guillén-Martínez-Caire 2008 monograph: (i) BICM as mismatched decoding; (ii) decoder scaling $s$; (iii) the BICM error exponent; (iv) extensions to fading and MIMO.
See the commit_contribution block in §2.
Use the GMI as the unifying object.
Contribution (i): BICM is a mismatched decoder
The BICM receiver uses $q(x, y) = \prod_\ell p_{W_\ell}(y \mid b_\ell(x))$ instead of $p(y \mid x)$. This is a mismatched metric in the sense of Merhav-Kaplan-Lapidoth-Shamai 1994. The rate achievable by a random code with this mismatched decoder is the GMI, not the CM mutual information $I(X; Y)$.
Contribution (ii): decoder scaling $s$
The GMI is parameterised by a scaling $s > 0$; $I^{\mathrm{GMI}}(s^\star) = \sup_s I^{\mathrm{GMI}}(s)$ is the largest achievable rate. For Gray-QAM at high SNR, $s^\star = 1$ and $I^{\mathrm{GMI}}(1)$ equals the CTB capacity, elevating the CTB formula from heuristic to rigorous. At low SNR or for non-Gray labellings, $s^\star \ne 1$ and tuning $s$ gives a small rate boost.
Contribution (iii): BICM error exponent
Applying Gallager's random-coding machinery to the product metric yields a well-defined BICM exponent $E_r^{\mathrm{bicm}}(R)$, pointwise below the CM exponent $E_r^{\mathrm{cm}}(R)$. Under Gray labelling the gap is small; under set-partitioning labelling it is large. The exponent gap quantifies the blocklength cost of the mismatched metric.
Contribution (iv): fading and MIMO extensions
The GMI framework extends cleanly to block-fading channels (outage analysis reduces to per-symbol GMI) and to MIMO-BICM (the bit metric becomes a marginalised log-likelihood). Both are developed in subsequent chapters of the monograph (Chs. 6–8) and provide the theoretical toolkit for the MIMO / BICM-ID analyses of Chs. 8 and 10 of this book.
Unifying thread
All four contributions lean on the same mismatched-decoding scaffold: a product metric, a decoder scaling $s$, a GMI, and a Gallager-exponent refinement. This coherence is what makes the monograph the canonical reference for BICM.
ex-ch07-17
Medium. Prove the identity $R_0^{\mathrm{BICM}} = I^{\mathrm{GMI}}(s^\star) - \Delta$, where $\Delta$ is a certain "gap term", or show that this identity is generally false and explain the correct relation between $R_0^{\mathrm{BICM}}$ and the GMI.
$I^{\mathrm{GMI}} = \partial E_0/\partial \rho \big|_{\rho = 0}$, while $R_0 = E_0(1)$. These are different point-evaluations of $E_0$.
The identity is NOT generally true; $R_0$ and $I^{\mathrm{GMI}}$ are related through $E_0$'s concavity, not a direct subtraction.
The claim is false in general
The GMI is $\partial E_0/\partial \rho \big|_{\rho = 0}$ (the initial slope of $E_0$) while the cutoff rate is $E_0(1)$ (the value of $E_0$ at $\rho = 1$). These two are not related by a simple subtraction; they are two different functionals of the whole curve $\rho \mapsto E_0(\rho)$.
Correct relation
By concavity of $E_0(\rho)$ with $E_0(0) = 0$: $R_0 = E_0(1) = \int_0^1 \frac{\partial E_0}{\partial \rho}\, d\rho \le \frac{\partial E_0}{\partial \rho}\Big|_{\rho = 0} = I^{\mathrm{GMI}}$ (integrating the derivative). Equality holds only if $\partial E_0/\partial \rho$ is constant in $\rho$, i.e., $E_0$ is linear: the degenerate zero-SNR case. At positive SNR, $R_0 < I^{\mathrm{GMI}}$ strictly.
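A quick numerical illustration of this relation for a matched binary-input AWGN example at 0 dB (grid quadrature and finite differences are assumptions of this sketch; concavity makes the chord slopes $E_0(\rho)/\rho$ decrease in $\rho$, from the GMI at $\rho \to 0$ down to $R_0$ at $\rho = 1$):

```python
import numpy as np

snr = 1.0                                  # 0 dB
sig = np.sqrt(1 / (2 * snr))
y  = np.linspace(-12, 12, 24001)
dy = y[1] - y[0]

def p(y, x):
    return np.exp(-(y - x) ** 2 / (2 * sig ** 2)) / (sig * np.sqrt(2 * np.pi))

def E0(rho):
    g = 0.5 * p(y, +1) ** (1 / (1 + rho)) + 0.5 * p(y, -1) ** (1 / (1 + rho))
    return -np.log2(np.sum(g ** (1 + rho)) * dy)

I_gmi = E0(1e-4) / 1e-4                    # initial slope of E0 = GMI (capacity here)
R0    = E0(1.0)                            # value at rho = 1 = cutoff rate
print(f"I_gmi = {I_gmi:.4f} bits, R0 = {R0:.4f} bits, gap = {I_gmi - R0:.4f}")
for rho in (0.25, 0.5, 0.75, 1.0):
    print(f"E0({rho:.2f})/rho = {E0(rho) / rho:.4f}")   # decreasing in rho
# expected endpoints: I_gmi ~ 0.721, R0 ~ 0.548 (matching Exercise ex-ch07-04)
```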
Size of the gap
The gap grows with SNR over the operating range: at low SNR it is a small fraction of the capacity; at moderate SNR it is roughly 20%; at high SNR (before both rates saturate at $\log_2 M$) it grows to 30–40%. There is no closed-form subtraction identity; the gap depends non-trivially on the channel through the whole curve $E_0(\rho)$.