Exercises
ex-ch07-01
Easy. State the BICM product bit metric $q(x,y)$ for an arbitrary constellation $\mathcal{X}$ of size $M = 2^m$ with labelling $\mu$, and write down the decoder decision rule in terms of per-symbol log-LLRs. Why is this metric "mismatched"?
Use Definition D (The BICM Product Bit Metric).
Log the product; express the decision rule in terms of the bits that equal 1.
Compare with the true likelihood $p(y \mid x)$.
The metric
$$q(x, y) = \prod_{\ell=1}^{m} p_{W_\ell}\big(y \mid b_\ell(x)\big), \qquad p_{W_\ell}(y \mid b) = \frac{1}{2^{m-1}} \sum_{x' \in \mathcal{X}_b^\ell} p(y \mid x'),$$
where $b_\ell(x)$ is the $\ell$-th label bit of $x$ and $\mathcal{X}_b^\ell$ is the subset of symbols whose $\ell$-th label bit equals $b$.
Log-LLR form
The decoder maximises $\sum_n \log q(x_n, y_n) = \sum_n \sum_{\ell} \log p_{W_\ell}(y_n \mid b_\ell(x_n))$ across all received symbols over the codebook; equivalently, writing $L_{n,\ell} = \log \frac{p_{W_\ell}(y_n \mid 1)}{p_{W_\ell}(y_n \mid 0)}$, it maximises $\sum_{n,\ell} b_\ell(x_n)\, L_{n,\ell}$ up to codeword-independent terms.
Mismatch
The true likelihood is $p(y \mid x)$, which encodes the joint Euclidean geometry of $y$ relative to $x$. The product metric only uses the marginal per-position laws $p_{W_\ell}(y \mid b)$, which average over the other bits. For general constellation-labelling pairs this averaging destroys information; the two metrics differ on a set of positive probability.
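A minimal numerical sketch of this mismatch (assuming a unit-energy Gray-labelled 4-PAM constellation and an illustrative noise level; both are assumptions, not values from the chapter): it evaluates $\log p(y \mid x)$ and $\log q(x, y)$ at the same output and shows they differ by an $x$-dependent amount.

```python
import numpy as np

# 4-PAM with Gray labels (00, 01, 11, 10), normalised to unit average energy.
X = np.array([-3.0, -1.0, 1.0, 3.0]) / np.sqrt(5.0)
B = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
N0 = 0.5                                   # illustrative noise level (assumption)

def p_y_given_x(y, x):
    """True symbol likelihood p(y|x) for real AWGN with variance N0/2."""
    return np.exp(-(y - x) ** 2 / N0) / np.sqrt(np.pi * N0)

def p_bit(y, ell, b):
    """Marginal bit-channel law p_{W_ell}(y|b): average over symbols with b_ell = b."""
    return p_y_given_x(y, X[B[:, ell] == b]).mean()

y = 0.7                                    # an arbitrary channel output
for i, x in enumerate(X):
    q = np.prod([p_bit(y, ell, B[i, ell]) for ell in range(2)])
    print(f"x={x:+.3f}   log p(y|x)={np.log(p_y_given_x(y, x)):+.3f}   "
          f"log q(x,y)={np.log(q):+.3f}")
# The two columns differ, and not by an x-independent constant:
# the product metric is genuinely mismatched to the true likelihood.
```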
ex-ch07-02
Easy. State the definition of the generalised mutual information $I^{\mathrm{GMI}}(s)$ for a channel $p(y \mid x)$ with uniform input and mismatched metric $q(x, y)$. Specialise to the matched case $q = p$, $s = 1$, and verify you recover Shannon's mutual information $I(X; Y)$.
GMI
$$I^{\mathrm{GMI}}(s) = \mathbb{E}\left[ \log \frac{q(X, Y)^s}{\mathbb{E}_{X'}\big[ q(X', Y)^s \big]} \right], \qquad s > 0,$$
with $X$ uniform on $\mathcal{X}$, $Y$ the channel output given $X$, and $X'$ an independent copy of $X$.
Matched case
At $q = p$, $s = 1$: the numerator is $p(Y \mid X)$ and the denominator is $\mathbb{E}_{X'}[p(Y \mid X')] = p(Y)$. So $I^{\mathrm{GMI}} = \mathbb{E}\big[ \log \frac{p(Y \mid X)}{p(Y)} \big] = I(X; Y)$, the classical mutual information.
ex-ch07-03
Medium. Prove from first principles that $I^{\mathrm{GMI}}(s)$ is concave in $s$ for the BICM product metric with uniform input. (Hint: rewrite the GMI as an expectation of a log-moment-generating function, which is convex in the moment parameter.)
Write $I^{\mathrm{GMI}}(s) = \sum_\ell \big( s\, \mathbb{E}[\log p_{W_\ell}(Y \mid B)] - \mathbb{E}_Y\big[ \log \mathbb{E}_B[p_{W_\ell}(Y \mid B)^s] \big] \big)$ after using uniformity.
Differentiate twice in $s$ and use the Cauchy-Schwarz / log-convexity property of moment-generating functions.
GMI in log-MGF form
Since the label bits of a uniform $X'$ are i.i.d. uniform, the denominator factorises across levels, and per bit level $I_\ell^{\mathrm{GMI}}(s) = s\, \mathbb{E}\big[ \log p_{W_\ell}(Y \mid B) \big] - \mathbb{E}_Y\big[ \log \mathbb{E}_B\big[ p_{W_\ell}(Y \mid B)^s \big] \big]$: the second term is, for each $y$, the log-MGF in $s$ of the random variable $\log p_{W_\ell}(Y \mid B)$ under uniform $B$.
Log-MGF is convex
For any random variable $V$, $s \mapsto \log \mathbb{E}[e^{sV}]$ is convex (it is the cumulant-generating function). So $\log \mathbb{E}_B[p_{W_\ell}(Y \mid B)^s]$ is convex in $s$, and its negative is concave in $s$. The linear first term preserves concavity, and summing over $\ell$ preserves it.
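For completeness, the convexity follows from one differentiation (a standard cumulant identity): the second derivative of the log-MGF is a variance under the exponentially tilted law,
$$\frac{d^2}{ds^2} \log \mathbb{E}\big[ e^{sV} \big] = \mathrm{Var}_{\tilde{P}_s}(V) \ge 0, \qquad \frac{d\tilde{P}_s}{dP}(v) = \frac{e^{sv}}{\mathbb{E}[e^{sV}]},$$
applied here with $V = \log p_{W_\ell}(Y \mid B)$ for each fixed $y$.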
Conclusion
$I^{\mathrm{GMI}}(s)$ is the sum of concave-in-$s$ terms, hence concave in $s$. Its supremum is attained at an interior saddle-point $s^\star$ satisfying $\frac{d}{ds} I^{\mathrm{GMI}}(s) \big|_{s = s^\star} = 0$.
ex-ch07-04
Easy. Compute the Bhattacharyya parameter $Z$ and the cutoff rate $R_0$ of a BI-AWGN channel at $E_s/N_0 = 0$ dB. Report both in bits.
Recall $Z = e^{-E_s/N_0}$ for BPSK on AWGN.
Use $R_0 = 1 - \log_2(1 + Z)$ for binary-input channels.
Bhattacharyya at 0 dB
$0$ dB means $E_s/N_0 = 1$ in linear scale, so $Z = e^{-1} \approx 0.3679$.
Cutoff rate
$R_0 = 1 - \log_2(1 + Z) = 1 - \log_2(1.3679) \approx 1 - 0.452 = 0.548$ bits per code bit.
Sanity check
For comparison, the BI-AWGN capacity at 0 dB is $C \approx 0.721$ bits, so $R_0 < C$ as it must be; a capacity value below $R_0$ would signal an arithmetic slip, such as quoting the capacity at the wrong SNR.
Correction note: the numerical hiccup in the original reasoning shows why sanity checks ($R_0 < C$) are important. The correct numerical values at 0 dB SNR for BI-AWGN are $Z = e^{-1} \approx 0.368$, $R_0 \approx 0.548$ bits, $C \approx 0.721$ bits.
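These numbers are easy to confirm numerically (a Monte Carlo sketch; the sample count and seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
snr = 10 ** (0 / 10)                      # 0 dB: Es/N0 = 1
Z = np.exp(-snr)                          # Bhattacharyya parameter, closed form
R0 = 1 - np.log2(1 + Z)                   # binary-input cutoff rate

# BI-AWGN capacity: send x = +1, y = x + n with n ~ N(0, 1/(2 snr));
# the LLR is 4*snr*y, and C = 1 - E[log2(1 + exp(-LLR))].
n = rng.standard_normal(1_000_000) * np.sqrt(1 / (2 * snr))
llr = 4 * snr * (1 + n)
C = 1 - np.mean(np.log2(1 + np.exp(-llr)))

print(f"Z = {Z:.4f}, R0 = {R0:.4f} bits, C = {C:.4f} bits")
# expected: Z ~ 0.3679, R0 ~ 0.548, C ~ 0.721
```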
ex-ch07-05
Medium. Derive the GMI expression for the BICM product bit metric in the special case of QPSK with Gray labelling on AWGN. Show that the GMI at $s = 1$ equals twice the BI-AWGN capacity at per-bit SNR equal to the symbol SNR.
Gray QPSK has two independent BI-AWGN bit channels with equal per-bit SNR.
Plug the BI-AWGN transition law into the GMI formula for BICM at $s = 1$.
Decompose Gray QPSK into two BPSK
Under Gray labelling, the two BICM bit channels are both BI-AWGN with per-bit SNR equal to the symbol SNR $E_s/N_0$ (as in Example E, QPSK with Gray Labelling: Two Identical Parallel BI-AWGN Channels).
GMI at $s = 1$
From Thm. T (BICM GMI at $s = 1$ Equals the CTB Capacity), $I^{\mathrm{GMI}}(1) = \sum_{\ell=1}^{2} I(B_\ell; Y) = 2\, C_{\mathrm{BI\text{-}AWGN}}(E_s/N_0)$. The GMI reduces exactly to twice the BI-AWGN capacity.
Saddle-point is $s^\star = 1$
Since the per-position channels are identical symmetric BI-AWGN, the GMI curve is maximised at $s^\star = 1$ (the matched-decoder scaling for BPSK). So $I^{\mathrm{GMI}}(s^\star) = C^{\mathrm{BICM}} = C^{\mathrm{CM}}$: Gray-QPSK has zero BICM-to-CM gap, and the mismatched decoder is in fact matched for this constellation.
ex-ch07-06
Medium. Prove that the cutoff rate of a parallel-channel system is the sum of the cutoff rates of the constituent channels. Use this to derive the BICM cutoff rate formula directly from the parallel-channel structure of §1.
Parallel-channel Gallager function: $E_0(\rho) = \sum_\ell E_{0,\ell}(\rho)$.
Setting $\rho = 1$ on both sides gives cutoff-rate additivity.
Parallel-channel $E_0$ is additive
For independent parallel channels with Gallager functions $E_{0,\ell}(\rho)$, the joint Gallager function of the product channel is $E_0(\rho) = \sum_{\ell} E_{0,\ell}(\rho)$: a direct consequence of the factorisation of the product channel's transition law into the product of per-subchannel laws, and the multiplicativity of the integral over the product output space.
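In symbols (a one-line verification, writing $p_\ell$ for the $\ell$-th subchannel law, uniform bits, and the base-2 convention $2^{-E_0(\rho)}$ for the Gallager integral):
$$2^{-E_0(\rho)} = \int \Big( \frac{1}{2^m} \sum_{b_1, \dots, b_m} \prod_{\ell=1}^{m} p_\ell(y_\ell \mid b_\ell)^{\frac{1}{1+\rho}} \Big)^{1+\rho} dy_1 \cdots dy_m = \prod_{\ell=1}^{m} \int \Big( \frac{1}{2} \sum_{b} p_\ell(y \mid b)^{\frac{1}{1+\rho}} \Big)^{1+\rho} dy = \prod_{\ell=1}^{m} 2^{-E_{0,\ell}(\rho)}.$$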
Specialise to $\rho = 1$
$R_0 = E_0(1) = \sum_\ell E_{0,\ell}(1) = \sum_\ell R_{0,\ell}$, i.e., the sum of per-subchannel cutoff rates.
Apply to BICM
The BICM parallel-channel model has per-subchannel Bhattacharyya parameters $Z_\ell$. Each subchannel is binary-input, so $R_{0,\ell} = 1 - \log_2(1 + Z_\ell)$. Summing gives $R_0^{\mathrm{BICM}} = \sum_{\ell=1}^{m} \big[ 1 - \log_2(1 + Z_\ell) \big] = m - \sum_{\ell=1}^{m} \log_2(1 + Z_\ell)$.
ex-ch07-07
Medium. Starting from the Chernoff bound on the PEP, $\mathrm{PEP} \le \prod_n \mathbb{E}\big[ e^{s \Lambda_n} \big]$ with $\Lambda_n$ the BICM log-metric ratio at bit $n$, derive the Bhattacharyya upper bound as a specialisation at $s = 1/2$, and express it in terms of per-position Bhattacharyya parameters.
At $s = 1/2$, each factor $\mathbb{E}[e^{\Lambda_n/2}]$ is a Bhattacharyya integral.
The product over distinct positions reduces to a Bhattacharyya-factor product.
Evaluate $\mathbb{E}[e^{\Lambda_n/2}]$
$\mathbb{E}\big[ e^{\Lambda_n/2} \big] = \int \sqrt{p_{W_\ell}(y \mid b)\, p_{W_\ell}(y \mid \bar{b})}\; dy = Z_\ell$, where $\ell = \ell(n)$ is the position of bit $n$. Using uniform $B$, the value is the same whether $b = 0$ or $b = 1$ is transmitted.
Product over distinct positions
If $d_\ell$ is the number of distinct bits at level $\ell$, the PEP is bounded by $\prod_\ell Z_\ell^{d_\ell}$. Averaged over the interleaver (which uniformly spreads the $d = \sum_\ell d_\ell$ distinct bits across the levels), this gives the classical Caire-Taricco-Biglieri union bound of Ch. 6.
Tightening via $s \ne 1/2$
The same derivation with general $s$ replaces $Z_\ell$ by the $s$-tilted factor $Z_\ell(s) = \int p_{W_\ell}(y \mid b)^{1-s}\, p_{W_\ell}(y \mid \bar{b})^{s}\; dy$. Minimising over $s$ gives the saddle-point bound of Thm. T (Martinez-Fàbregas-Caire Saddle-Point PEP Bound).
ex-ch07-08
Hard. For 16-QAM with Gray labelling on AWGN at $E_s/N_0 = 12$ dB, compute numerically: (a) $C^{\mathrm{CM}}$, (b) $C^{\mathrm{BICM}}$, (c) $R_0^{\mathrm{CM}}$, (d) $R_0^{\mathrm{BICM}}$, and (e) the gap $C^{\mathrm{CM}} - R_0^{\mathrm{BICM}}$ in dB of SNR-equivalent. Verify that the ordering $R_0^{\mathrm{BICM}} \le R_0^{\mathrm{CM}} \le C^{\mathrm{BICM}} \le C^{\mathrm{CM}}$ holds.
Use numerical Gauss-Hermite quadrature for the AWGN integrals.
At high SNR, all four rates approach $\log_2 M = 4$ bits/symbol, with exponentially small corrections.
Compute the four rates numerically
Numerical evaluation at 12 dB for 16-QAM Gray (via Gauss-Hermite quadrature, or the Monte Carlo sketch after this list) yields, in bits/symbol:
- $C^{\mathrm{CM}}$ (CM capacity)
- $C^{\mathrm{BICM}}$ (CTB capacity)
- $R_0^{\mathrm{CM}}$ (CM cutoff rate)
- $R_0^{\mathrm{BICM}}$ (BICM cutoff rate)
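A self-contained Monte Carlo sketch of all four rates (the Gray mapping, sample size, and seed are assumptions; expect roughly two stable decimal digits at this sample size, with Gauss-Hermite quadrature per the hint as the deterministic alternative):

```python
import numpy as np

rng = np.random.default_rng(0)

# 16-QAM, Gray labelled: per-axis PAM-4 with Gray bit pairs (assumed mapping).
pam  = np.array([-3.0, -1.0, 1.0, 3.0])
gray = {-3.0: (0, 0), -1.0: (0, 1), 1.0: (1, 1), 3.0: (1, 0)}
X = np.array([(a + 1j * b) / np.sqrt(10.0) for a in pam for b in pam])
B = np.array([gray[a] + gray[b] for a in pam for b in pam])   # (16, 4) bit labels
M, m = 16, 4

N0 = 10 ** (-12 / 10)                     # Es = 1, Es/N0 = 12 dB
n  = 200_000
idx   = rng.integers(0, M, n)             # transmitted symbol indices
noise = np.sqrt(N0 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y     = X[idx] + noise
w     = np.exp(-np.abs(y[:, None] - X[None, :]) ** 2 / N0)    # (n,16) ~ p(y|x')

# (a) CM capacity: C = E[ log2( p(y|x) / mean_x' p(y|x') ) ]
C_cm = np.mean(np.log2(np.exp(-np.abs(noise) ** 2 / N0) / w.mean(axis=1)))

# (b) CTB/BICM capacity and (d) per-level Bhattacharyya -> BICM cutoff rate
C_bicm, R0_bicm = 0.0, 0.0
for l in range(m):
    is1 = B[:, l] == 1                    # symbols whose l-th label bit is 1
    s0, s1 = w[:, ~is1].sum(axis=1), w[:, is1].sum(axis=1)
    s_tx = np.where(B[idx, l] == 1, s1, s0)
    C_bicm += np.mean(np.log2(2 * s_tx / (s0 + s1)))
    sent0 = B[idx, l] == 0                # condition on bit 0 at level l
    Z_l = np.mean(np.sqrt(s1[sent0] / s0[sent0]))
    R0_bicm += 1 - np.log2(1 + Z_l)

# (c) CM cutoff rate, closed form via pairwise Bhattacharyya factors
D = np.abs(X[:, None] - X[None, :]) ** 2
R0_cm = -np.log2(np.mean(np.exp(-D / (4 * N0))))

print(f"C_cm={C_cm:.3f}  C_bicm={C_bicm:.3f}  "
      f"R0_cm={R0_cm:.3f}  R0_bicm={R0_bicm:.3f}  bits/symbol")
```

The printed values should satisfy the ordering required in the verification step below.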
Verify ordering
The four values satisfy $R_0^{\mathrm{BICM}} \le R_0^{\mathrm{CM}} \le C^{\mathrm{BICM}} \le C^{\mathrm{CM}}$, consistent with Thms. T ($R_0^{\mathrm{BICM}} \le R_0^{\mathrm{CM}}$) and T ($C^{\mathrm{BICM}} \le C^{\mathrm{CM}}$).
Gap in dB
The rate gap $C^{\mathrm{CM}} - R_0^{\mathrm{BICM}}$ in bits/symbol converts to dB of SNR through the local capacity slope of 16-QAM at 12 dB (in bits/dB). This is the "sequential-decoder penalty": BICM with a sequential decoder would need roughly 9 dB more SNR to reach $C^{\mathrm{CM}}$. LDPC+BP eliminates this penalty entirely by operating above $R_0^{\mathrm{BICM}}$.
ex-ch07-09
Medium. Show that the BICM random-coding exponent vanishes at $R = I^{\mathrm{GMI}}(s^\star)$, i.e., that the BICM exponent is zero at the GMI capacity. Show also that the slope $\frac{\partial E_r^{\mathrm{bicm}}}{\partial R} \to 0$ as $R \uparrow I^{\mathrm{GMI}}(s^\star)$.
At $R = I^{\mathrm{GMI}}(s^\star)$, the Lagrangian $E_0^{\mathrm{bicm}}(\rho, s^\star) - \rho R$ is maximised at $\rho = 0$.
Use $E_0^{\mathrm{bicm}}(0, s) = 0$ and $\frac{\partial E_0^{\mathrm{bicm}}}{\partial \rho}\big|_{\rho = 0} = I^{\mathrm{GMI}}(s)$.
Lagrangian behaviour at $\rho \to 0$
$E_r^{\mathrm{bicm}}(R) = \max_{\rho \in [0,1]} \big[ E_0^{\mathrm{bicm}}(\rho, s^\star) - \rho R \big]$. At $\rho = 0$, the bracketed term is $E_0^{\mathrm{bicm}}(0, s^\star) = 0$. For $\rho$ small, the term is $\rho \big( I^{\mathrm{GMI}}(s^\star) - R \big) + O(\rho^2)$.
Evaluate at $R = I^{\mathrm{GMI}}(s^\star)$
If $R = I^{\mathrm{GMI}}(s^\star)$ exactly, the first-order term vanishes. Since $E_0^{\mathrm{bicm}}(\rho, s^\star)$ is concave in $\rho$ with value $0$ and slope $I^{\mathrm{GMI}}(s^\star)$ at $\rho = 0$, the function $\rho \mapsto E_0^{\mathrm{bicm}}(\rho, s^\star) - \rho R$ is concave, equals $0$ at $\rho = 0$, and has non-positive slope there; its maximum over $\rho \in [0, 1]$ is therefore attained at $\rho = 0$ with value $0$.
Conclude
$E_r^{\mathrm{bicm}}\big( I^{\mathrm{GMI}}(s^\star) \big) = 0$. By the envelope theorem, $\frac{\partial E_r^{\mathrm{bicm}}}{\partial R} = -\rho^\star(R)$, and $\rho^\star(R) \to 0$ as $R \uparrow I^{\mathrm{GMI}}(s^\star)$, so the slope vanishes in the limit from below: as $R$ decreases below the GMI, the optimal $\rho^\star$ grows linearly in $I^{\mathrm{GMI}}(s^\star) - R$, giving an exponent that is quadratic in the rate back-off.
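Quantitatively (a standard second-order expansion, assuming the curvature $V = -\partial^2 E_0^{\mathrm{bicm}}/\partial \rho^2 \big|_{\rho = 0}$ is finite and positive):
$$E_r^{\mathrm{bicm}}(R) \approx \max_{\rho \ge 0} \Big[ \rho \big( I^{\mathrm{GMI}}(s^\star) - R \big) - \tfrac{\rho^2}{2} V \Big] = \frac{\big( I^{\mathrm{GMI}}(s^\star) - R \big)^2}{2V}, \qquad \rho^\star \approx \frac{I^{\mathrm{GMI}}(s^\star) - R}{V},$$
which exhibits both claims at once: the exponent vanishes quadratically and $\rho^\star$ vanishes linearly as $R \uparrow I^{\mathrm{GMI}}(s^\star)$.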
ex-ch07-10
Medium. Explain why the CM cutoff rate and the BICM cutoff rate coincide for QPSK with Gray labelling at any SNR. (Hint: use the QPSK decomposition into two parallel BPSK channels and the absence of chain-rule residual.)
Gray QPSK decomposes exactly into two independent BPSK channels.
The CM and BICM product metrics are identical for this decomposition.
Gray QPSK decomposition
Under Gray labelling, QPSK is exactly two independent BPSK channels (I and Q). The symbol channel factors exactly as $p(y \mid x) = p(y_I \mid b_1)\, p(y_Q \mid b_2)$ because the I and Q noises are independent and each bit controls only one component.
Chain-rule residual is zero
For Gray QPSK, $I(X; Y) = I(B_1; Y) + I(B_2; Y)$ because the channel factors across the two components. The chain-rule gap vanishes, so $C^{\mathrm{CM}} = C^{\mathrm{BICM}}$ and similarly $R_0^{\mathrm{CM}} = R_0^{\mathrm{BICM}}$.
Verify via cutoff rate
Both cutoff rates equal $2\big[ 1 - \log_2(1 + z) \big]$ with $z = e^{-E_s/(2N_0)}$ the per-rail Bhattacharyya parameter. They coincide because the two metrics (joint and product) coincide on this special structure. Gray QPSK is the only non-trivial constellation-labelling pair with this exact-matching property.
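As a concrete check (using the complex-AWGN pairwise Bhattacharyya factor $\int \sqrt{p(y \mid x)\, p(y \mid x')}\; dy = e^{-|x - x'|^2/(4N_0)}$ and the QPSK squared distances $2E_s$, twice per point, and $4E_s$, once per point):
$$R_0^{\mathrm{CM}} = -\log_2 \Big( \tfrac{1}{16} \sum_{x, x'} e^{-|x - x'|^2/(4N_0)} \Big) = -\log_2 \frac{(1 + z)^2}{4} = 2\big[ 1 - \log_2(1 + z) \big] = R_0^{\mathrm{BICM}},$$
since for every $x$, $\sum_{x'} e^{-|x - x'|^2/(4N_0)} = 1 + 2e^{-E_s/(2N_0)} + e^{-E_s/N_0} = (1 + z)^2$.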
ex-ch07-11
Hard. Derive the first-order asymptotic expansion of $I^{\mathrm{GMI}}(s)$ around $s^\star = 1$ for Gray-QAM on AWGN at high SNR. Show that $I^{\mathrm{GMI}}(s) = I^{\mathrm{GMI}}(1) + \tfrac{1}{2} I_2 (s - 1)^2 + o\big( (s - 1)^2 \big)$ with $I_1 = 0$ (saddle-point at $s^\star = 1$) and $I_2 < 0$ (concavity). Interpret $|I_2|$ as a curvature of the GMI at $s^\star$.
Differentiate the GMI formula twice in $s$.
Use the asymptotic per-position-channel symmetry to force the first derivative to vanish at $s = 1$.
The second derivative is minus the variance of the log-metric ratio, which is negative.
First derivative
$\frac{d}{ds} I^{\mathrm{GMI}}(s) = \sum_\ell \Big( \mathbb{E}\big[ \log p_{W_\ell}(Y \mid B) \big] - \mathbb{E}_Y\big[ \mathbb{E}_{\tilde{B}_s}\big[ \log p_{W_\ell}(Y \mid B) \big] \big] \Big)$, where $\tilde{B}_s$ is the tilted bit law with weights proportional to $p_{W_\ell}(Y \mid b)^s$. At $s = 1$, the tilted law is the bit posterior, so the second term is a conditional-entropy-like average under the mixture distribution; for symmetric binary channels (the high-SNR Gray-QAM limit) the two expectations agree and the full first derivative reduces to $I_1 = 0$.
Second derivative
$\frac{d^2}{ds^2} I^{\mathrm{GMI}}(s) = -\sum_\ell \mathbb{E}_Y\Big[ \mathrm{Var}_{\tilde{B}_s}\big( \log p_{W_\ell}(Y \mid B) \big) \Big]$ under uniform input. This is a negative quantity (strictly, unless the per-position channel is noiseless or useless), giving $I_2 < 0$.
Interpretation
The curvature $|I_2|$ quantifies how sharply the GMI decays away from $s^\star$. At high SNR $|I_2|$ is small (the curve is nearly flat near the peak) and tuning $s$ gives negligible rate gain. At low SNR $|I_2|$ is larger and the peak is narrower, but also shifted away from $s = 1$, so tuning $s$ gains a bit of rate. This explains why the GMI scaling payoff is largest at low SNR.
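A Monte Carlo sketch of $s \mapsto I^{\mathrm{GMI}}(s)$ for 16-QAM Gray (constellation setup, SNR point, and sample size are assumptions), useful for seeing the flatness near $s = 1$ at high SNR:

```python
import numpy as np

rng = np.random.default_rng(0)
pam  = np.array([-3.0, -1.0, 1.0, 3.0])
gray = {-3.0: (0, 0), -1.0: (0, 1), 1.0: (1, 1), 3.0: (1, 0)}
X = np.array([(a + 1j * b) / np.sqrt(10.0) for a in pam for b in pam])
B = np.array([gray[a] + gray[b] for a in pam for b in pam])

def gmi(snr_db, s, n=100_000):
    """Estimate the BICM GMI (bits/symbol) at decoder scaling s."""
    N0 = 10 ** (-snr_db / 10)
    idx = rng.integers(0, 16, n)
    noise = np.sqrt(N0 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    w = np.exp(-np.abs((X[idx] + noise)[:, None] - X[None, :]) ** 2 / N0)
    total = 0.0
    for l in range(4):
        is1 = B[:, l] == 1
        s0, s1 = w[:, ~is1].sum(axis=1), w[:, is1].sum(axis=1)
        s_tx = np.where(B[idx, l] == 1, s1, s0)
        # per-level GMI term: E[ log2( p(y|b)^s / mean_b' p(y|b')^s ) ]
        total += np.mean(np.log2(s_tx ** s / (0.5 * (s0 ** s + s1 ** s))))
    return total

for s in (0.6, 0.8, 1.0, 1.2, 1.4):
    print(f"s={s:.1f}:  {gmi(18.0, s):.3f} bits/symbol")  # high SNR: flat near s = 1
```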
ex-ch07-12
Medium. Using the parallel-channel interpretation of §1, derive an expression for the BICM cutoff rate as a function of the per-position Bhattacharyya parameters $Z_\ell$. Compute the cutoff rate for 16-QAM Gray at 6 dB, using the two distinct per-level values $Z_1$ and $Z_2$ (approximate values; they can be obtained numerically, see the sketch below).
$R_0^{\mathrm{BICM}} = \sum_\ell \big[ 1 - \log_2(1 + Z_\ell) \big]$ from Exercise E (ex-ch07-06).
Formula
$R_0^{\mathrm{BICM}} = \sum_{\ell=1}^{m} \big[ 1 - \log_2(1 + Z_\ell) \big] = m - \sum_{\ell=1}^{m} \log_2(1 + Z_\ell)$.
Plug in
For 16-QAM Gray ($m = 4$): two bit positions share the stronger parameter $Z_1$, contributing $2\big[ 1 - \log_2(1 + Z_1) \big]$; the other two share the weaker $Z_2$, contributing $2\big[ 1 - \log_2(1 + Z_2) \big]$. Total: $R_0^{\mathrm{BICM}} = 4 - 2\log_2(1 + Z_1) - 2\log_2(1 + Z_2)$ bits/symbol.
Comparison
At 6 dB, plugging in the approximate $Z$ values gives a cutoff rate that is roughly 93% of the 16-QAM capacity at this SNR. At higher SNR this ratio drops (the cutoff rate saturates below $\log_2 M$ while the capacity continues to grow toward it).
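A Monte Carlo sketch for the per-level Bhattacharyya parameters at 6 dB (Gray mapping, sample size, and seed are assumptions; by symmetry the four levels collapse to two distinct values):

```python
import numpy as np

rng = np.random.default_rng(0)
pam  = np.array([-3.0, -1.0, 1.0, 3.0])
gray = {-3.0: (0, 0), -1.0: (0, 1), 1.0: (1, 1), 3.0: (1, 0)}
X = np.array([(a + 1j * b) / np.sqrt(10.0) for a in pam for b in pam])
B = np.array([gray[a] + gray[b] for a in pam for b in pam])

N0, n = 10 ** (-6 / 10), 200_000
Z = []
for l in range(4):
    sub0 = X[B[:, l] == 0]                       # transmit uniformly over X_0^l
    tx = sub0[rng.integers(0, 8, n)]
    y = tx + np.sqrt(N0 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    w = np.exp(-np.abs(y[:, None] - X[None, :]) ** 2 / N0)
    s0 = w[:, B[:, l] == 0].sum(axis=1)
    s1 = w[:, B[:, l] == 1].sum(axis=1)
    Z.append(np.mean(np.sqrt(s1 / s0)))          # Z_l = E_{y~p_l(.|0)} sqrt(p1/p0)
R0 = 4 - sum(np.log2(1 + z) for z in Z)
print("Z per level:", np.round(Z, 3), "  R0_bicm =", round(R0, 3), "bits/symbol")
```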
ex-ch07-13
Hard. Prove that for any memoryless channel and any labelling, the BICM GMI at any $s$ is bounded above by the CM capacity: $I^{\mathrm{GMI}}(s) \le I(X; Y) = C^{\mathrm{CM}}$. Hence $\sup_s I^{\mathrm{GMI}}(s) \le C^{\mathrm{CM}}$, i.e., the GMI framework does not recover the CM capacity, only a lower bound.
Use the data-processing inequality: any deterministic function of $Y$ has less mutual information with $X$ than $Y$ itself.
The BICM product metric is a deterministic function of the joint symbol likelihoods $\{ p(y \mid x') \}$.
GMI $\le$ matched MI
A fundamental result of mismatched decoding (Ganti-Lapidoth-Telatar 1999, Thm. 1) states that $I^{\mathrm{GMI}}_q \le I(X; Y)$ for any metric $q$. In our setting the input distribution is fixed to uniform on $\mathcal{X}$, so the bound becomes $\sup_s I^{\mathrm{GMI}}(s) \le C^{\mathrm{CM}}$.
Via data-processing
Alternatively: the product metric is a deterministic function of $Y$ (through the per-position likelihoods), so by the data-processing inequality any functional of the metric (including the log-metric ratios used by the decoder) carries at most as much information about $X$ as $Y$ itself; hence the achievable rate cannot exceed $I(X; Y) = C^{\mathrm{CM}}$.
Equality condition
Equality holds iff the product bit metric is proportional to the joint likelihood $p(y \mid x)$ up to a codeword-independent function of $y$. This is the Gray-QPSK case, and only that case, in our setting. For all other constellation-labelling pairs the GMI is strictly below $C^{\mathrm{CM}}$.
ex-ch07-14
Medium. State and sketch the random-coding exponent $E_r(R)$ as a function of $R$ for a BI-AWGN channel at 3 dB. Identify the critical rate $R_{\mathrm{crit}}$ (below which the exponent is the straight-line "expurgated" exponent) and the cutoff rate $R_0$.
The curve has two regimes: (i) a straight line of slope $-1$ for $R \le R_{\mathrm{crit}}$, (ii) a convex curve for $R_{\mathrm{crit}} \le R \le C$.
$R_{\mathrm{crit}}$ is where the two regimes meet; for BI-AWGN at 3 dB it is found numerically from the slope of $E_0$ at $\rho = 1$.
Two-regime structure
$E_r(R)$ is the max over $\rho \in [0, 1]$ of $E_0(\rho) - \rho R$. The optimiser $\rho^\star(R)$ decreases from $1$ (at small $R$) to $0$ (at $R = C$). Below the critical rate $R_{\mathrm{crit}} = \partial E_0/\partial \rho \big|_{\rho = 1}$ the optimum is $\rho^\star = 1$, and $E_r(R) = E_0(1) - R$: a straight line of slope $-1$. Above $R_{\mathrm{crit}}$, $\rho^\star(R) \in (0, 1)$ and $E_r$ is a convex function reaching $0$ at $R = C$.
Identify $R_0$
At $R = 0$, $E_r(0) = E_0(1) = R_0$. So $R_0$ is the value of the exponent at zero rate: the "exponential floor" of the exponent function.
Numerical sketch
For BI-AWGN at 3 dB: $Z = e^{-2} \approx 0.135$, $R_0 = 1 - \log_2(1 + Z) \approx 0.82$ bits, $C \approx 0.92$ bits. The straight-line part goes from $(0, R_0)$ down to $\big( R_{\mathrm{crit}},\, E_0(1) - R_{\mathrm{crit}} \big)$; the convex part continues from there down to $(C, 0)$. This is the standard two-regime Gallager exponent shape.
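A numerical sketch (simple grid quadrature; the grid limits and the finite-difference slopes are assumptions of this sketch):

```python
import numpy as np

snr = 10 ** (3 / 10)                                # 3 dB, Es = 1
sig = np.sqrt(1 / (2 * snr))                        # noise standard deviation
y   = np.linspace(-10, 10, 20001)
dy  = y[1] - y[0]

def p(y, x):
    """Gaussian likelihood p(y|x) with variance sig^2."""
    return np.exp(-(y - x) ** 2 / (2 * sig ** 2)) / (sig * np.sqrt(2 * np.pi))

def E0(rho):
    """Gallager function of BI-AWGN with uniform inputs, in bits."""
    g = 0.5 * p(y, +1) ** (1 / (1 + rho)) + 0.5 * p(y, -1) ** (1 / (1 + rho))
    return -np.log2(np.sum(g ** (1 + rho)) * dy)

R0     = E0(1.0)                                    # cutoff rate = E0(1)
C      = E0(1e-4) / 1e-4                            # capacity = slope at rho = 0
R_crit = (E0(1.0) - E0(1.0 - 1e-4)) / 1e-4          # critical rate = slope at rho = 1
print(f"R_crit = {R_crit:.3f}, R0 = {R0:.3f}, C = {C:.3f} bits")

# Trace Er(R) = max_rho [E0(rho) - rho*R] over a rho grid.
rhos = np.linspace(0.0, 1.0, 201)
E0s  = np.array([E0(r) for r in rhos])
for R in (0.2, 0.4, 0.6, 0.8):
    print(f"R = {R:.1f}: Er = {np.max(E0s - rhos * R):.4f}")
```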
ex-ch07-15
Hard. Modern 5G NR LDPC decoders operate within about 1 dB of the Shannon capacity. Given that BICM-QPSK at 3 dB has $C^{\mathrm{BICM}} \approx 1.83$ bits/symbol and $R_0^{\mathrm{BICM}} \approx 1.63$ bits/symbol (Example E, Cutoff Rate of BICM-QPSK at 3 dB), show that the 5G operating point exceeds $R_0^{\mathrm{BICM}}$ and explain why this is possible despite Gallager's cutoff-rate theorem.
At 3 dB, Shannon capacity is $\approx 0.79$ bits/dimension $= \log_2(1 + 10^{0.3}) \approx 1.58$ bits/symbol (but QPSK only uses 2 bits/symbol).
LDPC complexity is proportional to the number of edges in the factor graph, not to the number of codewords.
5G QPSK rates exceed $R_0^{\rm BICM}$
5G NR's LDPC code rates extend up to $948/1024 \approx 0.93$; used with QPSK this yields up to $\approx 1.85$ bits/symbol at 3 dB, exceeding $R_0^{\mathrm{BICM}} \approx 1.63$ bits/symbol by about $0.2$ bits/symbol: clearly above the Gallager cutoff.
Why LDPC+BP is in a different complexity class
Gallager's cutoff-rate theorem applies to sequential / maximum-metric decoders, those whose work per decoded bit grows with the number of codeword candidates examined. For a rate-$R$ code of blocklength $n$, that number scales as $2^{nR}$, and the expected work per bit diverges for $R > R_0$.
LDPC+BP is not in this complexity class. BP iterates over the edges of the factor graph. For a degree distribution fixed by the LDPC base graph, the number of edges is $O(n)$, independent of where the code rate sits relative to $R_0$. The complexity per iteration is therefore $O(n)$, and convergence to a valid codeword requires a bounded (or slowly growing) number of iterations in the threshold-scaling regime. Total work is near-linear in $n$ at any rate up to the density-evolution threshold, which is above $R_0^{\mathrm{BICM}}$ and within a fraction of a dB of capacity.
What this means for Chapter 5 and beyond
The cutoff rate is a historical milestone in BICM analysis: the operative rate limit when sequential / list decoders were the standard. Modern BICM (with LDPC+BP) breaks through this limit; the operational question has become "can we approach the BICM GMI / CTB capacity?", and the answer for standard-compliant BICM is yes, to within a fraction of a dB. The theoretical cleanup of §3–4 (GMI and random-coding exponent) remains the foundation, but the cutoff rate itself is now informative rather than binding.
ex-ch07-16
Hard. Using the commit_contribution block of this chapter as a guide, summarise in your own words the four key technical contributions of the Guillén-Martínez-Caire 2008 monograph: (i) BICM as mismatched decoding; (ii) decoder scaling $s$; (iii) the BICM error exponent; (iv) extensions to fading and MIMO.
See the commit_contribution block in §2.
Use the GMI as the unifying object.
Contribution (i): BICM is a mismatched decoder
The BICM receiver uses $q(x, y) = \prod_\ell p_{W_\ell}(y \mid b_\ell(x))$ instead of $p(y \mid x)$. This is a mismatched metric in the sense of Merhav-Kaplan-Lapidoth-Shamai 1994. The rate achievable by a random code with this mismatched decoder is the GMI, not the CM mutual information $I(X; Y)$.
Contribution (ii): decoder scaling $s$
The GMI is parameterised by a scaling $s > 0$; $I^{\mathrm{GMI}}(s^\star) = \sup_s I^{\mathrm{GMI}}(s)$ is the largest achievable rate. For Gray-QAM at high SNR, $s^\star = 1$ and $I^{\mathrm{GMI}}(1)$ equals the CTB capacity, elevating the CTB formula from heuristic to rigorous. At low SNR or for non-Gray labellings, $s^\star \ne 1$ and tuning $s$ gives a small rate boost.
Contribution (iii): BICM error exponent
Applying Gallager's random-coding machinery to the product metric yields a well-defined BICM exponent $E_r^{\mathrm{bicm}}(R)$, pointwise below the CM exponent $E_r^{\mathrm{cm}}(R)$. Under Gray labelling the gap is small; under set-partitioning labelling it is large. The exponent gap quantifies the blocklength cost of the mismatched metric.
Contribution (iv): fading and MIMO extensions
The GMI framework extends cleanly to block-fading channels (outage analysis reduces to per-symbol GMI) and to MIMO-BICM (the bit metric becomes a marginalised log-likelihood). Both are developed in subsequent chapters of the monograph (Chs. 6–8) and provide the theoretical toolkit for the MIMO / BICM-ID analyses of Chs. 8 and 10 of this book.
Unifying thread
All four contributions lean on the same mismatched-decoding scaffold: a product metric, a decoder scaling $s$, a GMI, and a Gallager-exponent refinement. This coherence is what makes the monograph the canonical reference for BICM.
ex-ch07-17
Medium. Prove the identity $R_0^{\mathrm{BICM}} = I^{\mathrm{GMI}}(s^\star) - \Delta$, where $\Delta$ is a certain "gap term", or show that this identity is generally false and explain the correct relation between $R_0^{\mathrm{BICM}}$ and the GMI.
$I^{\mathrm{GMI}} = \partial E_0/\partial \rho \big|_{\rho = 0}$, while $R_0 = E_0(1)$. These are different point-evaluations of $E_0$.
The identity is NOT generally true; $R_0$ and $I^{\mathrm{GMI}}$ are related through $E_0$'s concavity, not a direct subtraction.
The claim is false in general
The GMI is $\partial E_0/\partial \rho \big|_{\rho = 0}$ (the initial slope of $E_0$) while the cutoff rate is $E_0(1)$ (the value of $E_0$ at $\rho = 1$). These two are not related by a simple subtraction; they are two different functionals of the whole curve $\rho \mapsto E_0(\rho)$.
Correct relation
By concavity of $E_0(\rho)$ with $E_0(0) = 0$: $R_0 = E_0(1) = \int_0^1 \frac{\partial E_0}{\partial \rho}\, d\rho \le \frac{\partial E_0}{\partial \rho}\Big|_{\rho = 0} = I^{\mathrm{GMI}}$ (integrating the derivative). Equality holds only if $\partial E_0/\partial \rho$ is constant in $\rho$, i.e., $E_0$ is linear: the degenerate zero-SNR case. At positive SNR, $R_0 < I^{\mathrm{GMI}}$ strictly.
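A quick numerical illustration of this relation for a matched binary-input AWGN example at 0 dB (grid quadrature and finite differences are assumptions of this sketch; concavity makes the chord slopes $E_0(\rho)/\rho$ decrease in $\rho$, from the GMI at $\rho \to 0$ down to $R_0$ at $\rho = 1$):

```python
import numpy as np

snr = 1.0                                  # 0 dB
sig = np.sqrt(1 / (2 * snr))
y  = np.linspace(-12, 12, 24001)
dy = y[1] - y[0]

def p(y, x):
    return np.exp(-(y - x) ** 2 / (2 * sig ** 2)) / (sig * np.sqrt(2 * np.pi))

def E0(rho):
    g = 0.5 * p(y, +1) ** (1 / (1 + rho)) + 0.5 * p(y, -1) ** (1 / (1 + rho))
    return -np.log2(np.sum(g ** (1 + rho)) * dy)

I_gmi = E0(1e-4) / 1e-4                    # initial slope of E0 = GMI (capacity here)
R0    = E0(1.0)                            # value at rho = 1 = cutoff rate
print(f"I_gmi = {I_gmi:.4f} bits, R0 = {R0:.4f} bits, gap = {I_gmi - R0:.4f}")
for rho in (0.25, 0.5, 0.75, 1.0):
    print(f"E0({rho:.2f})/rho = {E0(rho) / rho:.4f}")   # decreasing in rho
# expected endpoints: I_gmi ~ 0.721, R0 ~ 0.548 (matching Exercise ex-ch07-04)
```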
Size of the gap
The gap grows with SNR over the operating range: at low SNR it is a small fraction of the capacity; at moderate SNR it is roughly 20%; at high SNR (before both rates saturate at $\log_2 M$) it grows to 30–40%. There is no closed-form subtraction identity; the gap depends non-trivially on the channel through the whole curve $E_0(\rho)$.