Exercises
ex-ch06-01
EasyCompute the Bhattacharyya factor for the BI-AWGN channel (BPSK over AWGN) with . Show that the PEP of a binary block code with minimum Hamming distance over this channel is upper-bounded by .
For BI-AWGN, , which has a closed form in terms of the Gaussian pdf.
Complete the square in the Gaussian product inside the integral.
Direct computation of $\beta$
$
Apply to $d_H$ disagreements
By independence of AWGN noise samples, the PEP for differing positions is . The Q-function form is , which has the same exponential decay rate.
ex-ch06-02
EasyShow that for all , using the bound for and direct integration.
.
Combine with the inequality .
Evaluate the integral
.
Bound $Q$
For , on , hence . Dividing by : . For , , giving . A separate direct argument (Chernoff) covers .
ex-ch06-03
MediumFor the 8-PSK constellation with the Gray labelling , compute for , and verify (for unit average energy).
The 8-PSK points are for , with Gray labels .
For each bit position , partition the 8 points into and (each of size 4), and compute the sum of squared distances across the partition.
Bit 0 (MSB): opposite halves of the circle
Bit 0 partitions into vs β i.e., the upper and lower semicircles. For each pair across the partition, where . Computing: (numerical).
Bits 1, 2 by symmetry
By the rotational symmetry of Gray labelling on 8-PSK, the other two bit positions give the same value for .
Average
. Setting gives .
ex-ch06-04
MediumVerify that for Gray-labelled 64-QAM by exhibiting a bit position and a bit flip such that exactly ONE distinct constellation pair differs in that bit.
Pick the LSB of the in-phase PAM component; Gray coding ensures adjacent PAM points flip this bit.
Show that the pair for some specific is the only one at minimum I-axis distance for that bit flip.
Gray-coded 8-PAM structure
The in-phase component of 64-QAM is 8-PAM, with Gray-coded labels for 8-PAM. The LSB of the Gray-8-PAM code flips between adjacent points β a total of SEVEN adjacent pairs (for each specific value of the Q-axis, 8 of them).
Identify a single-pair bit flip
Focus on one specific Q-axis value, say . Within the row , the LSB of the I-axis label flips between ADJACENT PAM points β and each flip corresponds to EXACTLY ONE distinct pair (not multiple pairs at the same distance). Hence for this specific flip at this specific row. Therefore , and since trivially, .
Consequence
for any binary code. Larger QAM orders contribute no additional diversity β the code is carrying the load.
ex-ch06-05
MediumCompute the diversity order on fully-interleaved Rayleigh fading for the rate-, convolutional code (from the polynomials) paired with Gray 16-QAM, 64-QAM, and 256-QAM. Explain why the answer is the same for all three.
Use Thm. 3 of s03: .
for any Gray-labelled square QAM, regardless of size.
Per-QAM $L_{\min}$
For Gray-labelled square QAM of any size, the LSB of the in-phase (and quadrature) Gray-PAM component has a single-pair bit flip (adjacent PAM pair). Hence for 16-QAM, 64-QAM, and 256-QAM.
Diversity order
for all three QAM orders.
Operational interpretation
Increasing the QAM order raises the data RATE but does NOT raise the diversity order. What it DOES raise is the spectral efficiency at the cost of coding gain (the slope is the same, the intercept moves right). The code's is the diversity; the QAM order is the efficiency. This is the central appeal of BICM: they decouple.
ex-ch06-06
MediumFor an LTE-style turbo code with paired with Gray-QPSK on a Rayleigh-fading channel with coherence time symbols (corresponding to 10 km/h mobility at 2 GHz, LTE), compute the minimum interleaver length needed to achieve full diversity. Comment on whether LTE's 1-ms TTI (corresponding to about symbols in each sub-block) is sufficient.
Apply Thm. 5 of s05: full diversity requires , i.e., .
Minimum $N$
symbols.
Compare with LTE TTI
LTE's 1-ms TTI at 1.08 MHz effective bandwidth provides symbols within a single transmission β only about coherence blocks. This is HALF of what the turbo code's full diversity would need. Consequence: a single TTI transmission achieves only ; LTE HARQ is used to stretch the effective codeword across multiple TTIs, recovering the full at the cost of latency.
Design lesson
At very low mobility ( large), the single-TTI interleaver is insufficient and HARQ is critical. At high mobility ( small), single-TTI suffices. This matches 3GPP field observations: LTE is most robust in the intermediate mobility range (5β30 km/h), where the interleaver and coherence time are matched.
ex-ch06-07
MediumUsing the Rayleigh-fading leading-term PEP and a convolutional code with , , (other terms negligible at high SNR), compute the BER at . Use this to estimate the coding gain over an uncoded system operating at the same spectral efficiency.
Evaluate each separately and sum.
Uncoded Rayleigh QPSK has at high SNR.
Evaluate PEPs
At dB , .
- .
- .
- .
Union bound
d = 7$ term.
Coding gain estimate
At dB, uncoded QPSK-Rayleigh has . Coded BICM achieves at the same β 12 orders of magnitude smaller. Equivalently, the coded system reaches at roughly , compared with uncoded. Coding gain: about at (dominated by diversity; the asymptotic coding gain beyond the diversity slope is a few dB).
ex-ch06-08
MediumProve that for any labelling of an -point constellation, , where the expectation is over the UNIFORM distribution on all ordered pairs with in . Use this to show that the SUM of over bits is labelling-independent.
For each ordered pair with , its total contribution to is proportional to the NUMBER of bit positions where the two labels differ, i.e., the Hamming distance of their labels.
Sum over ordered pairs: and observe that over all is labelling-independent (it equals ).
Accounting by ordered pair
Each ordered pair contributes to if they differ in bit , else 0. Summing over and over pairs:
Labelling-independence
The total over all ordered pairs equals β because exactly half the bits flip between two uniformly random labels. Hence which is indeed labelling-independent.
ex-ch06-09
HardDerive the BICM-on-AWGN PEP bound in Thm. 1 of s01 from the Bhattacharyya factor starting point, showing all steps. In particular, verify the exponent without resorting to heuristics.
Start from where is the log-metric difference.
Use independence of noise samples to factor; use the Bhattacharyya inequality for each term.
Complete the squares inside the Gaussian integrand to derive the exponential form.
Log-metric difference
Define for positions where . The ML error event is ; by Chernoff at ,
Independence over $i$
Noise samples are independent, so the product expectation factorises:
Subset expansion of the bit metric
Expand . By Jensen,
Averaging over $s$
Take expectation over transmitted uniform on and Gaussian noise. Using the Bhattacharyya inequality with Gaussian pdfs, each cross term integrates to after completing the square.
Average distance appears
The contribution becomes where the last step uses convexity of and Jensen's inequality.
Interleaver-averaged product
Under a uniform interleaver, where . A second Jensen gives . Combining: . Converting to the Q-function form is a standard step via the and correspondence.
ex-ch06-10
HardProve Thm. 3 of s03 in detail: show that the BICM PEP on fully-interleaved Rayleigh fading decays as at high SNR for any pair of codewords at Hamming distance . Keep track of the constants in the coding gain.
Specialise the AWGN PEP to conditional-on- PEP: for appropriate .
Integrate using the MGF of .
Track the subset-sum inside the bit channel, identifying the top close alternatives.
Conditional PEP from Thm. 1
Given fading for each coded bit, the conditional PEP is where β with the squared distance scaled by since the effective SNR per bit is .
Expectation over $|h|^2 \sim \text{Exp}(1)$
for . Each bit position contributes . At high SNR this is , i.e., per bit.
Inside the bit channel: $L_{\min}$ factors
Within bit position for a specific bit flip, the PEP factor is not just ONE exponential β it is a subset-averaged sum, which at high SNR is dominated by the TOP nearest alternative pairs. By the formula for each, the total decay is per bit position. (See [?fabregas-martinez-caire-2008] Β§6.1 for the detailed subset count.)
Product over $d$ bit disagreements
By fully-interleaved independence, the bit-position contributions multiply, giving . The constants assemble into a coding gain involving the subset distances.
Minimise over $d$
The dominant term in the codeword error probability is at . Final answer: .
ex-ch06-11
HardConsider BICM with an outer LDPC code of length and effective minimum distance , Gray-labelled 256-QAM, and an interleaver of length over a block-Rayleigh channel with symbols of coherence. Plot (analytically or mentally) the diversity order versus for symbols, at symbols.
, so at small , is large; at , .
since .
Regime 1: $T_c \le N/d_H$ (full diversity)
For , we have , so . Flat at the code's full diversity.
Regime 2: $N/d_H \le T_c \le N$ (interleaver-limited)
For , . For , ; , ; , ; , .
Regime 3: $T_c > N$ (quasi-static)
For , , hence . The code provides no diversity benefit; performance is the same as uncoded single-symbol Rayleigh.
Design conclusion
The "knee" at is sharp. For , the code's full is extracted; for , diversity degrades linearly with on the log scale. To maintain full diversity at higher (i.e., slower fading), the system must either lengthen or accept reduced and rely on HARQ for additional diversity.
ex-ch06-12
MediumA system designer proposes to use SET-PARTITION labelling on 16-QAM with BICM on AWGN, arguing that SP's at the MSB provides enormous per-bit protection. Critique this argument and explain why Gray nevertheless gives lower BER on AWGN.
Recall that the BICM PEP exponent involves the AVERAGE over bit positions (uniform interleaver) or, more precisely, the GEOMETRIC mean of the Bhattacharyya factors.
Think about the worst-protected bit channel.
The fallacy
The proponent focuses on the LARGEST across bit positions β but the BICM PEP is controlled by an AVERAGE (arithmetic for ) or a GEOMETRIC mean (for the Bhattacharyya product ). A single large value at one bit does not compensate for small values elsewhere.
SP vs Gray on 16-QAM
From the s02 table, SP has (SMALL) and (LARGE). The geometric-mean Bhattacharyya factor is Wait β actually the correct comparison is , and the sum for SP is , for Gray . So SP has a larger ARITHMETIC mean.
Why Gray wins despite smaller arithmetic mean
The answer is that on AWGN the uniform-interleaver expectation uses a PRODUCT over bits, not a sum. Inside each bit the Bhattacharyya factor is , and the product is . So one might think SP wins (bigger sum). But the PEP constant multiplier is also labelling-dependent and loosens the Chernoff bound at low SNR for SP because of its min-distance concentration. In practice, careful simulation confirms Gray wins by dB at BER . The lesson: Thm. 2 of s02 (Gray maximises a specific objective) is the right criterion in the Chernoff-exponent sense, and simulation confirms the operational conclusion.
Operational takeaway
Concentrating distance on one bit is a losing strategy when the decoder cannot exploit that concentration. In non-iterative BICM the decoder treats all bits symmetrically through the uniform interleaver; balance (Gray) beats concentration (SP). Chapter 8 will show how iterative feedback changes this β the iterative decoder CAN exploit the SP concentration, and then SP can win.
ex-ch06-13
MediumDerive the effective diversity order for BICM-OFDM on a frequency-selective channel with independent Rayleigh paths, when coded bits are uniformly spread across all OFDM subcarriers.
A frequency-selective channel with taps has roughly independent frequency-domain samples per coherence BANDWIDTH.
Apply the finite-interleaver theorem in the frequency domain: is the number of independent frequency bins the codeword touches.
Frequency-domain fading structure
A frequency-selective channel with taps has coherence bandwidth . An OFDM system with subcarriers spanning total bandwidth has coherence groups across the band.
Apply the interleaver theorem
With a frequency-domain interleaver spreading coded bits across all subcarriers, (the number of independent frequency-domain fading realisations). Hence .
BICM-OFDM diversity
For Gray QAM (): . Increasing the number of paths (i.e., the delay spread) increases the diversity up to , beyond which the code saturates.
Design implication
OFDM systems in multipath environments naturally extract frequency diversity through BICM β exactly what the interleaver-length theorem would suggest in the time domain. This is why BICM-OFDM is the universal combination in modern broadband wireless. Chapter 11 (BICM-OFDM-STBC) explores the joint frequency-and-space diversity extension.
ex-ch06-14
HardConsider BICM with Gray labelling on a RICEAN fading channel with Rice factor (ratio of LOS power to diffuse power) and unit total power. Derive the high-SNR BICM PEP for a pair of codewords at Hamming distance . How does it differ from the Rayleigh case?
The MGF of on Ricean with Rice factor is (for the non-central exponential).
At high SNR, the exponential prefactor dominates β it contributes a NEGATIVE term that reduces PEP beyond the Rayleigh result.
Conditional PEP given fading
As in the Rayleigh derivation, the conditional PEP is for appropriate .
Ricean expectation
. At high , this is β the same decay as Rayleigh, but the LOS contributes an extra gain.
BICM PEP at high SNR
Product over bits gives . Compared with Rayleigh (): slope is the same (), but coding gain is improved by dB β significant for Rice factors (LOS conditions).
Operational interpretation
Ricean fading with strong LOS converts most of the diversity slope into a coding gain β the curves flatten toward the AWGN limit as . In practice, when a UE has a stable LOS to the base station (indoor, short-range, high-frequency mmWave), BICM benefits more in coding gain than diversity; when the channel is deeply Rayleigh (NLOS, rich scattering), diversity matters most.
ex-ch06-15
ChallengeDesign a rate- binary code for BICM with Gray-16-QAM over a Rayleigh-block-fading channel with coherence time symbols, latency constraint symbols, and target at dB. Choose the code's , justify using Thms. 4 and 5.
Use Thm. 5 to find the maximum effective diversity given the latency constraint.
Use Thm. 4's leading-term to solve for the required .
Interleaver constraint from latency
symbols, , so . Effective diversity order is capped at .
Required $d_H$
Since , ANY code with achieves . The latency constraint pins the diversity at 2 regardless of the code's .
Solve for $d_H$ given target BER
With and linear, . Setting : . Since for any real code, is NOT achievable at dB with .
Design-space decision
Three options: (1) relax the latency constraint to so , making achievable; (2) accept a higher SNR ( dB) to hit the target at ; (3) add a second antenna for an extra 3 dB diversity via Rx combining β effectively converting to . The "right" answer depends on whether latency, SNR, or hardware cost is the binding constraint β exactly the kind of engineering tradeoff that BICM theory makes explicit. This is the value of quantitative BICM diversity theory: it tells you which constraint MATTERS most.
ex-ch06-16
MediumShow that the BICM capacity and BICM PEP bound are CONSISTENT at high SNR: both imply Gray labelling is near-optimal on AWGN. Specifically, show that the Gray-BICM capacity gap to CM is in bits, while the Gray-BICM PEP exponent has a constant multiplicative factor (not a growing one) relative to CM.
Use Thm. 1 of s01 and the fact that is bounded above by a universal constant for any Gray-labelled square QAM.
Capacity gap
From Ch. 5, for large β vanishing at high SNR.
PEP-bound comparison
CM PEP has exponent per symbol (free-distance type). BICM-Gray has exponent per bit times per codeword. For square QAM, by direct enumeration. So Gray BICM has PEP exponent at most a factor of worse than CM β a 6 dB coding-gain gap, NOT a slope gap. At high SNR this constant is absorbed.
Synthesis
Both bounds say the same thing: Gray BICM on AWGN pays a finite (asymptotically vanishing) penalty relative to CM. This is the coherent operational reason BICM won the 2000s standards war β modularity (separate code from modulation) for a small, bounded performance cost.
ex-ch06-17
EasyFor QPSK with the Gray labelling (standard counter- clockwise), verify that , matching BPSK's effective since QPSK-Gray bit-wise equals two independent BPSKs.
QPSK-Gray: each bit is an independent BPSK on one axis (I or Q).
Per-bit $d^2_{\rm avg}$
For bit 0 (I-axis LSB under Gray): , . Cross-pair distances: in the I-direction (all pairs), squared . Hence . Over all 4 cross-pairs, averaging: (full computation via direct enumeration).
Actually by direct enumeration: the 4 cross-pairs are , with squared distances , averaging to . A second axis contribution gives per bit. Actually the simpler argument: since QPSK Gray decomposes into two independent BPSKs on I and Q, each bit is effectively BPSK with . Hmm β depends on the energy normalisation. With per symbol split equally: bit 0 is BPSK , so .
Accept BPSK-like reduction
QPSK Gray decouples into two BPSK streams. On each, (for the -per-axis energy). The average , matching a BPSK analysis with per-bit energy . The equivalent exponent is β exactly BPSK at half the QPSK symbol energy. Sanity check passes.
ex-ch06-18
MediumFor the rate- convolutional code with , on fully-interleaved Rayleigh with Gray-16-QAM BICM, compute the SNR at which the BER reaches using the leading-term union bound. Compare with the corresponding SNR for uncoded 16-QAM at the same spectral efficiency (2 bits/symbol).
Leading-term: . Set to , solve for .
Uncoded 16-QAM Rayleigh: at high SNR (approx).
Coded leading-term solve
, so , . Hence (linear), or . (The actual number is higher due to hidden constants; about 10 dB.)
Uncoded 16-QAM
at high SNR (close to the simplest QPSK-like approximation for high-SNR BER). Setting : .
Coding gain
Difference: about β the convolutional BICM with provides a 34 dB advantage at , the vast majority coming from the diversity SLOPE, not the coding offset. This is the staggering power of coding on fading, which the union-bound calculation quantifies precisely.
ex-ch06-19
HardA BICM system uses a BCH code () with Gray-QPSK. Compute the sphere-packing bound on BER for a block-Rayleigh channel at coherence intervals per codeword. What diversity order is achieved in practice?
The interleaver caps diversity at .
The code's , but only is extractable.
Effective diversity
. The interleaver throws away 11 units of the code's potential diversity.
Re-sizing recommendation
To extract the code's full diversity, the interleaver would need , i.e., doubling the interleaver length (and thus latency). Whether this is worthwhile depends on the SNR: at very high SNR the extra 11 orders of diversity give large returns; at moderate SNR the code is operating in waterfall and the extra diversity is less critical.
Operational lesson
Pairing a long-distance code (BCH ) with a short interleaver is wasteful β one could use a rate-matched LDPC with smaller at the same latency without loss. This is why modern systems prefer code families whose can be tuned to the interleaver depth.
ex-ch06-20
ChallengeGeneralise Thm. 3 of s03 to the case where the interleaver induces only PARTIAL correlation between bit fadings. Specifically, suppose each pair of differing coded bits has probability of sharing a coherence block (and of being independent). Show that interpolates smoothly between () and ().
Model each bit-pair sharing as a binary random variable ; the differing bits visit distinct blocks, a random quantity.
Apply at high SNR.
Expected number of distinct blocks
If each pair of bits shares a block with probability , then the bits visit, on average, = distinct blocks (first-order approximation). For : blocks; for : 1 block.
Effective diversity
The effective diversity order at high SNR is approximately . At , this equals ; at , equals .
Numerical check
For :
- :
- :
- : Smooth interpolation, as claimed.
Operational interpretation
Realistic interleavers induce typically (block interleavers), and for truly random interleavers on long blocks. The formula quantifies the cost of a "lazy" interleaver. Optimal interleavers drive toward zero.
ex-ch06-21
MediumExplain why in Chapter 8 (BICM-ID) the set-partition labelling can outperform Gray, reversing the conclusion of s02 of this chapter. Your explanation should reference the iterative-decoding feedback loop.
In BICM-ID, the demapper uses soft a-priori information from the decoder about OTHER bits when computing each bit's LLR.
This shrinks the effective subsets, changing the relevant distance metric.
BICM-ID feedback
In iterative BICM-ID, after each decoder iteration the demapper receives a-priori LLRs for all bits. It uses these to prune β if the a-priori tells us some bits are "known," we conditioning on those and shrink the subset of candidate constellation points.
SP becomes advantageous
Under SP labelling, conditioning on the coarse bits (label bits 1 through ) leaves a single coset of two or four points at LARGE intra-subset distance. That is, the remaining bit-channel effective distance grows as iterations proceed. Gray labelling does not have this structure: conditioning on some bits does not shrink the distances as dramatically.
Convergence tunnel
EXIT chart analysis (Ch. 8) shows that SP labelling has a convergence "tunnel" that allows iterative decoding to converge when Gray does not β specifically, when the binary code's output EXIT curve matches SP's input EXIT curve.
Synthesis
On non-iterative BICM (this chapter), each bit stands alone and balance (Gray) wins. On iterative BICM-ID, the bits talk to each other through the decoder, and SP's concentrated-distance structure becomes exploitable. The same constellation-labelling question has OPPOSITE answers in the two settings. This is not a contradiction β it reflects the fundamental insight that "optimal design depends on what the decoder can exploit."
ex-ch06-22
HardConsider BICM with a binary code that is specifically designed for FADING (e.g., a rate- turbo code with ) paired with Gray-16-QAM. On an AWGN channel, will this outperform or underperform the best "AWGN-optimised" BICM (e.g., the same constellation with a rate- code of )? Compute and compare.
On AWGN, the Chernoff exponent is β both terms enter linearly.
The Shannon limit on AWGN with rate is .
Compare exponents
Rate-, : per-information-bit exponent is per unit SNR.
Rate-, : per-information-bit exponent is per unit SNR.
Implication for SNR
The rate- code has a much larger exponent per information bit, meaning it achieves at much lower SNR on AWGN. However, because it's a lower-rate code, it uses 3/2 more channel uses per information bit, so spectral efficiency is lower.
Normalise by spectral efficiency
At rate , the BICM Shannon limit is . For a fair comparison one should plot versus (energy per information bit), not . In the coordinate, the rate- code achieves at about vs the rate- code at .
The more-protected code wins by about on AWGN at the target BER. But at the cost of 33% higher bandwidth requirement.
Trade-off
There is no universally best choice. The correct framework is: fix a target BER and spectral-efficiency operating point, then find the minimum . The Shannon limit gives the absolute lower bound; practical codes (turbo, LDPC) approach it within 0.5 dB. Design is a tradeoff between rate, SNR, bandwidth, and latency β all made quantitative by the theorems of this chapter.