Exercises
ex-ch03-01
Easy: Write out the partition chain for 16-QAM under the Ungerboeck rule and compute the intra-subset squared distance $d_i^2$ at each level (normalise so that $E_s = 1$).
Each Ungerboeck partition step doubles the squared intra-subset distance.
For 16-QAM, $d_0^2 = 6/15 = 0.4$ at unit average energy (the nearest-neighbour spacing of a standard $4\times 4$ grid).
Level 0: full 16-QAM
At unit $E_s$ the nearest-neighbour squared distance of square $M$-QAM is $d_0^2 = 6/(M-1)$. For $M = 16$, $d_0^2 = 6/15 = 0.4$.
Doubling rule
Each Ungerboeck partition step on square QAM doubles the squared distance (checkerboard sublattice). So $d_1^2 = 0.8$, $d_2^2 = 1.6$, $d_3^2 = 3.2$.
Verification
At level 4 each coset contains a single point – no further splitting is possible. The rate-allocation story will use $d_3^2 = 3.2$, i.e. an antipodal (BPSK-like) channel at the bottom level.
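The chain can be checked numerically; a minimal sketch, assuming the standard normalisation $d_0^2 = 6/(M-1)$ for square $M$-QAM at $E_s = 1$ and the doubling rule:

```python
# Minimal sketch: intra-subset squared distances along the Ungerboeck
# partition chain of square M-QAM at unit average energy.  Assumes
# d0^2 = 6/(M-1) (standard grid normalisation) and the doubling rule.
def partition_distances(M):
    d2 = 6 / (M - 1)          # nearest-neighbour squared distance at Es = 1
    out = []
    while M > 1:              # one partition step per label bit
        out.append(d2)
        d2 *= 2
        M //= 2
    return out

print(partition_distances(16))  # [0.4, 0.8, 1.6, 3.2]
```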
ex-ch03-02
Easy: Confirm that the total MLC rate at the capacity allocation is $\sum_i R_i = I(X;Y)$. Give the one-line derivation from the chain rule.
Apply the chain rule to $I(B_0,\dots,B_{m-1}; Y)$.
Use the bijection $X = \mu(B_0,\dots,B_{m-1})$ under partition-based labelling.
Chain rule and bijection
By the chain rule, $I(B_0,\dots,B_{m-1}; Y) = \sum_i I(B_i; Y \mid B_{<i})$. Under partition-based labelling the map $\mu$ is a bijection, so $I(B_0,\dots,B_{m-1}; Y) = I(X;Y)$.
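The one-line derivation can be sanity-checked on a toy discrete channel; a sketch (the channel and its crossover probability are illustrative, not from the text):

```python
from math import log2

# Toy check of the chain rule I(X;Y) = I(B0;Y) + I(B1;Y | B0) for the
# bijection X = 2*B0 + B1, on a hypothetical discrete channel where Y = X
# with probability 0.9 and is one of the other three symbols otherwise.
pxy = {(x, y): 0.25 * (0.9 if x == y else 0.1 / 3)
       for x in range(4) for y in range(4)}

def mi(joint):
    """I(A;B) from a joint pmf {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * log2(p / (pa[a] * pb[b])) for (a, b), p in joint.items() if p > 0)

I_xy = mi(pxy)

pb0y = {}
for (x, y), p in pxy.items():                       # marginalise out B1 = x & 1
    pb0y[(x >> 1, y)] = pb0y.get((x >> 1, y), 0.0) + p
I_b0 = mi(pb0y)

I_b1_cond = sum(                                    # average conditional MI over B0
    0.5 * mi({(x & 1, y): pxy[(x, y)] / 0.5
              for (x, y) in pxy if x >> 1 == b0})
    for b0 in (0, 1))

print(round(I_xy, 6), round(I_b0 + I_b1_cond, 6))   # equal by the chain rule
```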
ex-ch03-03
Medium: For 8-PSK at the stated SNR, estimate each of $C_0, C_1, C_2$ from the interactive plot in s02 and give the capacity-rule rate allocation. What is the total rate, and how far below Shannon is it?
Read the plot at the stated SNR.
Compute $\log_2(1 + \mathrm{snr})$ bits for the Shannon capacity.
Read off the plot
At the stated SNR, read the three level capacities $C_0$, $C_1$, $C_2$ off the plot. The effective per-level SNRs scale with the intra-level squared distances $d_0^2 = 2-\sqrt{2}$, $d_1^2 = 2$, $d_2^2 = 4$, so $C_0 < C_1 < C_2$.
Allocation and total
The capacity-rule allocation is $R_i = C_i$, totalling $C_0 + C_1 + C_2$ bits/symbol.
Gap to Shannon
Shannon gives $\log_2(1+\mathrm{snr})$ bits. The 8-PSK MLC/MSD capacity falls only slightly below this at the stated SNR – the modulation-capacity loss of 8-PSK is very small there, confirming that MLC has essentially reached the Shannon limit.
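The plot readings can be cross-checked in code; a Monte Carlo sketch, assuming the Ungerboeck index labelling $k = b_0 + 2b_1 + 4b_2$ on $s_k = e^{j2\pi k/8}$, unit $E_s$, and $\mathrm{snr} = E_s/N_0$ for complex AWGN:

```python
import cmath, math, random

# Monte Carlo estimate of the per-level binary capacities
# C_i = I(B_i; Y | B_{<i}) for 8-PSK under the partition labelling
# k = b0 + 2*b1 + 4*b2 on points s_k = exp(j*2*pi*k/8).  Accuracy is
# roughly +/- 0.01 bit at the default sample size.
random.seed(1)
PTS = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]

def level_capacities(snr_db, n=60000):
    sigma = math.sqrt(1 / (2 * 10 ** (snr_db / 10)))   # per-dimension noise std
    caps = [0.0, 0.0, 0.0]
    for _ in range(n):
        k = random.randrange(8)
        y = PTS[k] + complex(random.gauss(0, sigma), random.gauss(0, sigma))
        lik = [math.exp(-abs(y - p) ** 2 / (2 * sigma ** 2)) for p in PTS]
        for i in range(3):
            mask = (1 << i) - 1                        # bits below level i are known
            sub = [j for j in range(8) if j & mask == k & mask]
            num = sum(lik[j] for j in sub if (j >> i) & 1 == (k >> i) & 1)
            caps[i] += math.log2(2 * num / sum(lik[j] for j in sub))
    return [c / n for c in caps]

c = level_capacities(10.0)
print([round(x, 2) for x in c], round(sum(c), 2))      # C0 <= C1 <= C2, sum ~ I(X;Y)
```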
ex-ch03-04
Medium: Show that $C_{\mathrm{CM}} - C_{\mathrm{BICM}} = \sum_i I(B_i; B_{<i} \mid Y)$, where $B_{<i} = (B_0,\dots,B_{i-1})$. Interpret the term on the right.
Use the definitions $C_{\mathrm{CM}} = I(X;Y)$ and $C_{\mathrm{BICM}} = \sum_i I(B_i;Y)$.
Apply the chain rule to $I(B_i; Y, B_{<i})$ in both orders.
Use the independence of the a priori label bits: $I(B_i; B_{<i}) = 0$.
Decompose each term
The chain rule gives $I(B_i; Y, B_{<i}) = I(B_i;Y) + I(B_i; B_{<i} \mid Y)$. Also $I(B_i; Y, B_{<i}) = I(B_i; B_{<i}) + I(B_i; Y \mid B_{<i})$. Equating and using a priori independence $I(B_i; B_{<i}) = 0$, we get $I(B_i; Y \mid B_{<i}) - I(B_i; Y) = I(B_i; B_{<i} \mid Y)$.
Sum over levels
Summing from $i = 0$ to $m-1$ (the $i = 0$ term has empty $B_{<i}$ and both sides agree) gives $I(X;Y) - \sum_i I(B_i;Y) = \sum_i I(B_i; B_{<i} \mid Y)$.
Interpretation
The gap is the total information that previous label bits carry about the current label bit given the channel output – that is, the a posteriori correlation between label bits that BICM throws away by treating each bit position as independent. Gray labelling makes this correlation small; partition-based labelling makes it large (which is why MLC/MSD exploits it).
ex-ch03-05
Medium: In MSD with partition-based labelling of 8-PSK, assume the stage-0 code operates with a nonzero residual BER (pessimistic case). Using the error-propagation bound of Thm thm-msd-error-propagation, give an upper bound on the stage-1 BER in terms of the genie BER and the stage-0 BER. At what stage-0 BER does the propagation term begin to dominate the genie term?
Apply the bound of Thm thm-msd-error-propagation directly.
Set the two terms equal and solve for the stage-0 BER.
Apply the bound
At the stated pessimistic stage-0 BER, the propagation term of the bound dominates the genie term by two orders of magnitude.
Find the crossover
The propagation term dominates once the stage-0 BER exceeds the crossover value obtained by equating the two terms. So as long as the stage-0 decoder achieves a BER below that crossover, propagation is negligible at this hypothetical operating point. In practice, at the capacity-rule operating point the stage-0 BER is driven far lower by the outer code, so propagation is essentially absent.
ex-ch03-06
Easy: A designer builds an MLC system for 16-QAM at a stated target aggregate rate in bits/symbol. From the plots in s02, read off approximately how this rate should be split across the four levels when operating at the capacity threshold.
Compute the four $C_i$ values at the SNR where their sum equals the target rate.
Match each level rate $R_i$ to its corresponding $C_i$.
Find the target SNR
From the plot, find the SNR at which the four-level sum $\sum_i C_i$ first reaches the target rate.
Read the per-level capacities
At that SNR, read the four binary sub-channel capacities $C_0, C_1, C_2, C_3$ off the plot; their sum matches the target rate by construction.
Allocation
Set $R_i = C_i$ at that SNR. The level-0 binary code has the lowest rate; the level-3 code is essentially uncoded. This asymmetric split is characteristic of the capacity rule for QAM.
ex-ch03-07
Hard: Suppose $B_0$ and $B_1$ are independent Bernoulli(1/2) and $X = \mu(B_0, B_1)$ is a 4-PAM point under a specific (non-Gray) permutation of labels. Compute $I(B_0;Y)$, $I(B_1;Y)$, and $I(B_1;Y \mid B_0)$ numerically for a given SNR and compare the sums $I(B_0;Y) + I(B_1;Y)$ (BICM) and $I(B_0;Y) + I(B_1;Y \mid B_0)$ (MLC).
Compute each mutual information numerically by integrating over $y$.
The four points are at the 4-PAM amplitudes $\{\pm 1, \pm 3\}$ (up to normalisation); identify the mapping from $(B_0, B_1)$.
Use $I(B_i;Y) = \mathbb{E}\!\left[\log_2 \frac{p(Y \mid B_i)}{p(Y)}\right]$ and similar for the conditional.
Identify the mapping
Write out the map $\mu$ explicitly and list the four label–point pairs. The result is a non-Gray labelling: some adjacent amplitudes differ in both label bits.
Numerical computation (sketch)
At the given SNR, numerical integration yields values for $I(X;Y)$, $I(B_0;Y)$, $I(B_1;Y)$, and $I(B_1;Y \mid B_0)$. (Numerical values depend on the exact integration scheme; the qualitative ordering is robust.)
Comparison
The BICM sum $I(B_0;Y) + I(B_1;Y)$ falls short of the MLC/MSD sum $I(B_0;Y) + I(B_1;Y \mid B_0)$, which equals $I(X;Y)$. The gap is substantial because the labelling is not Gray; replacing it with Gray labelling would reduce the gap considerably.
ex-ch03-08
Easy: State and briefly justify the claim that under Gray labelling of 16-QAM the quantity $\sum_i I(B_i; B_{<i} \mid Y)$ is small at high SNR.
At high SNR, conditioning on Y almost determines X.
Under Gray labelling, adjacent constellation points differ in a single bit.
High-SNR limit of $I(B_i; B_{<i} \mid Y)$
At high SNR the channel output almost uniquely determines the transmitted point $X$, hence the label vector $(B_0,\dots,B_{m-1})$. When the label is essentially determined by $Y$, the conditional mutual information $I(B_i; B_{<i} \mid Y)$ tends to zero because both arguments become deterministic functions of $Y$.
Role of Gray labelling
Gray labelling ensures that the dominant confusions at moderate SNR (between neighbouring constellation points) flip only one bit at a time. So conditioning on the correctly-decoded other bits does not significantly change the posterior of the remaining bit: the correlation is already weak even at moderate SNR.
Conclusion
The CM-BICM gap is small under Gray labelling, consistent with the interactive plot in s04 showing the two curves nearly coincident.
ex-ch03-09
Hard: For the Ungerboeck partition chain of 8-PSK, suppose we use MLC but with a Gray labelling map instead of partition-based. Does the capacity rule still give the same total capacity? Does MSD still achieve it?
Is the map $\mu$ still a bijection?
Does MSD at level $i$ see a binary channel of capacity $I(B_i; Y \mid B_{<i})$ under Gray?
Chain-rule decomposition is labelling-agnostic
For any bijective labelling $\mu$, the chain rule gives $\sum_i I(B_i; Y \mid B_{<i}) = I(X;Y)$. So the capacity rule always sums to $I(X;Y)$ – the total capacity is labelling-independent.
Per-level capacities change
Under Gray labelling the per-level distribution is different – the individual $I(B_i; Y \mid B_{<i})$ values change. In particular, under Gray the unconditional values $I(B_i;Y)$ are larger, but the conditional-on-history gains are smaller (in aggregate, still summing to the same $I(X;Y)$).
MSD still works
The MSD algorithm is agnostic to the specific labelling – it computes LLRs by summing over the current coset, regardless of what bits label it. As long as $\mu$ is a bijection and the capacity-rule allocation is used, MSD achieves $I(X;Y)$.
Practical caveat
The partition-based labelling is special in that the first $i$ decoded bits pick out a specific set-partitioning coset with known distance structure, making the stage-$i$ binary channel analytically tractable. Under Gray labelling the stage-$i$ "coset" (preimage of a specific $b_{<i}$) is a more complex subset of the constellation and LLR computation is less clean. This is why partition-based labelling is the conventional choice for MLC even though the capacity rule holds for any bijection.
ex-ch03-10
Medium: A 5G NR system uses $M$-QAM with one LDPC code per MCS (no MLC), and its MCS table specifies one (modulation, LDPC-rate) pair per spectral efficiency. If the designer instead wanted to deploy MLC, how would the MCS table change? Estimate the size of the new table for the modulations QPSK, 16-QAM, 64-QAM, 256-QAM.
Count levels per modulation: $m = \log_2 M$.
Each level gets its own LDPC rate.
Level counts
QPSK: $m = 2$. 16-QAM: $m = 4$. 64-QAM: $m = 6$. 256-QAM: $m = 8$. Total across modulations: $2+4+6+8 = 20$ per-level code rates.
Codebook growth
A BICM MCS table for these four modulations has 4 code rates (one per modulation, picked to match at each SNR). An MLC MCS table would require 20 code rates – a 5× growth in the codebook, plus the associated storage and verification overhead.
System impact
Each MCS in 5G requires (i) standardised code specification (base graph + lifting + rate-matching), (ii) encoder/decoder hardware support, and (iii) conformance testing. Multiplying this effort by five is why 5G NR did not adopt MLC – the capacity gain was judged not worth the engineering cost.
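The table-size estimate is easy to script; a sketch (the modulation list mirrors the exercise, not any official MCS table):

```python
# Back-of-envelope MCS-table count: one code rate per modulation (BICM)
# vs one per (modulation, level) pair (MLC).  Level count m = log2(M).
mods = {"QPSK": 4, "16-QAM": 16, "64-QAM": 64, "256-QAM": 256}
levels = {name: M.bit_length() - 1 for name, M in mods.items()}
bicm_rates = len(mods)
mlc_rates = sum(levels.values())
print(levels)                    # {'QPSK': 2, '16-QAM': 4, '64-QAM': 6, '256-QAM': 8}
print(bicm_rates, mlc_rates)     # 4 20
```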
ex-ch03-11
Easy: Write down the complexity of MSD as a function of the number of levels $m$ and the binary-decode complexity $D$, and compare with the complexity of a joint ML decoder over all levels at a given block length $n$.
MSD is sequential; joint ML is exhaustive.
MSD complexity
MSD runs $m$ binary decoders sequentially, plus per-symbol LLR computations. Complexity: $O(mDn + n\,2^m)$. The second term is the total LLR work, linear in $n$.
Joint ML complexity
Joint ML evaluates the likelihood for every combination of the $m$ binary codewords: $\prod_i 2^{nR_i} = 2^{n \sum_i R_i}$ candidates. Exponential in $n$.
Comparison
For typical block lengths $n$ in the hundreds to thousands and aggregate rates of a few bits per symbol, joint ML requires on the order of $2^{1000}$ or more likelihood evaluations – utterly infeasible. MSD's cost, linear in $n$, is a few hundred thousand operations at typical parameters – comfortably within modern hardware budgets.
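The comparison can be made concrete with a back-of-envelope script; the cost model ($D$ operations per decoded bit, a $2^m$-term LLR sum per symbol) is an illustrative assumption, not a measured figure:

```python
# Rough operation counts for MSD vs joint ML, under an assumed cost model:
# m sequential binary decoders of per-bit cost D plus a 2^m-term LLR sum per
# symbol (MSD), versus one candidate per combination of the m binary
# codewords (joint ML).  Parameter values below are illustrative.
def msd_ops(m, n, D):
    return m * D * n + n * 2 ** m           # decoder work + LLR work, linear in n

def joint_ml_log2_candidates(m, n, rates):
    assert len(rates) == m
    return n * sum(rates)                    # log2 of prod_i 2^(n*R_i)

print(msd_ops(3, 1000, 50))                              # 158000
print(joint_ml_log2_candidates(3, 1000, [0.2, 0.7, 0.95]))
```

Even with generous constants, the MSD count stays polynomial while the joint-ML candidate count has an exponent of thousands of bits.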
ex-ch03-12
Hard: Prove that if the label bits are conditionally independent given $Y$, then $C_{\mathrm{BICM}} = C_{\mathrm{CM}}$.
Conditional independence means $p(b_0,\dots,b_{m-1} \mid y) = \prod_i p(b_i \mid y)$.
Equivalently, $I(B_i; B_{<i} \mid Y) = 0$ for every $i$.
Translate conditional independence
Conditional independence of $(B_0,\dots,B_{m-1})$ given $Y$ means $I(B_i; B_{<i} \mid Y) = 0$ for all $i$. By the chain rule, $I(X;Y) = \sum_i I(B_i; Y \mid B_{<i})$, and each term expands as $I(B_i;Y) + I(B_i; B_{<i} \mid Y) - I(B_i; B_{<i}) = I(B_i;Y)$, using a priori independence of the label bits. Hence $\sum_i I(B_i;Y) = I(X;Y)$.
Apply exercise ex-ch03-04
By the identity in ex-ch03-04, $C_{\mathrm{CM}} - C_{\mathrm{BICM}} = \sum_i I(B_i; B_{<i} \mid Y) = 0$. So $C_{\mathrm{BICM}} = C_{\mathrm{CM}}$.
Remark
Pathological channels (e.g., where $Y$ is itself a vector with independent components, one per bit) can meet this condition. For scalar AWGN channels with a single $Y$ and uniform inputs it almost never holds – the bits typically remain conditionally correlated given $Y$, so $C_{\mathrm{BICM}} < C_{\mathrm{CM}}$ strictly.
ex-ch03-13
Medium: The capacity rule sets $R_i = C_i$ for every level. What is the error exponent of MLC/MSD at the aggregate rate $C - \epsilon$ (with $\epsilon > 0$ small)?
Each stage $i$ uses a code at rate $R_i < C_i$ with block length $n$.
Binary codes at rate $R_i < C_i$ have error probability decaying as $e^{-n E_r(R_i)}$.
A union bound over stages preserves exponential decay.
Per-stage exponent
At stage $i$ with rate $R_i < C_i$ and block length $n$, the Gallager random-coding exponent bounds the stage-$i$ error probability by $P_i \le e^{-n E_r^{(i)}(R_i)}$.
Union bound
The aggregate error probability is at most $\sum_i P_i \le m\,e^{-n \min_i E_r^{(i)}(R_i)}$. The effective error exponent of MLC/MSD is $\min_i E_r^{(i)}(R_i)$.
Interpretation
The MLC/MSD error exponent is dominated by the worst (smallest-exponent) level – typically the top level, where the exponent is smallest and the binary channel is weakest. Designers protect this level with the strongest code (lowest rate, highest redundancy), which exactly matches the capacity rule's allocation.
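The union-bound step can be written out explicitly; a LaTeX sketch using the notation above ($E_r^{(i)}$ for the stage-$i$ Gallager exponent):

```latex
P_e \;\le\; \sum_{i=0}^{m-1} P_i
    \;\le\; \sum_{i=0}^{m-1} e^{-n E_r^{(i)}(R_i)}
    \;\le\; m\, e^{-n \min_i E_r^{(i)}(R_i)},
\qquad\text{so}\qquad
\liminf_{n\to\infty} -\tfrac{1}{n}\log P_e \;\ge\; \min_i E_r^{(i)}(R_i).
```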
ex-ch03-14
Challenge: Consider 8-PSK with the Ungerboeck chain and uniform inputs. Derive a closed-form lower bound on the level-0 capacity $C_0$ at a given SNR, using the binary-input AWGN channel formula with intra-level squared distance $d_0^2 = 2-\sqrt{2}$. Compare your bound numerically against the exact $C_0$ at several SNRs using the simulator.
The binary-input AWGN capacity is a lower bound on $C_0$.
Evaluate it at the effective SNR set by $d_0^2$ and $\mathrm{snr}$.
Compare with the plot output numerically.
Antipodal lower bound
Condition on the specific sub-pair of 8-PSK points selected by the other two label bits. With $B_0$ determining which of the two sub-constellations (rotated QPSKs at the minimum intra-level distance $d_0$) is used, the worst-case binary channel is antipodal with squared distance $d_0^2 = 2-\sqrt{2}$. So $C_0$ is lower-bounded by the binary-input AWGN capacity evaluated at the effective SNR set by $d_0^2$ and $\mathrm{snr}$.
Numerical comparison
Across the tested SNRs the antipodal bound sits a few tenths of a bit below the simulator's exact $C_0$ (because the level-0 sub-constellation is QPSK-like, not antipodal), and the gap persists over the range.
Remark on the bound's slackness
The antipodal bound is loose by a roughly constant number of bits across SNRs. A tighter calculation uses the fact that the sub-constellation carries two bits of uncertainty, not zero – the "pseudo-antipodal" exact integral takes this into account. The operational lesson: even crude BI-AWGN bounds stay within a fraction of a bit of the exact $C_0$ for 8-PSK.
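The bound can be evaluated numerically; a Monte Carlo sketch of the binary-input AWGN capacity, where the effective-SNR convention $s = d_0^2\,\mathrm{snr}/2$ is an assumption of this sketch (check it against the simulator's convention before comparing absolute values):

```python
import math, random

# Monte Carlo estimate of the BI-AWGN capacity C_BI = I(B;Y) for antipodal
# signalling +/- a in real Gaussian noise, with snr_eff = a^2 / sigma^2.
def bi_awgn_capacity(snr_eff, n=100000):
    random.seed(0)
    a, sigma = math.sqrt(snr_eff), 1.0
    acc = 0.0
    for _ in range(n):
        y = a + random.gauss(0, sigma)           # condition on B = +1 by symmetry
        acc += math.log2(2 / (1 + math.exp(-2 * a * y / sigma ** 2)))
    return acc / n

# Antipodal lower bound on C_0 for 8-PSK: squared distance d0^2 = 2 - sqrt(2).
d0_sq = 2 - math.sqrt(2)
for snr_db in (3, 6, 9):
    s = d0_sq * 10 ** (snr_db / 10) / 2          # effective SNR (assumed convention)
    print(snr_db, round(bi_awgn_capacity(s), 3))
```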
ex-ch03-15
Medium: A designer argues that MLC should be preferred over BICM because "$C_{\mathrm{CM}} \ge C_{\mathrm{BICM}}$ always." Give three practical counter-arguments a standards committee would accept.
Think about the codebook size, rate adaptation, and interleaver depth.
Counter-argument 1: codebook growth
MLC needs $\log_2 M$ LDPC rates per modulation; BICM needs one. For 5G NR with $m$ ranging from 2 to 8 across four modulations, MLC's codebook is roughly five times larger – more standard tables, more encoder/decoder circuitry, more conformance testing.
Counter-argument 2: rate adaptation
Adaptive modulation picks a new MCS every slot. BICM changes one rate; MLC must change all $m$ rates jointly, each computed via the capacity rule at the current SNR. The real-time rate-update logic is substantially more complex.
Counter-argument 3: interleaver depth for fading
In frequency-selective or time-selective fading, BICM's single interleaver can spread code bits across uncorrelated fading instances – a diversity gain essential to wireless. MLC's row-structured MSD decoder is incompatible with deep bit interleaving, forfeiting this diversity.
Final remark
On top of these, Gray-BICM's residual capacity gap is small (a fraction of a bit, shrinking further at high SNR), and BICM-ID (iterative demapping) recovers most of it. So the practical capacity loss is negligible, while the complexity saving is substantial.
ex-ch03-16
Medium: Using the interactive MSD error-propagation plot, determine the largest prior-stage BER for which the effective stage-1 BER is no more than twice the genie BER. Assume 8-PSK.
The effective stage-1 BER is roughly the genie BER plus a propagation term driven by the prior-stage BER.
Solve for the critical ratio.
Set up the critical ratio
The "twice genie" condition requires the propagation term of the bound to be no larger than the genie term. Setting the two terms equal and solving gives the critical prior-stage BER.
Estimate the ratio for 8-PSK at moderate SNR
For 8-PSK level 1 at the capacity threshold, the squared-distance ratio between the correct and wrong coset translates into an effective SNR loss of a few dB, and the wrong-coset BER at the operating point can be read off the plot. The critical prior-stage BER then follows from the twice-genie condition.
Design implication
As long as the stage-0 outer code achieves a BER below the critical value, propagation at stage 1 is benign. This is easily achieved by the low-rate LDPC code the capacity rule assigns to level 0.
ex-ch03-17
Easy: Name three niches where MLC still beats BICM and explain briefly why.
Think about constellations where Gray labelling doesn't exist, and contexts where partition structure is natural.
APSK constellations
Amplitude–phase shift keying (used in DVB-S2X) has a ring structure that prevents a consistent Gray labelling across rings. MLC with partition-based labelling recovers the capacity that BICM leaves on the table.
Lattice coded modulation
Lattices come equipped with a natural partition chain (the sub-lattice sequence). MLC operates at each level on the coset-index binary channel, and the capacity rule here is the fundamental rate-allocation theorem for lattice-coded modulation (Ch. 4).
High-dimensional constellations
For high-dimensional constellations (e.g., lattice points, Leech-lattice slices), Gray labelling becomes combinatorially awkward while the partition structure remains clean. MLC is the natural framework.
Remark on the QAM workhorse
Outside these niches – for 2D QAM on scalar AWGN – Gray-BICM essentially matches MLC and wins on complexity.
ex-ch03-18
Challenge: Derive the MLC capacity rule from the converse direction: assume an MLC scheme with rates $R_0,\dots,R_{m-1}$ is decodable by MSD with vanishing error probability. Show that $R_i \le I(B_i; Y \mid B_{<i})$ for every $i$.
At stage $i$, the decoder sees a binary channel of capacity $I(B_i; Y \mid B_{<i})$ (conditional on correct history).
Apply the converse to the binary channel coding theorem at each stage.
Handle the conditioning by noting that history errors vanish under the assumed decodability.
Setup: conditional decoding at stage $i$
By assumption, the aggregate error probability tends to zero as $n \to \infty$. Let $E_i$ be the stage-$i$ error event. Then $\Pr(E_i) \le \Pr(\text{any stage errs}) \to 0$, so each stage's conditional error probability (conditional on correct previous stages) also vanishes.
Fano's inequality at each stage
At stage $i$ with conditional error probability $P_i$, Fano's inequality gives $H(B_i^n \mid Y^n, B_{<i}^n) \le 1 + nR_i P_i$, which tends to zero per symbol as $P_i \to 0$. Combined with $H(B_i^n) = nR_i$ and single-letterisation of the memoryless channel, $I(B_i^n; Y^n \mid B_{<i}^n) \le n\,I(B_i; Y \mid B_{<i})$, this yields $nR_i \le n\,I(B_i; Y \mid B_{<i}) + 1 + nR_i P_i$.
Let $n \to \infty$
Dividing by $n$ and letting $n \to \infty$ with $P_i \to 0$, $R_i \le I(B_i; Y \mid B_{<i})$. This holds for every $i$.
Combined with achievability
Combined with the achievability argument in Thm thm-msd-capacity-achieving, the optimal rate allocation is $R_i = I(B_i; Y \mid B_{<i})$, with total $I(X;Y)$.
ex-ch03-19
Medium: The text claims that at low SNR, 8-PSK's $C_0$ is a small fraction of a bit while $C_2$ rises toward $1$ bit almost immediately. Use the low-SNR linearisation of the binary-input AWGN capacity, $C_{\mathrm{BI}}(s) \approx s/(2\ln 2)$ bits for small $s$, to quantify this. Estimate $C_0$ and $C_2$ at $-2$ dB.
Level 0 has squared distance $d_0^2 = 2-\sqrt{2}$; level 2 has $d_2^2 = 4$.
Compute the effective binary SNR for each level.
Per-level SNR at $\ntn{snr} = -2$ dB
$\mathrm{snr} = 10^{-0.2} \approx 0.631$. Per-level: the effective binary SNRs scale as $d_i^2\,\mathrm{snr}$, so level 2 sees roughly $6.8\times$ the effective SNR of level 0.
Capacity estimates
The low-SNR approximation gives $C_0$ as a small fraction of a bit. For level 2 the effective SNR exceeds the expansion's range; the exact binary-input AWGN formula is needed there and gives a value approaching $1$ bit.
Commentary
At $-2$ dB 8-PSK is deep in the power-limited regime: level 0 is essentially useless (a small fraction of a bit), level 1 intermediate, level 2 only partial. The total is far below the $3$ bits/symbol maximum. 8-PSK is a bandwidth-limited modulation operated at power-limited SNR, which is the worst of both worlds; in practice, one would drop to QPSK at this SNR, as 5G NR does.
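The linearisation is a one-liner to check; in the sketch below the per-level effective SNR $s_i = d_i^2\,\mathrm{snr}/2$ is an assumed convention, so treat the absolute numbers accordingly:

```python
import math

# Low-SNR linearisation C_BI(s) ~ s / (2 ln 2) bits, valid only for s << 1.
def c_lowsnr(s):
    return s / (2 * math.log(2))

snr = 10 ** (-2 / 10)                 # -2 dB
d_sq = {0: 2 - math.sqrt(2), 2: 4.0}  # 8-PSK intra-level squared distances
for lvl, d2 in d_sq.items():
    s = d2 * snr / 2                  # effective binary SNR (assumed convention)
    print(lvl, round(s, 3), round(c_lowsnr(s), 3))
```

For level 2 the computed $s$ exceeds $1$, so the linear value is out of range – the exact BI-AWGN integral is needed there, as the solution notes.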
ex-ch03-20
Hard: Sketch the capacity-rule curves $C_i(\mathrm{snr})$ and their sum for 8-PSK over the plotted SNR range, and identify the SNR at which the sum reaches the stated target in bits/symbol. Compare with the SNR at which uncoded 8-PSK achieves a reference error rate.
Use the interactive plot binary_partition_capacity.
Uncoded 8-PSK at the reference error rate requires a substantially higher SNR; obtain it from the standard symbol-error-rate formula.
Read from the plot
The target rate is reached at an SNR readable from the plot; the corresponding allocation $R_i = C_i$ can be read off at the same point.
Uncoded 8-PSK benchmark
Uncoded 8-PSK at the reference error rate needs a substantially higher SNR (since $d_{\min}^2 = 2-\sqrt{2}$ for 8-PSK at $E_s = 1$).
Coding gain to capacity
At a comparable spectral efficiency, the MLC/MSD operating point sits several dB below uncoded 8-PSK (which carries the slightly higher spectral efficiency of $3$ bits). This is a very large coding gain – comparable to what TCM (Ch. 2) offered, at much lower complexity.
ex-ch03-21
Medium: Compare the asymptotic ($\mathrm{snr} \to \infty$) behaviour of $C_{\mathrm{CM}}$ and $C_{\mathrm{BICM}}$ for 16-QAM. Does the gap vanish, stay constant, or grow?
At high SNR both capacities approach $\log_2 M = 4$ bits.
Look at the rate of approach.
Both approach $\log_2 M$
As $\mathrm{snr} \to \infty$, both $C_{\mathrm{CM}}$ and $C_{\mathrm{BICM}}$ tend to $\log_2 M = 4$ bits/symbol. The constellation is fully resolvable.
Rate of approach
At high SNR both curves saturate exponentially fast: $\log_2 M - C = O(e^{-c\,\mathrm{snr}})$ for some constant $c$ depending on the minimum distance. The CM–BICM gap at fixed SNR is dominated by the second-order terms in this expansion, which also decay exponentially – so the gap vanishes.
Empirical check
From the plot, at high SNR the two curves are visually indistinguishable and the gap falls below plotting resolution. At low SNR (near threshold) the gap is largest, consistent with the plot's peak of a few tenths of a bit at moderate SNRs.
ex-ch03-22
Challenge: Suppose a system uses MLC with ideal capacity-achieving codes at every level, and a BICM competitor uses an ideal capacity-achieving code at the BICM rate. For 16-QAM at $10$ dB, compute the operational SNR gap between the two schemes (i.e., the dB difference in required SNR to achieve the same spectral efficiency $\eta$). Is this gap worth the complexity difference?
Use the interactive plot to find the SNR at which $C_{\mathrm{BICM}}$ equals the 16-QAM CM capacity at 10 dB.
Compare with $10$ dB.
MLC operating point
At $10$ dB, read $\eta = C_{\mathrm{CM}}$ off the plot. With ideal codes MLC operates exactly at this point.
BICM operating point for the same $\eta$
BICM at the same $\eta$ needs the SNR at which $C_{\mathrm{BICM}}(\mathrm{snr}) = \eta$. From the plot, this sits only a small fraction of a dB above $10$ dB.
Interpretation
The BICM-vs-MLC operational gap is a small fraction of a dB for 16-QAM at this spectral efficiency. This is smaller than a typical link-budget margin (a few dB) and far smaller than the gap to Shannon from the modulation loss alone. The capacity-rule gain of MLC is simply not worth the codebook growth – which is why 5G NR is BICM.