Exercises
ex-ch22-01
(Easy) Compute the Marzetta-Hochwald non-coherent pre-log for $M = 2$ transmit antennas, $N = 2$ receive antennas, and coherence time $T = 6$. Compare to the coherent pre-log.
$M^* = \min(M, N, T/2)$; pre-log $= M^*(1 - M^*/T)$.
Effective stream count
$M^* = \min(M, N, T/2) = \min(2, 2, 3) = 2$.
Pre-log
Pre-log $= M^*(1 - M^*/T) = 2(1 - 2/6) = 4/3$.
Coherent comparison
Coherent pre-log $= \min(M, N) = 2$. Non-coherent achieves only $4/3$, losing 33% of the high-SNR degrees of freedom.
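A minimal numeric check, as a sketch assuming the $M = N = 2$, $T = 6$ values above (the helper name is ours):

```python
def noncoherent_prelog(M: int, N: int, T: int) -> float:
    """Marzetta-Hochwald/Zheng-Tse high-SNR pre-log of the
    non-coherent block-fading MIMO channel with coherence time T."""
    M_star = min(M, N, T // 2)           # effective number of streams
    return M_star * (1 - M_star / T)     # pre-log (spatial DoF per channel use)

print(noncoherent_prelog(2, 2, 6))       # 1.333... vs coherent min(M, N) = 2
```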
ex-ch22-02
(Easy) A URLLC system operates at 10 dB SNR, blocklength $n = 128$, target BLER $\epsilon = 10^{-5}$. Use the Polyanskiy normal approximation to estimate the maximum achievable rate.
$R \approx C - \sqrt{V/n}\,Q^{-1}(\epsilon)$ in nats; convert to bits at the end.
Capacity
$C = \log_2(1 + {\rm SNR}) = \log_2 11 \approx 3.46$ bits/use.
Dispersion
$V = \left(1 - (1 + {\rm SNR})^{-2}\right)(\log_2 e)^2 = (1 - 1/121) \times 2.08 \approx 2.06$ bits$^2$ (complex AWGN).
Polyanskiy rate
$R \approx 3.46 - \sqrt{2.06/128} \times Q^{-1}(10^{-5}) \approx 3.46 - 0.127 \times 4.26 \approx 2.92$ bits/use. 16% below Shannon.
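A quick sketch of the computation, using the complex-AWGN dispersion above (SciPy's `norm.isf` supplies $Q^{-1}$):

```python
import numpy as np
from scipy.stats import norm

def normal_approx_rate(snr_db: float, n: int, eps: float) -> float:
    """Polyanskiy normal approximation R ~ C - sqrt(V/n) Q^{-1}(eps)
    for the complex AWGN channel, in bits/channel use."""
    snr = 10 ** (snr_db / 10)
    C = np.log2(1 + snr)                               # Shannon capacity
    V = (1 - (1 + snr) ** -2) * np.log2(np.e) ** 2     # dispersion, bits^2
    return C - np.sqrt(V / n) * norm.isf(eps)

print(normal_approx_rate(10, 128, 1e-5))               # ~2.92 bits/use
```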
ex-ch22-03
(Easy) For the GN model with ASE noise power $\sigma^2_{\rm ASE}$ (mW) and nonlinear efficiency $\eta$ (mW$^{-2}$), compute the optimal launch power and peak SNR.
${\rm SNR}(P) = \dfrac{P}{\sigma^2_{\rm ASE} + \eta P^3}$.
Optimal power
Setting $d\,{\rm SNR}/dP = 0$ gives $\sigma^2_{\rm ASE} = 2\eta P^3$, so $P_{\rm opt} = \left(\sigma^2_{\rm ASE}/2\eta\right)^{1/3}$ mW (convert to dBm via $10\log_{10} P_{\rm opt}$).
Peak SNR
At $P_{\rm opt}$ the NLI term equals half the ASE term, so ${\rm SNR}_{\rm peak} = \dfrac{P_{\rm opt}}{(3/2)\,\sigma^2_{\rm ASE}} = \dfrac{2}{3}\dfrac{P_{\rm opt}}{\sigma^2_{\rm ASE}}$.
Rate per polarisation
$R = \log_2(1 + {\rm SNR}_{\rm peak})$ bits/symbol per polarisation.
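A short sketch of the optimisation; the numeric inputs below are illustrative assumptions, not the exercise's data:

```python
import numpy as np

def gn_optimum(sigma2_ase: float, eta: float):
    """GN-model launch-power optimum for SNR(P) = P / (sigma2 + eta * P**3).
    Units: powers in mW, eta in mW^-2."""
    p_opt = (sigma2_ase / (2 * eta)) ** (1 / 3)          # optimal launch power
    snr_peak = p_opt / (sigma2_ase + eta * p_opt ** 3)   # = (2/3) p_opt / sigma2
    return p_opt, snr_peak

# hypothetical example values, NOT from the exercise statement
p_opt, snr = gn_optimum(sigma2_ase=0.05, eta=0.01)
print(f"P_opt = {p_opt:.2f} mW ({10*np.log10(p_opt):.1f} dBm), "
      f"SNR_peak = {10*np.log10(snr):.1f} dB, "
      f"R = {np.log2(1 + snr):.2f} bits/symbol/pol")
```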
ex-ch22-04
(Medium) An autoencoder trained on a Rapp HPA with smoothness $p$ and IBO 3 dB delivers a 0.8 dB coding gain. When deployed on a real HPA with a different smoothness and IBO 2 dB, experimental measurements show the gain drops to 0.2 dB. Explain.
The training distribution and deployment distribution differ.
Distribution shift
The learned constellation geometry is optimised for the specific training HPA parameters. A different smoothness $p$ and IBO change the distortion function, so the learned constellation is no longer matched.
Why 0.2 dB survives
The autoencoder learned to REDUCE the power of outer constellation points; this general principle still helps on any nonlinear HPA, but with a smaller gain when the specific nonlinearity doesn't match.
Remediation
Train with domain randomisation (sample $p$ and the IBO in dB over the expected deployment ranges during training) to make the learned constellation robust across the deployment envelope.
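A minimal sketch of what domain randomisation looks like in the training channel model; the Rapp AM/AM formula is the standard one, while the sampling ranges are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def rapp_amam(x: np.ndarray, p: float, ibo_db: float) -> np.ndarray:
    """Rapp AM/AM model (no AM/PM); saturation level set by the input
    back-off (IBO). x: complex baseband samples with unit average power."""
    v_sat = 10 ** (ibo_db / 20)                   # saturation amplitude from IBO
    r = np.abs(x)
    gain = 1 / (1 + (r / v_sat) ** (2 * p)) ** (1 / (2 * p))
    return x * gain

def randomized_hpa(x: np.ndarray) -> np.ndarray:
    """Domain randomisation: each training batch sees a different HPA.
    The ranges are illustrative assumptions, not values from the exercise."""
    p = rng.uniform(1.0, 4.0)        # smoothness
    ibo_db = rng.uniform(1.0, 4.0)   # input back-off in dB
    return rapp_amam(x, p, ibo_db)
```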
ex-ch22-05
(Medium) Derive the dispersion formula for the real AWGN channel.
Information density per use: $i(x;y) = \ln\dfrac{p(y|x)}{p_Y(y)}$. Compute ${\rm Var}[i(X;Y)]$ at the capacity-achieving Gaussian input.
Capacity-achieving input
$X \sim \mathcal{N}(0, P)$, $Z \sim \mathcal{N}(0, 1)$, $Y = X + Z$. $p(y|x) = \mathcal{N}(x, 1)$, $p_Y = \mathcal{N}(0, 1 + P)$.
Information density
$i(x;y) = \frac{1}{2}\ln(1+P) + \dfrac{y^2}{2(1+P)} - \dfrac{(y-x)^2}{2}$.
Variance
Direct computation of ${\rm Var}[i(X;Y)]$ using Gaussian moments gives $V = \dfrac{P(P+2)}{2(1+P)^2}$ (in nats$^2$). Multiply by $(\log_2 e)^2$ to convert to bits$^2$.
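A Monte-Carlo check of the derived variance; a sketch at an assumed $P = 10$:

```python
import numpy as np

rng = np.random.default_rng(0)
P, n_samples = 10.0, 2_000_000               # SNR (linear) and sample count

x = rng.normal(0.0, np.sqrt(P), n_samples)   # capacity-achieving Gaussian input
y = x + rng.normal(0.0, 1.0, n_samples)      # unit-variance real AWGN

# information density i(x;y) = ln p(y|x) - ln p(y), in nats
i = 0.5 * np.log(1 + P) + y ** 2 / (2 * (1 + P)) - (y - x) ** 2 / 2

print(i.mean(), 0.5 * np.log(1 + P))               # both ~ C = 1.199 nats
print(i.var(), P * (P + 2) / (2 * (1 + P) ** 2))   # both ~ V = 0.496 nats^2
```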
ex-ch22-06
(Medium) Compare the capacity of a $2 \times 2$ MIMO system: coherent (perfect CSI) vs non-coherent at coherence time $T = 4$.
Coherent pre-log = 2.
Non-coherent needs $T \ge 2M$ to support $M^* = M$ streams; here $T = 2M = 4$ exactly.
Effective streams
$M^* = \min(M, N, T/2) = \min(2, 2, 2) = 2$.
Pre-log
Non-coherent: $M^*(1 - M^*/T) = 2(1 - 2/4) = 1$. Coherent: $\min(M, N) = 2$.
Interpretation
Half the DoF are lost at $T = 4$. To recover the coherent rate, the system would need a longer coherence time ($T \gg M$).
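Using the sketch from ex-ch22-01: `noncoherent_prelog(2, 2, 4)` returns `1.0`, half the coherent pre-log of 2.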
ex-ch22-07
(Medium) Show that the Polyanskiy normal approximation is tight at large $n$ but loose at small $n$. Explicitly, give a non-trivial lower bound on $\log_2 M^*(n, \epsilon)$ valid at moderate $n$.
Look up the meta-converse bound (Polyanskiy 2010, Thm. 27).
Asymptotic tightness
The normal approximation matches the meta-converse upper bound up to a third-order $O(\log n)$ term: $\log_2 M^*(n,\epsilon) = nC - \sqrt{nV}\,Q^{-1}(\epsilon) + O(\log n)$, so the per-use gap vanishes as $n \to \infty$.
Moderate-$n$ looseness
At moderate $n$ (e.g. $n = 128$), the neglected $\frac{\log_2 n}{2n}$ term is $\approx 0.03$ bits/use, of the same order as the accuracy URLLC design requires. More accurate bounds (the $\kappa\beta$ achievability bound, Polyanskiy 2010) retain this correction.
Take-away
For rigorous URLLC design, use the explicit meta-converse bound or a tight saddle-point approximation. The normal approximation is quick but can be off by 0.1-0.2 bits/use.
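A short sketch comparing the two correction terms across blocklengths, under the same 10 dB SNR and $\epsilon = 10^{-5}$ assumptions as ex-ch22-02:

```python
import numpy as np
from scipy.stats import norm

snr, eps = 10.0, 1e-5                                  # 10 dB SNR, URLLC BLER
V = (1 - (1 + snr) ** -2) * np.log2(np.e) ** 2         # dispersion, bits^2

for n in (64, 128, 512, 4096):
    dispersion_term = np.sqrt(V / n) * norm.isf(eps)   # second-order term
    logn_term = np.log2(n) / (2 * n)                   # neglected O(log n / n) term
    print(f"n={n:5d}  sqrt(V/n)Q^-1={dispersion_term:.3f}  log2(n)/2n={logn_term:.4f}")
```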
ex-ch22-08
(Medium) Explain why the optical fibre channel has no classical Shannon-type capacity theorem, despite being a physical channel with noise and bandwidth.
The channel is nonlinear (Kerr effect) and has memory (chromatic dispersion).
Shannon's assumptions
Shannon's AWGN theorem assumes a LINEAR channel with MEMORYLESS Gaussian noise. Kerr nonlinearity violates linearity; dispersion violates memorylessness.
GN model as linearisation
The GN model (Essiambre 2010) approximates the nonlinearity as ADDITIVE Gaussian noise with power proportional to $P^3$. Under this approximation, Shannon's formula applies, but the effective SNR PEAKS and then decreases.
Beyond GN
At higher launch powers, the GN model fails: nonlinear interference is NOT Gaussian, and capacity is likely higher than GN predicts (digital back-propagation exploits this). The true capacity is an open research question.
ex-ch22-09
(Hard) An autoencoder is trained on AWGN with 16 messages, 2 channel uses, and SNR 5 dB. After training, it is deployed on the same channel at SNR 10 dB. Would you expect the BER to improve, stay the same, or worsen relative to a hand-designed 16-QAM?
The autoencoder learned a constellation specific to its training SNR.
What was learned
At 5 dB training SNR, the autoencoder may learn a constellation with SLIGHTLY reduced minimum distance (at low SNR, overall error probability is dominated by average pairwise behaviour rather than the worst-case pair).
At 10 dB deployment
Hand-designed 16-QAM benefits fully from the 5 dB SNR boost. The autoencoder's trained-at-low-SNR constellation may have suboptimal minimum distance, so its gain at 10 dB is smaller than at 5 dB.
Net effect
Typically, the autoencoder's AWGN gain DIMINISHES as the deployment SNR diverges from the training SNR. For best robustness, train over a range of SNRs: "SNR-robust autoencoder training" (Cammerer et al. 2019).
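A minimal sketch of SNR-randomised training noise; the 0-15 dB range is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def awgn_random_snr(x: np.ndarray, snr_db_range=(0.0, 15.0)) -> np.ndarray:
    """Add AWGN at a per-batch random SNR so the autoencoder cannot
    overfit a single operating point. x: unit-power complex symbols."""
    snr_db = rng.uniform(*snr_db_range)
    noise_std = np.sqrt(1 / (2 * 10 ** (snr_db / 10)))   # per real dimension
    return x + noise_std * (rng.standard_normal(x.shape)
                            + 1j * rng.standard_normal(x.shape))
```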
ex-ch22-10
(Hard) Prove that for the non-coherent block-fading MIMO channel with $T \gg M$, the Zheng-Tse (2002) non-coherent DMT equals the coherent DMT.
Zheng-Tse prove that the 'CSI penalty' vanishes as $T \to \infty$.
Coherent DMT (Ch 12)
$d^*(r) = (M - r)(N - r)$ for integer $0 \le r \le \min(M, N)$, linearly interpolated in between.
Non-coherent at large $T$
Zheng-Tse (2002) show that for $T \gg M$, the non-coherent DMT matches the coherent DMT: no penalty.
Intuition
When the coherence block is long enough, the receiver can "estimate" the channel essentially for free by sacrificing $M$ of the $T$ slots to pilots. The remaining $T - M$ slots achieve the coherent rate. If $T \gg M$, the pilot overhead becomes negligible in the high-SNR DMT exponent.
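A small sketch of the coherent DMT curve that the non-coherent scheme approaches; piecewise-linear interpolation between integer points is the standard Zheng-Tse construction:

```python
import numpy as np

def coherent_dmt(M: int, N: int, r: float) -> float:
    """Coherent DMT d*(r): piecewise-linear interpolation of
    (M - r)(N - r) between integer multiplexing gains r."""
    ks = np.arange(min(M, N) + 1)
    return float(np.interp(r, ks, (M - ks) * (N - ks)))

print(coherent_dmt(2, 2, 0.0))   # 4.0: full diversity
print(coherent_dmt(2, 2, 1.0))   # 1.0
print(coherent_dmt(2, 2, 1.5))   # 0.5 (on the linear segment)
```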
ex-ch22-11
(Hard) For 800G coherent optical links at 128 GBaud with 64-QAM PAS shaping, estimate the reach on SMF-28e fibre assuming per-span ASE power $\sigma^2_{\rm span}$ (mW) and per-span nonlinear efficiency $\eta_{\rm span}$ (mW$^{-2}$).
Scale $\sigma^2_{\rm ASE}$ and $\eta$ by the span count $N_{\rm span}$.
Per-span values
Assume 80 km/span. Full-link $\sigma^2_{\rm ASE} = N_{\rm span}\,\sigma^2_{\rm span}$, $\eta = N_{\rm span}\,\eta_{\rm span}$.
Target rate
PAS-shaped 64-QAM: ~5.5 bits/symbol/pol. Dual-pol: 11 bits/symbol. Need ${\rm SNR}_{\rm peak} \ge 2^{5.5} - 1 \approx 16.5$ dB per polarisation at the peak.
Solve for $N_{\rm span}$
$P_{\rm opt} = (\sigma^2_{\rm ASE}/2\eta)^{1/3}$ is INDEPENDENT of $N_{\rm span}$ (both $\sigma^2_{\rm ASE}$ and $\eta$ scale linearly). But ${\rm SNR}_{\rm peak} \propto 1/N_{\rm span}$.
Reach estimate
The target SNR requires $N_{\rm span} = 1$ under the assumed per-span parameters. Each span is 80 km, so reach $\approx 80$ km (1 span); 800G is metro-only under this crude model. (Real systems with DBP reach 200-400 km.)
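A sketch of the span-count search; the per-span parameters below are hypothetical, chosen only to reproduce the one-span conclusion:

```python
import numpy as np

def max_spans(sigma2_span: float, eta_span: float,
              target_snr_db: float, span_km: int = 80):
    """Largest N_span whose GN-model peak SNR still meets the target.
    Per-span parameters are hypothetical examples, not exercise data."""
    for n in range(1, 100):
        sigma2, eta = n * sigma2_span, n * eta_span    # incoherent accumulation
        p_opt = (sigma2 / (2 * eta)) ** (1 / 3)        # independent of n
        snr_peak_db = 10 * np.log10((2 / 3) * p_opt / sigma2)
        if snr_peak_db < target_snr_db:
            return n - 1, (n - 1) * span_km
    return 99, 99 * span_km

print(max_spans(sigma2_span=0.01, eta_span=0.015, target_snr_db=16.5))
# -> (1, 80): one 80 km span under these assumed values
```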
ex-ch22-12
(Hard) An autoencoder trained with MSE loss vs cross-entropy loss will learn different encoders. Explain the difference and which is preferred for communication.
MSE optimises the WAVEFORM; CE optimises the BIT PROBABILITIES.
MSE loss
Minimising the reconstruction error $\mathbb{E}\|\hat{x} - x\|^2$ over the encoder/decoder pair produces Gaussian-like signalling; close to Shannon's capacity-achieving input, but yielding SOFT estimates instead of hard detection.
Cross-entropy loss
Minimising the cross-entropy between one-hot message labels and the decoder's posterior over symbols gives a DISCRETE constellation (the one-hot labels drive the output toward argmax behaviour). The learned constellation resembles QAM in shape with Gray-like labelling.
Which is better?
For COMMUNICATION with a hard-decision output, cross-entropy gives the "right" geometry (discrete constellation with Gray-like labelling). For SOFT-decoded channel coding, MSE can be competitive. In practice, CE is preferred.
ex-ch22-13
(Hard) An old proverb in information theory says that "every joint input-output constraint either adds a penalty or cuts a constant off the pre-log." Apply this to the non-coherent model: compared to a fully-coherent CSI-known system, what is the COST of not knowing the channel?
Compare the non-coherent pre-log $M^*(1 - M^*/T)$ with the coherent pre-log $\min(M, N)$.
Pre-log gap
The pre-log gap is $\min(M,N) - M^*(1 - M^*/T)$; for $M = N = M^*$ this equals $M^2/T$, a loss of $M^2/T$ bits/channel use per 3 dB of SNR at high SNR.
Constant cost
In addition, non-coherent decoding incurs a CONSTANT (SNR-independent) offset in the log-rate, because the decoder does not know which Grassmannian direction the signal takes.
Total cost of non-coherence
Total cost $\approx \frac{M^2}{T}\log_2 {\rm SNR} + O(1)$. As $T \to \infty$, this vanishes per channel use; non-coherent converges to coherent. For small $T$, the penalty is significant.
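A short sketch tabulating the high-SNR cost as $T$ grows; the $O(1)$ offset is ignored:

```python
import numpy as np

def noncoherence_cost_bits(M: int, N: int, T: int, snr_db: float) -> float:
    """High-SNR pre-log gap between coherent and non-coherent MIMO,
    expressed in bits/channel use at the given SNR (O(1) terms ignored)."""
    M_star = min(M, N, T // 2)
    gap = min(M, N) - M_star * (1 - M_star / T)
    return gap * np.log2(10 ** (snr_db / 10))

for T in (4, 8, 16, 64, 256):
    print(T, round(noncoherence_cost_bits(2, 2, T, snr_db=20), 2))
# cost shrinks toward zero as T grows: 6.64, 3.32, 1.66, 0.42, 0.1
```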
ex-ch22-14
(Hard) The book's golden thread, that every chapter establishes a code design criterion and a construction, fails in Ch 22. Why? What structure would a future Ch 22 use to be a proper design chapter?
Open problems don't yet have design criteria.
Open problems lack definitive design criteria
For non-coherent STC, finite-blocklength URLLC, autoencoder codes, and optical fibre, the design criteria are still being formulated. We have BOUNDS (Polyanskiy, Marzetta-Hochwald, nonlinear Shannon peak) but not CONSTRUCTIVE theorems.
What a future 'design' chapter would look like
A mature design chapter for each area would include: (1) a formal design criterion (like rank+det for STC); (2) an explicit construction (like CDA for DMT); (3) a deployed example. Today we have (1) in some areas, (2) partially, and (3) only for PAS in optical.
Take-away
Ch 22 is a research survey BECAUSE the field is still being built. The book ends here precisely where the reader would need to contribute to close the loop.
ex-ch22-15
(Challenge) Open research: suggest a specific research direction that combines two of the book's landmark results (e.g., CDA codes of Ch 13 and PAS of Ch 19) and would be a natural PhD thesis topic.
Think about gaps in the current literature.
Example direction
PROBABILISTICALLY-SHAPED CDA CODES: extend Ch 13's CDA framework (DMT-optimal, non-vanishing determinant) to support probabilistic input distributions (Ch 19 PAS). The key question: does the NVD property extend when symbol probabilities are non-uniform?
Why it's PhD-scale
Requires: (a) algebraic analysis of CDA codeword determinants under non-uniform symbol distributions; (b) an analogue of the approximate-universality theorem for shaped inputs; (c) finite-SNR performance comparisons with hand-designed shaped constellations.
Practical impact
A shaped CDA code would unify the MIMO and single-carrier paradigms: DMT-optimal (like Golden code) + near-capacity (like PAS). This would redefine MCS design for high-SNR MIMO (Wi-Fi 7, 5G mmWave, optical PAS).
Alternative directions
Other natural combinations: (i) LAST codes for URLLC (short blocks + DMT-optimality); (ii) autoencoder-learned CDA initialisation (neural search over the CDA manifold); (iii) ARQ-LAST for non-terrestrial networks (long-RTT URLLC + MIMO).