Exercises
ex-ch19-01
Easy: Compute the Maxwell-Boltzmann distribution for 16-QAM (4-PAM per axis with amplitudes $\{1, 3\}$) at shaping parameter $\nu$. What is the entropy of the marginal amplitude distribution?
$P(a) \propto e^{-\nu a^2}$ on $\{1, 3\}$.
Unnormalised probabilities
$e^{-\nu}$ (for $a = 1$); $e^{-9\nu}$ (for $a = 3$).
Normalisation
Sum $Z = e^{-\nu} + e^{-9\nu}$. Probabilities: $P(1) = e^{-\nu}/Z$; $P(3) = e^{-9\nu}/Z$.
Entropy
$H(A) = -P(1)\log_2 P(1) - P(3)\log_2 P(3)$ bits; with the uniform sign bit included, the per-axis symbol entropy $H(A) + 1$ is below 2 bits (vs 2 bits uniform).
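A minimal numerical check of these steps; the exercise's own $\nu$ did not survive extraction here, so $\nu = 0.1$ is an illustrative value:

```python
import math

# MB distribution over the 4-PAM amplitude alphabet {1, 3}
# (16-QAM = 4-PAM per axis). nu = 0.1 is illustrative only.
nu = 0.1
amps = [1, 3]
weights = [math.exp(-nu * a * a) for a in amps]
Z = sum(weights)
probs = [w / Z for w in weights]

# Marginal amplitude entropy; adding the uniform sign bit gives the
# per-axis symbol entropy H(A) + 1 < 2 bits.
H = -sum(p * math.log2(p) for p in probs)
print([round(p, 4) for p in probs], round(H, 4), round(H + 1, 4))
```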
ex-ch19-02
Easy: Using the PAS rate formula $R = 2H(A) + 2 - (1 - R_c)\,m$ bits per 2D symbol, compute the PAS rate for 64-QAM with LDPC rate $R_c = 0.75$ and per-axis amplitude entropy $H(A)$.
For 64-QAM, $m = 6$ bits per 2D symbol (8-PAM per axis: 2 amplitude bits and 1 sign bit per axis).
Plug in
$R = 2H(A) + 2 - (1 - 0.75) \times 6 = 2H(A) + 0.5$ bits/symbol.
Interpretation
Vs uniform 64-QAM at rate 0.75: $6 \times 0.75 = 4.5$ bits/symbol. The shaped rate is lower, but requires less SNR to achieve.
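The plug-in step as code. $R_c = 0.75$ comes from the comparison above; $H(A) = 1.8$ bits is an illustrative stand-in for the exercise's (unpreserved) entropy value:

```python
def pas_rate_2d(H_A, Rc, m):
    """PAS net rate in bits per 2D (QAM) symbol.

    H_A: per-axis amplitude entropy (bits), Rc: LDPC rate,
    m: bits per 2D symbol (6 for 64-QAM).
    """
    return 2 * H_A + 2 - (1 - Rc) * m

# Illustrative: H(A) = 1.8 bits (uniform 8-PAM amplitudes give 2 bits).
rate = pas_rate_2d(H_A=1.8, Rc=0.75, m=6)
print(round(rate, 4))  # 4.1 bits/symbol, below uniform 64-QAM's 4.5
```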
ex-ch19-03
Easy: Compute the CCDM rate loss at block length $n$ for an 8-ary amplitude alphabet with target entropy $H(A)$ bits.
Rate loss is $R_{\text{loss}} = H(A) - \frac{1}{n}\log_2\binom{n}{n_1,\ldots,n_8}$, the gap between the target entropy and the rate of the constant-composition code.
Loss formula
$R_{\text{loss}} \approx \frac{(|\mathcal{A}|-1)\log_2 n}{2n} = \frac{7\log_2 n}{2n}$ bits/symbol.
As fraction of entropy
As a fraction of the entropy, this is a modest loss at moderate $n$.
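The exact (non-asymptotic) rate loss can be computed from the multinomial coefficient. The composition below is illustrative, since the exercise's $n$ and target distribution were not preserved:

```python
import math
from math import lgamma

def ccdm_rate_loss(counts):
    """Exact CCDM rate loss (bits/symbol) for a constant-composition
    block with amplitude counts n_a summing to the block length n."""
    n = sum(counts)
    # log2 of the multinomial coefficient n! / prod(n_a!)
    log2_seqs = (lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)) / math.log(2)
    # entropy of the empirical (target) distribution
    H = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return H - log2_seqs / n

# Illustrative: n = 200 over an 8-ary amplitude alphabet.
counts = [60, 45, 32, 24, 16, 11, 7, 5]
print(round(ccdm_rate_loss(counts), 4))
```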
ex-ch19-04
Medium: A 400ZR coherent optical link uses 16-QAM with LDPC rate $R_c$. Estimate the PAS rate adaptation range by varying the shaping parameter $\nu$ from 0 to very large.
At $\nu = 0$: uniform QAM. As $\nu \to \infty$: the distribution collapses toward the lowest-energy points.
Max rate
$\nu = 0$: $H(A) = 1$ bit (two equiprobable amplitudes). Rate $= 2H(A) + 2 - 4(1 - R_c) \approx 3.69$ bits/symbol/pol for the given code rate.
Min rate
$\nu \to \infty$: $H(A) \to 0$, and the rate falls to $\approx 1.81$ bits/symbol/pol.
Range
PAS sweeps the rate continuously from 1.81 to 3.69 bits/symbol/pol, about a 2x span on a single (M, R) pair.
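A sketch of the sweep. The 400ZR code rate was not preserved above, so $R_c = 0.92$ is an illustrative value chosen to land near the quoted 3.69-bit maximum:

```python
import math

def mb_amplitude_entropy(nu, amps):
    """Entropy (bits) of the MB amplitude distribution P(a) ~ exp(-nu a^2)."""
    w = [math.exp(-nu * a * a) for a in amps]
    Z = sum(w)
    return -sum((wi / Z) * math.log2(wi / Z) for wi in w)

def pas_rate_per_pol(nu, Rc, amps=(1, 3), m=4):
    """PAS net rate, bits per 2D 16-QAM symbol per polarisation."""
    return 2 * mb_amplitude_entropy(nu, amps) + 2 - (1 - Rc) * m

Rc = 0.92  # illustrative
for nu in [0.0, 0.02, 0.05, 0.1, 0.3, 1.0]:
    print(nu, round(pas_rate_per_pol(nu, Rc), 3))
```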
ex-ch19-05
Medium: Prove the maximum-entropy theorem: the Maxwell-Boltzmann distribution maximises $H(P)$ subject to the constraints $\sum_x P(x) = 1$ and $\sum_x |x|^2 P(x) = P_{\text{avg}}$.
Lagrangian with two constraints: normalisation and power.
Lagrangian
$\mathcal{L} = -\sum_x P(x)\ln P(x) + \lambda_1\left(\sum_x P(x) - 1\right) + \lambda_2\left(\sum_x |x|^2 P(x) - P_{\text{avg}}\right)$.
Derivative
$\partial\mathcal{L}/\partial P(x) = -\ln P(x) - 1 + \lambda_1 + \lambda_2 |x|^2 = 0$, giving $P(x) = e^{\lambda_1 - 1}\, e^{\lambda_2 |x|^2}$.
Identification
Writing $\nu = -\lambda_2$ (fixed by the power constraint) and normalising gives $P(x) \propto e^{-\nu |x|^2}$. This is the Maxwell-Boltzmann distribution.
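A numerical sanity check of the theorem on a toy 3-point alphabet (the alphabet and power constraint are illustrative): among all distributions with the same mean power, the entropy maximiser has the Gibbs/MB form, i.e. $\ln P(a)$ is affine in $a^2$:

```python
import math

A2 = [1, 4, 9]   # squared amplitudes of a = 1, 2, 3 (illustrative)

def family(t):
    """All distributions with sum p = 1 and sum p*a^2 = 4,
    parametrised by the one remaining free dimension t = p3."""
    return [5 * t / 3, 1 - 8 * t / 3, t]

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Scan the feasible range (0 < t < 3/8) for the entropy maximiser.
best = max((i / 10000 for i in range(1, 3745)), key=lambda t: entropy(family(t)))
p = family(best)

# Gibbs form <=> equal slopes of ln p against a^2 between point pairs.
s12 = (math.log(p[0]) - math.log(p[1])) / (A2[0] - A2[1])
s23 = (math.log(p[1]) - math.log(p[2])) / (A2[1] - A2[2])
print(round(s12, 4), round(s23, 4))  # approximately equal at the maximiser
```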
ex-ch19-06
Medium: Describe why the SIGN bits in PAS do not need shaping, while the AMPLITUDE bits do.
MB distribution is symmetric: $P(x) = P(-x)$.
Symmetry of MB
The Maxwell-Boltzmann distribution $P(x) \propto e^{-\nu|x|^2}$ depends only on $|x|$, so it is symmetric around zero. The sign of each shaped symbol is therefore ALREADY uniform under MB.
Practical advantage
This means the sign bits can be PRODUCED directly by the LDPC parity output (which is approximately uniform) without any shaping. Only the amplitudes need the CCDM. This is the key PAS insight: shaping only the amplitudes is what makes PAS FEC-compatible.
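The symmetry can be verified directly; $\nu = 0.2$ is illustrative:

```python
import math

# MB distribution over the full 4-PAM alphabet {-3, -1, +1, +3}.
nu = 0.2  # illustrative shaping parameter
points = [-3, -1, 1, 3]
w = [math.exp(-nu * x * x) for x in points]
Z = sum(w)
p = {x: wi / Z for x, wi in zip(points, w)}

# P(+a) = P(-a), so the sign is exactly uniform and carries 1 full bit.
p_plus = p[1] + p[3]
print(p_plus)  # ~0.5: sign bits need no shaping
```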
ex-ch19-07
Medium: A system uses 256-QAM (16-PAM per axis) with MB shaping. Compute the per-axis amplitude entropy and the shaping gain.
Amplitudes: $\{1, 3, 5, \ldots, 15\}$ (8 positive odd values, suitably scaled).
Probabilities
$P(a) = e^{-\nu a^2}/Z$ with $Z = \sum_a e^{-\nu a^2}$. For the exercise's $\nu$, the eight unnormalised weights sum to $Z = 2.22$; dividing each weight by $Z$ gives the amplitude probabilities.
Entropy
Per-axis symbol entropy $H(X) = H(A) + 1 \approx 3.1$ bits (vs 4 bits uniform for 16-PAM).
Shaping gain
Uniform per-axis entropy = 4 bits; shaped $\approx$ 3.1 bits. The gap of 0.9 bits/axis is a rate reduction; the SNR saving (shaping gain) is much smaller, about 1.2 dB for this parameter choice, as measured numerically.
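A numerical sketch for 16-PAM; the exercise's $\nu$ was not preserved, so several values are scanned to show how the per-axis symbol entropy falls from 4 bits toward the ~3.1 bits quoted above:

```python
import math

amps = range(1, 16, 2)  # 16-PAM amplitude alphabet {1, 3, ..., 15}

def symbol_entropy(nu):
    """Per-axis symbol entropy H(A) + 1 (sign bit adds exactly 1 bit)."""
    w = [math.exp(-nu * a * a) for a in amps]
    Z = sum(w)
    H_A = -sum((wi / Z) * math.log2(wi / Z) for wi in w)
    return H_A + 1

for nu in [0.0, 0.005, 0.01, 0.02, 0.05]:
    print(nu, round(symbol_entropy(nu), 3))
```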
ex-ch19-08
Medium: Compare PAS and discrete MCS adaptation: at 18 dB SNR, a system can either use 64-QAM rate 3/4 (4.5 bits/symbol) or 16-QAM rate 5/6 (3.33 bits/symbol). What continuous PAS rate could achieve the same BER performance at 18 dB?
PAS enables continuous rates between MCS points.
Discrete gap
Between 3.33 and 4.5 bits/symbol, there is a 1.17 bit/symbol quantisation gap at the MCS boundary.
PAS continuous value
With 64-QAM, a fixed LDPC rate, and a PAS-tuned $\nu$, the system can achieve any rate between the two MCS points; in particular, a rate of 3.90 bits/symbol is achievable continuously.
System implication
The MCS gap (3.33 to 4.5) is wasteful: a link that could operate at 3.9 bits/symbol is forced down to 3.33. PAS recovers this efficiency.
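A sketch of how the operating point is found: bisect the shaping parameter $\nu$ until the PAS rate hits 3.90 bits/symbol. $R_c = 0.75$ (the 64-QAM MCS rate above) is used; the exercise's own FEC choice was not preserved:

```python
import math

amps = [1, 3, 5, 7]  # 8-PAM amplitudes (64-QAM per axis)

def pas_rate(nu, Rc=0.75, m=6):
    """PAS net rate (bits per 2D symbol) for MB-shaped 64-QAM."""
    w = [math.exp(-nu * a * a) for a in amps]
    Z = sum(w)
    H_A = -sum((wi / Z) * math.log2(wi / Z) for wi in w)
    return 2 * H_A + 2 - (1 - Rc) * m

# The rate decreases monotonically in nu, so bisection finds the
# nu that achieves the target 3.90 bits/symbol.
target = 3.90
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if pas_rate(mid) > target:
        lo = mid
    else:
        hi = mid
print(round(lo, 4), round(pas_rate(lo), 4))
```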
ex-ch19-09
Hard: Prove that geometric shaping and probabilistic shaping achieve the same mutual information for the SAME set of constellation points and marginal amplitude distribution; i.e., the two approaches are INFORMATION-EQUIVALENT.
MI depends on the JOINT distribution of (x, y), not on which side is modified.
Channel-MI formulation
$I(X;Y) = H(Y) - H(Y|X)$ for a channel $p(y|x)$. Shaping affects $I(X;Y)$ only through the input distribution $p(x)$ it induces on the channel.
Equivalence
If scheme A (PS) places a non-uniform distribution on a uniform grid and scheme B (GS) places a uniform distribution on non-uniformly spaced points, and BOTH induce the same marginal distribution of the channel input $X$, then $I_A(X;Y) = I_B(X;Y)$ exactly.
Practical caveat
In practice the two schemes LOOK different in the encoder but are information-theoretically identical. Hence the "asymptotic equivalence" in Thm. 1 of §4. Any BER difference between the two comes from RECEIVER processing, not from the encoder side.
ex-ch19-10
Hard: The CCDM rate loss formula $R_{\text{loss}} \approx \frac{(|\mathcal{A}|-1)\log_2 n}{2n}$ is for CONSTANT composition. Show that the rate loss of a HIERARCHICAL DM (multi-stage tree of constant-composition blocks) is smaller at moderate $n$.
Hierarchical DM splits the sequence into smaller sub-blocks.
Hierarchical structure
Split the $n$-symbol output into $k$ sub-blocks of length $n/k$. Apply CCDM separately to each, with the target distribution imposed at the sub-block level. The overall rate stays close to $H(A)$, with smaller finite-length losses.
Rate loss analysis
Per sub-block of length $n/k$, the CCDM rate loss is $\approx \frac{(|\mathcal{A}|-1)\log_2(n/k)}{2(n/k)}$; averaging over the $k$ sub-blocks leaves the per-symbol loss at the same order.
Trade-off
As $k$ grows, each sub-block is shorter and the local rate loss per symbol grows like $\log_2(n/k)/(n/k)$. The optimal $k$ balances this loss against latency and implementation complexity, and modern implementations pick the sub-block length accordingly.
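The finite-length trade-off can be seen by evaluating the exact CCDM rate loss at several sub-block lengths (the 4-ary target distribution is illustrative):

```python
import math
from math import lgamma

def rate_loss(n, p):
    """Exact CCDM rate loss (bits/symbol) at block length n for a
    target distribution p, using the composition n_a = round(n * p_a)."""
    counts = [round(n * pi) for pi in p]
    counts[0] += n - sum(counts)  # repair rounding so counts sum to n
    log2_seqs = (lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)) / math.log(2)
    H = -sum(pi * math.log2(pi) for pi in p)
    return H - log2_seqs / n

# Shorter blocks pay a larger per-symbol loss, roughly log2(n)/n.
p = [0.45, 0.28, 0.17, 0.10]
for n in [100, 400, 1600, 6400]:
    print(n, round(rate_loss(n, p), 5))
```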
ex-ch19-11
Hard: An autoencoder trained on an AWGN channel at SNR 10 dB is likely to learn a constellation similar to probabilistic-shaped 16-QAM. Why? What does this suggest about the fundamental uniqueness of shaping solutions?
Autoencoder loss (cross-entropy) is equivalent to MI maximisation.
Loss-MI equivalence
The cross-entropy training loss upper-bounds $H(X|Y) = H(X) - I(X;Y)$; with $H(X)$ fixed, minimising it maximises $I(X;Y)$, the same objective as capacity.
Solution convergence
Since MI-maximising constellations on AWGN are MB-shaped QAM (up to unitary rotation), the autoencoder converges to the same solution as analytical PS.
Uniqueness up to rotation
The autoencoder may learn a ROTATED version of MB-shaped QAM because rotation preserves MI. But modulo rotation, the solution is unique. This is a useful sanity check: if the autoencoder doesn't converge to MB on AWGN, the training setup is broken.
ex-ch19-12
Hard: Derive the 1.53 dB asymptotic shaping ceiling using the normalised second moment of a sphere.
The normalised second moment of the $n$-sphere converges to $\frac{1}{2\pi e}$ as $n \to \infty$.
Normalised 2nd moment of n-sphere
$G(S_n) \to \frac{1}{2\pi e}$ as $n \to \infty$ (Zador 1996).
Normalised 2nd moment of n-cube
$G(\text{cube}) = \frac{1}{12}$ for all $n$ (uniform distribution on the cube).
Shaping gain
$\gamma_s = \frac{1/12}{1/(2\pi e)} = \frac{2\pi e}{12} = \frac{\pi e}{6} \approx 1.423$.
In dB
$10\log_{10}(\pi e/6) \approx 1.53$ dB.
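The two-line computation:

```python
import math

# Ratio of the cube's normalised second moment (1/12) to the
# sphere's asymptotic value (1/(2*pi*e)): the ultimate shaping gain.
gain = (1 / 12) / (1 / (2 * math.pi * math.e))  # = pi*e/6
gain_db = 10 * math.log10(gain)
print(round(gain, 4), round(gain_db, 4))  # 1.4233, 1.533 dB
```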
ex-ch19-13
Hard: A common criticism of PAS is that it adds encoder/decoder complexity. Quantify the complexity of CCDM encoding at block length $n = 1000$ with an 8-ary amplitude alphabet.
Arithmetic encoding: $O(n \cdot |\mathcal{A}|)$ operations.
Operation count
For each of the 1000 positions, the CCDM evaluates 8 candidate amplitudes: $1000 \times 8 = 8000$ operations per block.
Context
At 400 Msymbol/s: 400,000 blocks/second. Total: $\approx 3.2 \times 10^9$ operations/second, well within modern DSP chip capacity ($\sim 10^{12}$ ops/s).
Memory
A naive implementation would need a multinomial lookup table whose size grows combinatorially in $n$, far too large at $n = 1000$. Practical implementations use finite-precision arithmetic coding, which keeps only a small amount of state per block.
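The back-of-envelope arithmetic as code (throughput figures as in the text):

```python
# CCDM complexity estimate for n = 1000, 8-ary amplitudes.
n = 1000            # symbols per CCDM block
alphabet = 8        # candidate amplitudes evaluated per position
ops_per_block = n * alphabet

baud = 400e6        # 400 Msymbol/s -> 400,000 blocks of 1000 symbols/s
blocks_per_s = baud / n
total_ops = ops_per_block * blocks_per_s
print(ops_per_block, int(blocks_per_s), f"{total_ops:.1e}")  # 8000 400000 3.2e+09
```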
ex-ch19-14
Hard: The PAS architecture requires a SYSTEMATIC LDPC code. What goes wrong if you use a non-systematic turbo code instead?
Non-systematic codes XOR the information bits into the parity structure: they don't preserve the shape.
Non-systematic composition
A non-systematic encoder outputs a codeword with NO IDENTIFIABLE shaped and unshaped bit components. After the non-systematic encoder, the output bit distribution is approximately UNIFORM regardless of the input shape.
Consequence
The BICM mapper then sees uniform input bits and produces uniform QAM output: the shaping is LOST.
Fix
Use a SYSTEMATIC code: its output includes the original shaped bits unchanged, and parity bits are added alongside. The uniformly-distributed parity bits serve as SIGN bits in the PAS architecture.
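A small simulation of the failure mode. A random dense binary generator matrix stands in for a non-systematic encoder (an assumption for illustration, not an actual turbo code): XOR-ing many biased bits together drives the output ones-density to 1/2, destroying the shaping.

```python
import random

random.seed(1)
k, n_out, trials = 100, 200, 100
bias = 0.8  # shaped input bits: P(bit = 0) = 0.8 (ones-density 0.2)

# Dense random generator matrix = stand-in non-systematic encoder.
G = [[random.randint(0, 1) for _ in range(k)] for _ in range(n_out)]

def encode(u):
    return [sum(g * b for g, b in zip(row, u)) % 2 for row in G]

ones = 0
for _ in range(trials):
    u = [0 if random.random() < bias else 1 for _ in range(k)]
    ones += sum(encode(u))

ratio = ones / (trials * n_out)
print(round(ratio, 3))  # close to 0.5: the input shaping is destroyed
# A systematic encoder would emit u unchanged (ones-density 0.2) plus
# uniform parity, preserving the shaped amplitude bits.
```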
ex-ch19-15
Challenge: Open research: can PAS be combined with CDA codes (Ch 13) to achieve BOTH the DMT (rank + determinant) and shaping gain simultaneously? Sketch what would need to be proved.
CDA requires uniform input distribution. PAS uses shaped input.
Problem
CDA codes' non-vanishing-determinant theorem (Ch 13) is proved for UNIFORM inputs. If the input is MB-shaped, does the NVD property still hold?
Conjecture
Likely yes, with a smaller non-vanishing-determinant constant, because MB shaping compresses outer points toward the centre, possibly reducing codeword-pair determinants while preserving positivity.
Needs proving
(a) Show the minimum determinant is bounded below by a positive constant under MB input. (b) Characterise the DMT of PAS+CDA: likely the same DMT curve but with a modified coding gain. (c) Practical construction: how does the CCDM interact with the CDA codeword construction? This would be a solid PhD thesis topic.