DMT Optimality of LAST Codes

Closing the Loop: LAST + MMSE-GDFE Achieves the Zheng-Tse DMT

Section 1 built the LAST codebook. Section 2 built the MMSE-GDFE receiver. This section glues them together and proves the theorem that puts the "L" and the "ST" in "LAST" on equal footing: LAST codes with MMSE-GDFE decoding achieve the Zheng-Tse DMT curve for every $(n_t, n_r)$ and every $r$.

The proof is an exercise in composition. The MMSE-GDFE (§2, Thm. "MMSE-GDFE Preserves Mutual Information") converts the MIMO channel into an equivalent effective-AWGN channel whose aggregate capacity equals the MIMO mutual information. The Erez-Zamir analysis (Ch. 16) says that on this effective channel, random-lattice codes achieve capacity. So the error event of the LAST code on the equivalent channel is the outage event: the event that the channel mutual information falls below the rate. But the outage exponent is the Zheng-Tse DMT (by Zheng-Tse's definition). Thus the LAST code's error exponent equals $d^*(r)$.

The only subtlety is the interaction between random-coding averaging (over $\Lambda_c$ and over the dither $\mathbf{d}$) and channel-outage averaging (over $\mathbf{H}$). The point is that these two sources of randomness can be separated: first condition on $\mathbf{H}$ (giving an Erez-Zamir lattice-AWGN channel), then average over $\mathbf{H}$ to get the channel outage. The chain rule keeps the bookkeeping straight, and the Laplace method of Ch. 12 gives the Wishart eigenvalue exponent $d^*(r)$.


Theorem: LAST + MMSE-GDFE Achieves the Zheng-Tse DMT (El Gamal-Caire-Damen 2004)

Let $\mathcal{C}_M = \Lambda_{c,M} \cap \mathcal{V}(\Lambda_{s,M})$ be a sequence of LAST codes of block length $T \ge n_t$ with rate scaling as $R(\text{SNR}) = r \log_2 \text{SNR}$ for fixed multiplexing gain $r \in [0, \min(n_t, n_r)]$, and let the fine lattice $\Lambda_{c,M}$ be drawn from the Minkowski-Hlawka random-lattice ensemble. Let the receiver be the MMSE-GDFE + layer-by-layer lattice decoder of Algorithm "MMSE-GDFE + Layer-by-Layer Lattice Decoder". Then on the $n_t \times n_r$ i.i.d. Rayleigh block-fading channel, the average codeword error probability $\bar{P}_e(\text{SNR})$ satisfies
$$\lim_{\text{SNR} \to \infty} -\frac{\log \bar{P}_e(\text{SNR})}{\log \text{SNR}} \;=\; d^{*}(r) \;=\; (n_t - r)(n_r - r)$$
for every $r \in \{0, 1, \ldots, \min(n_t, n_r)\}$ (interpolated piecewise-linearly between integers). In words, LAST codes achieve the Zheng-Tse DMT curve.

The point is that the MMSE-GDFE reduces the MIMO decoding problem to lattice decoding on an equivalent AWGN channel; the Erez-Zamir analysis says lattices achieve capacity on that AWGN channel; and the channel outage, the event that the random $\mathbf{H}$ has mutual information less than the rate, has probability decaying as $\text{SNR}^{-d^*(r)}$ by the Zheng-Tse Wishart-Laplace analysis (Ch. 12). The LAST error exponent equals the outage exponent, which is $d^*(r)$.
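The piecewise-linear curve in the theorem is easy to evaluate numerically. Here is a minimal sketch (the function name `dmt` is our own, not from the source) that computes $d^*(r)$ at any real multiplexing gain by interpolating linearly between the integer points $(r, (n_t - r)(n_r - r))$:

```python
import math

def dmt(r: float, nt: int, nr: int) -> float:
    """Zheng-Tse DMT d*(r) = (nt - r)(nr - r) at integer r,
    interpolated piecewise-linearly between integers."""
    rmax = min(nt, nr)
    if not 0 <= r <= rmax:
        raise ValueError("multiplexing gain must lie in [0, min(nt, nr)]")
    k = math.floor(r)
    if k == rmax:                       # right endpoint: d*(rmax) = 0
        return 0.0
    d_k = (nt - k) * (nr - k)           # curve value at integer k
    d_k1 = (nt - k - 1) * (nr - k - 1)  # curve value at integer k + 1
    return d_k + (r - k) * (d_k1 - d_k)

print(dmt(0, 2, 2))    # 4.0  (maximum diversity nt * nr)
print(dmt(1.5, 2, 2))  # 0.5  (halfway between d*(1) = 1 and d*(2) = 0)
```

Note that at non-integer $r$ the interpolated value (here $d^*(1.5) = 0.5$ on the $2 \times 2$ channel) lies strictly below the product formula $(n_t - r)(n_r - r) = 0.25$... above it, rather: the linear interpolation is what the theorem asserts, not the smooth product.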


Operational Interpretation of the Theorem

The theorem tells us something more than "LAST codes are DMT-optimal." It tells us that any lattice with adequate coding gain, when combined with MMSE-GDFE, achieves the DMT. The designer is free to pick the lattice, and the 2008 Kumar-Caire result (§4) says: pick the densest one for the best finite-SNR performance.

To see why this is the operational message, notice that the proof uses only two properties of the lattice ensemble: (i) Minkowski-Hlawka existence, i.e., there are lattices achieving density $\sim 2^{-n}$ (Ch. 15); (ii) the crypto-lemma dither argument, which works for any shaping lattice with round Voronoi regions. These are not restrictive: they are satisfied by $\mathbb{Z}^n$, $D_n$, $E_8$, the Leech lattice, Barnes-Wall, and indeed any "generic" lattice. Therefore the theorem applies immediately to structured LAST codes with inner lattice $E_8$ or Leech, which is what §4 proves formally.

This is a design lesson worth pausing on. The algebraic CDA-NVD proof of Ch. 13 works for one lattice, the one embedded in the cyclic division algebra, and breaks if you try to substitute a different one. The lattice-theoretic LAST proof works for every sufficiently dense lattice. As a designer, you get to optimise the lattice for finite-SNR coding gain without losing DMT-optimality. That is precisely the freedom Kumar and Caire exploit in §4.
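To make the "pick the densest lattice" advice concrete, the following sketch tabulates the nominal coding gains (Hermite parameters $\gamma(\Lambda) = d_{\min}^2 / V^{2/n}$) of the lattices named above. The numerical values are quoted from standard tables and should be treated as assumed inputs here, not derivations:

```python
import math

# Nominal coding gains (Hermite parameters) of classical lattices.
# Values assumed from standard tables: Z^n trivially 1, D4 = sqrt(2),
# E8 = 2, Barnes-Wall BW16 = 2*sqrt(2), Leech = 4.
CODING_GAIN = {
    "Z^n":   1.0,
    "D4":    2 ** 0.5,
    "E8":    2.0,
    "BW16":  2 ** 1.5,
    "Leech": 4.0,
}

for name, g in CODING_GAIN.items():
    print(f"{name:>6}: {10 * math.log10(g):5.2f} dB")
```

The table shows why §4's recommendation has teeth: swapping $\mathbb{Z}^n$ for $E_8$ as the inner lattice buys roughly 3 dB of nominal coding gain, and Leech buys roughly 6 dB, all without touching the DMT slope.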

Theorem: The DMT Is the Outage Exponent of the MIMO Channel

On the $n_t \times n_r$ i.i.d. Rayleigh block-fading channel of block length $T \ge n_t$, the outage-probability exponent with respect to rate $R = r \log_2 \text{SNR}$ equals the Zheng-Tse DMT:
$$\Pr\bigl(I_{\mathrm{MIMO}}(\mathbf{H}) < R \bigr) \;\doteq\; \text{SNR}^{-(n_t - r)(n_r - r)} \;=\; \text{SNR}^{-d^*(r)}.$$
In words, the outage-exponent lower bound is tight: the best code achieves exactly the outage probability asymptotically.

This is the converse side of LAST's achievability. It says that no code can do better than outage: the channel is the fundamental bottleneck, and the code's job is only to realise the outage-floor performance. Every DMT-optimal code (Alamouti at $r = 0$, CDA-NVD, LAST, etc.) realises the same exponent via a different mechanism, but the ceiling is the same. The theorem is why "DMT-optimality" is a universal benchmark and not a code-specific quantity.
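The outage probability in the theorem can be estimated directly by Monte Carlo, which is a useful sanity check on the exponent. A minimal sketch, assuming i.i.d. $\mathcal{CN}(0,1)$ entries for $\mathbf{H}$ and $I_{\mathrm{MIMO}} = \log_2 \det(\mathbf{I} + (\text{SNR}/n_t)\mathbf{H}\mathbf{H}^{\mathsf{H}})$ (function names and trial counts are ours):

```python
import numpy as np

def outage_prob(snr_db, r, nt, nr, trials=20000, seed=0):
    """Monte Carlo estimate of Pr(I_MIMO(H) < R) at R = r * log2(SNR),
    with I_MIMO = log2 det(I + (SNR/nt) H H^H), H i.i.d. CN(0, 1)."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    rate = r * np.log2(snr)
    H = (rng.standard_normal((trials, nr, nt)) +
         1j * rng.standard_normal((trials, nr, nt))) / np.sqrt(2)
    G = np.eye(nr) + (snr / nt) * H @ H.conj().transpose(0, 2, 1)
    mi = np.log2(np.linalg.det(G).real)       # mutual information per sample
    return np.mean(mi < rate)

# On a 2x2 channel at r = 1, the slope of log P_out per decade of SNR
# should approach d*(1) = 1 as SNR grows.
p_lo, p_hi = outage_prob(10, 1, 2, 2), outage_prob(20, 1, 2, 2)
print(f"P_out(10 dB) = {p_lo:.4f}, P_out(20 dB) = {p_hi:.4f}")
print(f"empirical slope ~ {-(np.log10(p_hi) - np.log10(p_lo)):.2f} per decade")
```

At these moderate SNRs the empirical slope is only near 1; the $\doteq$ in the theorem is an asymptotic statement, so the match tightens as the SNR range moves up.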


Example: DMT Curves for $2 \times 2$, $2 \times 4$, and $4 \times 4$ MIMO

Tabulate the Zheng-Tse DMT curve $d^*(r) = (n_t - r)(n_r - r)$ at integer multiplexing gains $r = 0, 1, \ldots, \min(n_t, n_r)$ for the following configurations and sketch the piecewise-linear curves: (a) $2 \times 2$; (b) $2 \times 4$; (c) $4 \times 4$. For a LAST code at $r = 1$, what diversity order does it achieve in each case?
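The tabulation asked for here is a one-liner; the sketch below (helper name `dmt_points` is ours) produces the integer corner points and reads off the $r = 1$ diversity orders:

```python
def dmt_points(nt, nr):
    """Integer corner points (r, d*(r)) of the Zheng-Tse curve
    d*(r) = (nt - r)(nr - r), for r = 0, 1, ..., min(nt, nr)."""
    return [(r, (nt - r) * (nr - r)) for r in range(min(nt, nr) + 1)]

for nt, nr in [(2, 2), (2, 4), (4, 4)]:
    pts = dmt_points(nt, nr)
    print(f"{nt}x{nr}: {pts}  ->  d*(1) = {dict(pts)[1]}")
# 2x2: [(0, 4), (1, 1), (2, 0)]             ->  d*(1) = 1
# 2x4: [(0, 8), (1, 3), (2, 0)]             ->  d*(1) = 3
# 4x4: [(0, 16), (1, 9), (2, 4), (3, 1), (4, 0)]  ->  d*(1) = 9
```

So a DMT-optimal LAST code at $r = 1$ achieves diversity order 1 on the $2 \times 2$ channel, 3 on the $2 \times 4$ channel, and 9 on the $4 \times 4$ channel.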


Common Mistake: DMT Is an Asymptotic Benchmark; Do Not Over-Interpret at Finite SNR

Mistake:

A reader may conclude that a DMT-optimal code (CDA-NVD, LAST, or any other) automatically outperforms a non-DMT-optimal code (plain V-BLAST, zero-forcing) at every SNR and every rate. In particular, one might expect a structured LAST code at $r = 2$ on a $4 \times 4$ channel to outperform ZF-V-BLAST at any operating SNR.

Correction:

DMT is an asymptotic ($\text{SNR} \to \infty$) benchmark. At finite SNR, the coding gain, i.e., the horizontal shift of the error-probability-vs-SNR curve, matters more than the slope (the DMT). A structured LAST code with $E_8$ inner lattice may have a 3 dB coding-gain advantage over ZF-V-BLAST but only start showing this advantage once the operating SNR is high enough that $\text{BER} \le 10^{-3}$. At lower SNR, ZF-V-BLAST (which is simpler and has lower decoder complexity) may be preferable. The DMT tells you the slope of the log-BER vs. log-SNR curve at asymptotic SNR; the coding gain tells you the intercept. Both matter in deployment, and a good design optimises both. This is exactly the reason the 2008 Kumar-Caire paper (§4) was needed: the 2004 LAST paper gave DMT-optimality (the slope), but structured LAST gives coding gain (the intercept). Together they are the full story.
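The slope-vs-intercept trade-off can be quantified with a toy model. Assume each code's high-SNR error curve behaves as $\text{BER}_i \approx c_i \cdot \text{SNR}^{-d_i}$ (the constants $c_i$ and diversities $d_i$ below are illustrative, not measured); the curves cross at $\text{SNR}^* = (c_1/c_2)^{1/(d_1 - d_2)}$, below which the shallower curve wins:

```python
import math

def crossover_snr_db(c1, d1, c2, d2):
    """SNR (in dB) where the model curves BER_i = c_i * SNR^{-d_i}
    intersect; below it the smaller-diversity code has lower BER."""
    snr = (c1 / c2) ** (1.0 / (d1 - d2))
    return 10.0 * math.log10(snr)

# Hypothetical numbers: a steep DMT-optimal curve (d = 4) with a worse
# intercept vs. a shallow ZF-style curve (d = 1) with a better one.
print(crossover_snr_db(c1=100.0, d1=4, c2=1.0, d2=1))  # 6.67 dB
```

In this hypothetical, the DMT-optimal code only pulls ahead above about 6.7 dB; below that, the "worse" code wins, which is exactly the over-interpretation trap described above.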
