Ferkans — Interactive Telecom Tutor

Closing the Loop: LAST + MMSE-GDFE Achieves the Zheng-Tse DMT

Section 1 built the LAST codebook. Section 2 built the MMSE-GDFE receiver. This section glues them together and proves the theorem that puts the "L" and the "ST" in "LAST" on equal footing: LAST codes with MMSE-GDFE decoding achieve the Zheng-Tse DMT curve for every $(n_t, n_r)$ and every $r$ .

The proof is an exercise in composition. The MMSE-GDFE (§2, Thm. TMMSE-GDFE Preserves Mutual Information) converts the MIMO channel into an equivalent effective-AWGN channel with aggregate capacity equal to the MIMO mutual information. The Erez-Zamir analysis (Ch. 16) says that on this effective channel, random- lattice codes achieve the capacity. So the error event of the LAST code on the equivalent channel is the outage event — the event that the channel mutual information falls below the rate. But the outage exponent is the Zheng-Tse DMT (by Zheng-Tse's definition). Thus the LAST code's error exponent equals $d^*(r)$ .

The only subtlety is the interaction between random-coding averaging (over $\Lambda_c$ and over the dither $\mathbf{d}$ ) and channel-outage averaging (over $\mathbf{H}$ ). The point is that these two sources of randomness can be separated: first condition on $\mathbf{H}$ (giving an Erez-Zamir lattice-AWGN channel), then average over $\mathbf{H}$ to get the channel outage. The chain rule keeps the bookkeeping straight, and the Laplace method of Ch. 12 gives the Wishart eigenvalue exponent $d^*(r)$ .

,

Theorem: LAST + MMSE-GDFE Achieves the Zheng-Tse DMT (El Gamal-Caire-Damen 2004)

Let $\mathcal{C}_M = \Lambda_{c,M} \cap \mathcal{V}(\Lambda_{s,M})$ be a sequence of LAST codes of block length $T \ge n_t$ with rate scaling as $R(\text{SNR}) = r \log_2 \text{SNR}$ for fixed multiplexing gain $r \in [0, \min(n_t, n_r)]$ , and let the fine lattice $\Lambda_{c,M}$ be drawn from the Minkowski-Hlawka random-lattice ensemble. Let the receiver be the MMSE-GDFE + layer-by-layer lattice decoder of AMMSE-GDFE + Layer-by-Layer Lattice Decoder. Then on the $n_t \times n_r$ i.i.d. Rayleigh block-fading channel, the average codeword error probability $\bar{P}_e(\text{SNR})$ satisfies $\lim_{\text{SNR} \to \infty} - \frac{\log \bar{P}_e(\text{SNR})} {\log \text{SNR}} \;=\; d^{*}(r) \;=\; (n_t - r)(n_r - r)$ for every $r \in \{0, 1, \ldots, \min(n_t, n_r)\}$ (interpolated piecewise-linearly between integers). In words, LAST codes achieve the Zheng-Tse DMT curve.

The point is that the MMSE-GDFE reduces the MIMO decoding problem to lattice decoding on an equivalent AWGN channel; the Erez-Zamir analysis says lattices achieve capacity on that AWGN channel; and the channel outage — the event that the random $\mathbf{H}$ has mutual information less than the rate — has probability decaying as $\text{SNR}^{-d^*(r)}$ by the Zheng-Tse Wishart-Laplace analysis (Ch. 12). The LAST error exponent equals the outage exponent, which is $d^*(r)$ .

Show Hint

Decompose the error event into channel outage (the MIMO mutual information is below the rate) and lattice error given non-outage (the Erez-Zamir lattice fails on a typical channel).

For non-outage realisations, Erez-Zamir gives sub-exponential lattice-decoding error — negligible compared to the outage.

The channel-outage probability is $\doteq \text{SNR}^{-d^*(r)}$ by Zheng-Tse's Wishart eigenvalue analysis (Ch. 12).

Minkowski-Hlawka averaging over random $\Lambda_{c,M}$ converts the existence argument into a concrete error bound.

Proof

Step 1 — MMSE-GDFE preserves the MIMO mutual information

By Thm. TMMSE-GDFE Preserves Mutual Information, the MMSE-GDFE filter $\mathbf{F}$ is a sufficient statistic for decoding $\mathbf{x}$ from $\mathbf{y}$ . Hence the mutual information between the transmitted codeword and the filtered observation $\mathbf{z}$ equals the full MIMO mutual information $I_{\mathrm{MIMO}}(\mathbf{H}) = T \log_2 \det(\mathbf{I}_{n_t} + \text{SNR} \mathbf{H}^{H} \mathbf{H})$ . The LAST code's error probability on the MMSE-GDFE output equals the error probability on the raw MIMO output — no capacity is lost.

Step 2 — Erez-Zamir on the effective AWGN channel

Conditioned on $\mathbf{H}$ , the MMSE-GDFE output $\mathbf{z} = \mathbf{R} \mathbf{x} + \mathbf{w}_{\text{eff}}$ with $\mathbf{w}_{\text{eff}}$ of covariance $\mathbf{I}_{n_t T}$ is equivalent (via Thm. TLAST Codebook Realises a Nested-Lattice Code over the MIMO Channel) to an Erez-Zamir lattice-AWGN channel with per-dimension SNR proportional to $\prod_i R_{ii}^{2/(n_t T)}$ . By Erez-Zamir (Ch. 16), random nested-lattice codes at rate $R < \log_2 \det(\mathbf{I} + \text{SNR} \mathbf{H}^{H} \mathbf{H})$ achieve arbitrarily small error probability in the limit of large block-length (lattice dimension).

Step 3 — Channel outage event dominates

The LAST error probability decomposes as $\bar{P}_e(\text{SNR}) = \Pr\bigl(\text{outage}(\mathbf{H}, r)\bigr) + \mathbb{E}_{\mathbf{H}}\bigl[\bar{P}_{e|\mathbf{H}} \cdot \mathbf{1}\{\text{not outage}\}\bigr].$ On non-outage channel realisations, Erez-Zamir (Step 2) gives $\bar{P}_{e|\mathbf{H}} \to 0$ sub-exponentially as $\text{SNR} \to \infty$ , so the second term is $o(\text{SNR}^{-c})$ for any $c > 0$ . The first term — the outage probability — is the asymptotic dominant contribution.

Step 4 — Zheng-Tse outage exponent

The Zheng-Tse analysis (Ch. 12) of the outage probability on the i.i.d. Rayleigh channel uses the eigenvalue decomposition $\mathbf{H}^{H} \mathbf{H} = \mathbf{U} \mathrm{diag}(\text{SNR}^{-\alpha_i}) \mathbf{U}^H$ (with $\alpha_i \ge 0$ at high SNR). The outage event becomes $\sum_i (1 - \alpha_i)^+ < r$ , and the Wishart joint density plus Laplace's method gives $\Pr(\text{outage}) \;\doteq\; \text{SNR}^{-d^{*}(r)}, \qquad d^*(r) = (n_t - r)(n_r - r).$

Step 5 — Minkowski-Hlawka averaging

The random-lattice averaging (Minkowski-Hlawka, Ch. 15) turns "for the random $\Lambda_c$ " into "for a random draw, with concentration". Since the Erez-Zamir step is an average over $\Lambda_c$ , the ensemble-averaged error exponent is exactly $d^*(r)$ , and by concentration there exists a deterministic sequence of lattices achieving the same exponent.

Step 6 — Matching Zheng-Tse converse

The Zheng-Tse converse (Ch. 12) states that every code on the i.i.d. Rayleigh MIMO channel with rate $r \log_2 \text{SNR}$ has error exponent at most $d^*(r)$ . Combining with the achievability of Steps 1-5: the LAST code's error exponent equals $d^*(r)$ . $\blacksquare$

,

Operational Interpretation of the Theorem

The theorem tells us something more than "LAST codes are DMT-optimal." It tells us that any lattice with adequate coding gain, when combined with MMSE-GDFE, achieves the DMT. The designer is free to pick the lattice — and the 2008 Kumar-Caire result (§4) says: pick the densest one for the best finite-SNR performance.

To see why this is the operational message, notice that the proof uses only two properties of the lattice ensemble: (i) Minkowski-Hlawka existence, i.e., there are lattices achieving density $\sim 2^{-n}$ (Ch. 15); (ii) the crypto-lemma dither argument, which works for any shaping lattice with round Voronoi regions. These are not restrictive — they are satisfied by $\mathbb{Z}^n$ , $D_n$ , $E_8$ , the Leech lattice, Barnes-Wall, and indeed any "generic" lattice. Therefore the theorem applies immediately to structured LAST codes with inner lattice $E_8$ or Leech — which is what §4 proves formally.

This is a design lesson worth pausing on. The algebraic CDA-NVD proof of Ch. 13 works for one lattice — the one embedded in the cyclic division algebra — and breaks if you try to substitute a different one. The lattice-theoretic LAST proof works for every sufficiently dense lattice. As a designer, you get to optimise the lattice for finite-SNR coding gain without losing DMT-optimality. That is precisely the freedom Kumar and Caire exploit in §4.

Theorem: The DMT Is the Outage Exponent of the MIMO Channel

On the $n_t \times n_r$ i.i.d. Rayleigh block-fading channel of block length $T \ge n_t$ , the outage-probability exponent with respect to rate $R = r \log_2 \text{SNR}$ equals the Zheng-Tse DMT: $\Pr\bigl(I_{\mathrm{MIMO}}(\mathbf{H}) < R \bigr) \;\doteq\; \text{SNR}^{-(n_t - r)(n_r - r)} \;=\; \text{SNR}^{-d^*(r)}.$ In words, the outage-exponent lower bound is tight: the best code achieves exactly the outage probability asymptotically.

This is the converse side of LAST's achievability. It says that no code can do better than outage — the channel is the fundamental bottleneck, and the code's job is only to realise the outage-floor performance. Every DMT-optimal code (Alamouti at $r = 0$ , CDA-NVD, LAST, etc.) realises the same exponent via a different mechanism, but the ceiling is the same. The theorem is why "DMT-optimality" is a universal benchmark and not a code-specific quantity.

Show Hint

Use Zheng-Tse's Wishart eigenvalue parametrisation $\alpha_i = - \log_\text{SNR}(\lambda_i(\mathbf{H}^{H}\mathbf{H}))$ .

Express the outage event as $\sum_i (1 - \alpha_i)^+ \le r$ .

Laplace's method on the joint Wishart density gives the SNR exponent $\sum_i (n_t + n_r - 2 i - 1) \alpha_i^+$ subject to the outage constraint.

Minimise this exponent over the feasible $\{\alpha_i\}$ — the minimum equals $d^*(r) = (n_t-r)(n_r-r)$ .

Proof

Step 1 — Eigenvalue parametrisation

Write $\alpha_i = -\log_\text{SNR}(\lambda_i(\mathbf{H}^{H} \mathbf{H}))$ for $i = 1, \ldots, n_t$ . At high SNR, $\alpha_i \ge 0$ for all $i$ with probability $\to 1$ . The MIMO mutual information becomes $\sum_i \log_2(1 + \text{SNR} \cdot \text{SNR}^{-\alpha_i}) \doteq \sum_i (1 - \alpha_i)^+ \log_2 \text{SNR}$ . Hence outage is $\{\sum_i (1 - \alpha_i)^+ \le r\}$ .

Step 2 — Wishart joint density

The joint density of the eigenvalues $\{\lambda_i\}$ under i.i.d. Rayleigh has the form $f(\lambda) \propto \prod_i \lambda_i^{n_r - n_t} \prod_{i<j} (\lambda_i - \lambda_j)^2 \exp(-\sum_i \lambda_i)$ . In the $\alpha$ -parametrisation and at high SNR, this becomes $\doteq \text{SNR}^{-\sum_i (n_t + n_r - 2 i + 1) \alpha_i / 2}$ up to constants.

Step 3 — Laplace's method

Integrate over the outage region $\{\alpha : \sum_i (1 - \alpha_i)^+ \le r, \alpha_i \ge 0\}$ . By Laplace's method, the dominant contribution comes from minimising the exponent $\sum_i (n_t + n_r - 2 i + 1) \alpha_i$ subject to the outage constraint.

Step 4 — Minimisation

For $r = k$ an integer, the optimum is $\alpha_i = 0$ for $i = 1, \ldots, k$ and $\alpha_i = 1$ for $i = k+1, \ldots, n_t$ . The resulting exponent is $\sum_{i = k+1}^{n_t} (n_t + n_r - 2 i + 1) = (n_t - k)(n_r - k)$ . For non-integer $r$ , linear interpolation applies. Hence $\Pr(\text{outage}) \doteq \text{SNR}^{-(n_t - r)(n_r - r)}$ . $\blacksquare$

,

Example: DMT Curves for $2 \times 2$ , $2 \times 4$ , and $4 \times 4$ MIMO

Tabulate the Zheng-Tse DMT curve $d^*(r) = (n_t - r)(n_r - r)$ at integer multiplexing gains $r = 0, 1, \ldots, \min(n_t, n_r)$ for the following configurations and sketch the piecewise-linear curves: (a) $2 \times 2$ ; (b) $2 \times 4$ ; (c) $4 \times 4$ . For a LAST code at $r = 1$ , what diversity order does it achieve in each case?

Solution

Part (a): 2x2

$d^*(0) = 4, d^*(1) = 1, d^*(2) = 0$ . At $r = 1$ the diversity is $1$ — a single-antenna-equivalent diversity order, even though we have 2 transmit and 2 receive antennas. This is the classical Zheng-Tse surprise: at the maximum multiplexing gain $\min(n_t, n_r) - 1$ there is essentially no diversity advantage.

Part (b): 2x4

$d^*(0) = 8, d^*(1) = 3, d^*(2) = 0$ . The extra receive antennas help at every $r$ : at $r = 1$ the diversity is $3$ (vs. $1$ for $2 \times 2$ ), reflecting the receive-diversity gain encoded in the $(n_t - r)(n_r - r)$ product.

Part (c): 4x4

$d^*(0) = 16, d^*(1) = 9, d^*(2) = 4, d^*(3) = 1, d^*(4) = 0$ . At $r = 2$ (half the maximum multiplexing) a $4 \times 4$ achieves diversity $4$ — the same as a $2 \times 2$ at $r = 0$ . A LAST code at $r = 1$ on $4 \times 4$ achieves diversity $9$ . $\blacksquare$

,

Common Mistake: DMT Is an Asymptotic Benchmark — Do Not Over-Interpret at Finite SNR

Mistake:

A reader may conclude that a DMT-optimal code (CDA-NVD, LAST, or any other) automatically outperforms a non-DMT-optimal code (plain V-BLAST, zero-forcing) at every SNR and every rate. In particular, one might expect a structured LAST code at $r = 2$ on a $4 \times 4$ channel to outperform ZF-V-BLAST at any operating SNR.

Correction:

DMT is an asymptotic ( $\text{SNR} \to \infty$ ) benchmark. At finite SNR, the coding gain — the shift of the error- probability-vs-SNR curve — matters more than the slope (the DMT). A structured LAST code with $E_8$ inner lattice may have a $3$ dB coding-gain advantage over ZF-V-BLAST but only start showing this advantage at $\text{BER} \le 10^{-3}$ . Below that SNR, ZF-V-BLAST (which is simpler and has lower decoder complexity) may be preferable. The DMT tells you the slope of the log-BER vs. log-SNR curve at asymptotic SNR; the coding gain tells you the intercept. Both matter in deployment, and a good design optimises both. This is exactly the reason the 2008 Kumar-Caire paper (§4) was needed: the 2004 LAST paper gave DMT-optimality (the slope), but structured LAST gives coding gain (the intercept). Together they are the full story.

,

DMT Optimality of LAST Codes