The LAST Code Construction

From Algebra to Information Theory: Why We Need LAST Codes

Chapter 13 closed the DMT question for linear space-time codes algebraically: pick a cyclic division algebra of degree $n_t$, verify the non-vanishing-determinant property, and you have a code that achieves the Zheng-Tse curve $d^*(r) = (n_t - r)(n_r - r)$ for every $(n_t, n_r, r)$. The catch is that this proof rests on the minimum determinant $\delta_{\min} \ge \delta_0$ of the codebook, a number-theoretic invariant of the CDA. It works, but it does not tell us why lattice codes are DMT-optimal; it tells us only that one particular lattice (the image of the CDA's regular representation) is.

Here we take the complementary route. We ask: is there a construction that achieves the DMT using only information-theoretic ingredients, namely a random lattice (Minkowski-Hlawka, Ch. 15), an MMSE-type pre-processor (Erez-Zamir, Ch. 16), and a union bound? The answer, due to El Gamal, Caire, and Damen (2004), is yes: the Lattice Space-Time (LAST) code family. The construction is a direct composition of two prior chapters.

From Ch. 16 (Erez-Zamir) we take the recipe for achieving the AWGN capacity: a fine lattice $\Lambda_c$ inside a coarse shaping lattice $\Lambda_s$, common random dithering, and MMSE scaling to defeat the effective noise. From Ch. 10-13 (MIMO) we take the idea that a MIMO fading channel can be triangularised by a decision-feedback receiver. The LAST construction does both: it sends Erez-Zamir codewords over the MIMO channel, triangularises the channel with the MMSE-GDFE receiver (§2), and, as the main theorem of §3 shows, achieves the Zheng-Tse DMT at every rate.

Intuitively, the MMSE-GDFE is to lattice codes what MMSE-SIC is to Gaussian random codes: it is the receiver that makes the capacity-achieving codebook work on a multi-layer channel. The analogy is precise and will be the central thread of §§2-3.

The construction was immediately extended (§4) to structured LAST codes by K. R. Kumar and G. Caire (2008), who showed that replacing the random lattice by $E_8$, the Leech lattice, or any other known dense lattice preserves DMT-optimality and gives a 3-6 dB coding-gain boost at moderate SNR. This practically-oriented extension is the bridge between information theory and deployment.


Definition:

LAST Codebook

A Lattice Space-Time (LAST) codebook of rate $R$ bits per channel use, block length $T$, and transmit dimension $n_t$ is defined by a pair of nested lattices $\Lambda_c \supseteq \Lambda_s$ in $\mathbb{R}^{2 n_t T}$ (real dimensions; $\mathbb{C}^{n_t T}$ is identified with $\mathbb{R}^{2 n_t T}$ via the standard complex-to-real isomorphism). The codebook is
$$\mathcal{C} \;=\; \Lambda_c \cap \mathcal{V}(\Lambda_s) \;=\; \{\mathbf{x} \in \Lambda_c : \mathbf{x} \in \mathcal{V}(\Lambda_s)\},$$
where $\mathcal{V}(\Lambda_s)$ is the Voronoi region of the coarse (shaping) lattice $\Lambda_s$ and $\Lambda_c$ is the fine (coding) lattice. The rate is $R = T^{-1} \log_2 |\Lambda_c / \Lambda_s| = T^{-1} \log_2 \bigl(V(\Lambda_s)/V(\Lambda_c)\bigr)$ bits per channel use.

Given a message $\mathbf{u} \in \mathbb{Z}^{2 n_t T}$ and a common random dither $\mathbf{d} \sim \mathrm{Unif}(\mathcal{V}(\Lambda_s))$ known at both transmitter and receiver, the transmitted codeword is
$$\mathbf{x} \;=\; [\mathbf{G}\mathbf{u} + \mathbf{d}] \bmod \Lambda_s \;\in\; \mathcal{V}(\Lambda_s),$$
reshaped into a space-time matrix $\mathbf{X} \in \mathbb{C}^{n_t \times T}$. The per-channel-use transmit energy is $\mathbb{E}[\|\mathbf{X}\|_F^2 / T] = \sigma_s^2(\Lambda_s)$, the second moment of the shaping Voronoi region; choosing $\Lambda_s$ so that $\sigma_s^2 = E_s$ (the transmit energy budget) meets the power constraint.
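The encoding rule $\mathbf{x} = [\mathbf{G}\mathbf{u} + \mathbf{d}] \bmod \Lambda_s$ can be sketched numerically. The snippet below is a minimal illustration only: it uses a cubic shaping lattice $\Lambda_s = M\mathbb{Z}^n$ and the integer fine lattice $\Lambda_c = \mathbb{Z}^n$ (so `mod_lambda_s`, `M`, and the message alphabet are illustrative choices, not the dense lattices the theory calls for).

```python
import numpy as np

rng = np.random.default_rng(0)

n_t, T = 2, 4
n = 2 * n_t * T        # real dimension 2*n_t*T = 16
M = 4.0                # coarse lattice Lambda_s = M * Z^n (cubic shaping, illustration only)

def mod_lambda_s(x, M):
    """Reduce x into the Voronoi cell of M*Z^n, i.e. the cube [-M/2, M/2]^n."""
    return x - M * np.round(x / M)

G = np.eye(n)                                   # fine-lattice generator: Lambda_c = Z^n here
u = rng.integers(0, 4, size=n).astype(float)    # integer message vector
d = rng.uniform(-M / 2, M / 2, size=n)          # common dither, uniform on V(Lambda_s)

x = mod_lambda_s(G @ u + d, M)                  # transmitted point x = [Gu + d] mod Lambda_s
assert np.all(np.abs(x) <= M / 2)               # x lies in the shaping Voronoi region
X = x.reshape(2 * n_t, T)                       # reshape to a (real-valued) space-time block
```

With a Voronoi-rounded $\Lambda_s$ in place of the cube, the same two lines (lattice point plus dither, then mod-$\Lambda_s$ reduction) give the actual LAST encoder.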

Three design knobs define a LAST code: (i) the fine lattice $\Lambda_c$, which controls coding gain (a denser $\Lambda_c$ gives a better $d_{\min}$); (ii) the coarse lattice $\Lambda_s$, which controls shaping gain (a rounder $\mathcal{V}(\Lambda_s)$ comes closer to capacity-achieving Gaussian signalling); and (iii) the common random dither $\mathbf{d}$, which makes the transmit signal uniform on $\mathcal{V}(\Lambda_s)$, the lattice-code analog of i.i.d. random Gaussian coding. The dither is what turns a deterministic codebook into one whose error-probability analysis is tractable via averaging, directly analogous to random-coding achievability proofs.


Definition:

LAST Transmission over a MIMO Block-Fading Channel

Let the MIMO channel be
$$\mathbf{Y} \;=\; \mathbf{H}\mathbf{X} + \mathbf{w},$$
where $\mathbf{H} \in \mathbb{C}^{n_r \times n_t}$ is the channel matrix (constant over the block of length $T$), $\mathbf{w} \in \mathbb{C}^{n_r \times T}$ has i.i.d. $\mathcal{CN}(0, 1)$ entries, and the transmit constraint is $\mathrm{tr}(\mathbb{E}[\mathbf{X}\mathbf{X}^H])/T \le n_t E_s$. The SNR is $\mathrm{SNR} = E_s$ (equivalently: per-symbol energy equal to SNR times the noise variance, which is normalised to $1$).

LAST transmission. Vectorise the codeword matrix by stacking columns: $\mathrm{vec}(\mathbf{X}) \in \mathbb{C}^{n_t T}$. Using the Kronecker identity $\mathrm{vec}(\mathbf{H}\mathbf{X}) = (\mathbf{I}_T \otimes \mathbf{H})\,\mathrm{vec}(\mathbf{X})$, the block observation becomes
$$\mathrm{vec}(\mathbf{Y}) \;=\; (\mathbf{I}_T \otimes \mathbf{H})\,\mathrm{vec}(\mathbf{X}) + \mathrm{vec}(\mathbf{w}),$$
i.e., an effective flat $(n_t T) \to (n_r T)$ linear channel $\tilde{\mathbf{H}} = \mathbf{I}_T \otimes \mathbf{H}$ of block length $1$ over $\mathbb{R}^{2 n_t T}$. LAST decoding operates on this vectorised representation, treating the entire block as a single lattice-encoded super-symbol.
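The Kronecker identity is easy to check numerically; a quick numpy sanity check (dimensions chosen arbitrarily for illustration) follows.

```python
import numpy as np

rng = np.random.default_rng(1)
n_t, n_r, T = 2, 3, 4

H = rng.normal(size=(n_r, n_t)) + 1j * rng.normal(size=(n_r, n_t))
X = rng.normal(size=(n_t, T)) + 1j * rng.normal(size=(n_t, T))

# vec(.) stacks columns, hence Fortran (column-major) order
vec = lambda A: A.flatten(order="F")

# vec(H X) = (I_T kron H) vec(X)
lhs = vec(H @ X)
rhs = np.kron(np.eye(T), H) @ vec(X)
assert np.allclose(lhs, rhs)
```

The block-diagonal structure of $\mathbf{I}_T \otimes \mathbf{H}$ is what makes the enlarged channel "flat": the same $\mathbf{H}$ acts on each time slot independently.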

The key modelling move is the vectorisation: after stacking, the MIMO-over-time problem looks like a flat lattice-channel problem of enlarged dimension $n_t T \to n_r T$. The fine lattice $\Lambda_c$ lives in this enlarged space. This is why $T$ appears as a design parameter: a larger $T$ means a larger lattice dimension and more opportunity for shaping and coding gain, but also exponentially heavier sphere-decoding complexity.


Structure of a LAST Codebook (2D Slice)

A two-dimensional slice through the $n_t T$-dimensional LAST codebook. Fine lattice points of $\Lambda_c$ are shown as small dots; the Voronoi region of the coarse lattice $\Lambda_s$ is shaded. The codebook consists of the fine-lattice points inside the shaded region. Increasing $n_t$ or $T$ increases the dimension of $\Lambda_c$ (and therefore the number of shaped codewords inside $\mathcal{V}(\Lambda_s)$). The visualisation projects the high-dimensional lattice pair onto a 2D subspace for illustration; the actual construction lives in $\mathbb{R}^{2 n_t T}$.

CommIT Contribution (2004)

Lattice Space-Time Codes Achieving the Optimal DMT

H. El Gamal, G. Caire, M. O. Damen β€” IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 968–985

Hesham El Gamal (Ohio State), Giuseppe Caire (then at Politecnico di Torino, later USC, now TU Berlin), and Mohamed Oussama Damen (École Nationale Supérieure des Télécommunications de Paris, later Waterloo) introduced the Lattice Space-Time (LAST) code family and gave the first information-theoretic proof that lattice codes are diversity-multiplexing-tradeoff (DMT) optimal on MIMO fading channels. The paper has become one of the foundational references of lattice-based space-time coding.

The construction. Before this paper, MIMO codes were either algebraic (Alamouti, OSTBC, the CDA-NVD codes of Ch. 13) or ad-hoc linear-dispersion codes (Hassibi-Hochwald). El Gamal, Caire, and Damen proposed a new family: take a random fine lattice $\Lambda_c$ (in the sense of Minkowski-Hlawka, Ch. 15), a shaping lattice $\Lambda_s \subset \Lambda_c$, and a common random dither $\mathbf{d}$; transmit shaped codewords $\mathbf{x} = [\mathbf{G}\mathbf{u} + \mathbf{d}] \bmod \Lambda_s$ over the vectorised MIMO channel. The codebook has no algebraic structure, only geometric (lattice) structure.

The receiver: MMSE-GDFE. The key insight of the paper is the receiver. The MMSE-SIC receiver of V-BLAST (Wolniansky et al. 1998) achieves the MIMO sum capacity for Gaussian random codebooks by triangularising the channel via a decision-feedback structure. El Gamal, Caire, and Damen identified the lattice analog: the MMSE-GDFE (Minimum-Mean-Square-Error Generalised Decision-Feedback Equaliser). It is derived from the QR decomposition of the augmented channel matrix $[\mathbf{H}^T, \sqrt{\alpha}\,\mathbf{I}]^T$ with $\alpha = 1/\mathrm{SNR}$, and it converts the MIMO channel into $n_t$ parallel triangular layers, each of which can be lattice-decoded in turn. The analogy MMSE-GDFE : lattice codes :: MMSE-SIC : Gaussian codes is precise and is developed in §2.
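The augmented-QR step above can be sketched in a few lines. This is a minimal sketch under the standard formulation (variable names are mine, not the paper's): the QR decomposition of $[\mathbf{H}^T, \sqrt{\alpha}\,\mathbf{I}]^T$ yields an upper-triangular $\mathbf{R}$ with $\mathbf{R}^H\mathbf{R} = \mathbf{H}^H\mathbf{H} + \alpha\mathbf{I}$, the factorisation that defines the MMSE-GDFE feedforward/feedback split.

```python
import numpy as np

rng = np.random.default_rng(2)
n_t, n_r = 2, 3
snr = 100.0
alpha = 1.0 / snr                       # regularisation alpha = 1/SNR

H = rng.normal(size=(n_r, n_t)) + 1j * rng.normal(size=(n_r, n_t))

# QR of the augmented channel [H; sqrt(alpha) I]
A = np.vstack([H, np.sqrt(alpha) * np.eye(n_t)])
Q, R = np.linalg.qr(A)                  # R is upper triangular, n_t x n_t
Q1 = Q[:n_r, :]                         # rows of Q that act on the received signal

# MMSE-GDFE factorisation: R^H R = H^H H + alpha * I
assert np.allclose(R.conj().T @ R, H.conj().T @ H + alpha * np.eye(n_t))

# Feedforward filter F = Q1^H maps y = Hx + w to F y ~ R x + effective noise:
# an upper-triangular system that can be processed layer by layer.
F = Q1.conj().T
```

The triangularity of $\mathbf{R}$ is what produces the $n_t$ sequential layers: each layer sees only already-decoded layers as interference, exactly as in MMSE-SIC.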

The main theorem. The paper's main result is that the LAST codebook $\mathcal{C} = \Lambda_c \cap \mathcal{V}(\Lambda_s)$, decoded with MMSE-GDFE lattice decoding, achieves the full Zheng-Tse DMT curve $d^*(r) = (n_t - r)(n_r - r)$ for every $(n_t, n_r)$ and every $r \in [0, \min(n_t, n_r)]$. The proof has three ingredients: (i) averaging over random lattices (Minkowski-Hlawka existence); (ii) the Erez-Zamir analysis of the effective AWGN channel after the MMSE-GDFE; (iii) matching the Laplace-method Wishart outage exponent of Zheng-Tse. The theorem is reproduced and proved in full in §3.

Why it redefined the field. Before 2004, DMT optimality was known only via algebraic CDA constructions (Ch. 13). Those constructions require deep number theory (cyclic division algebras, non-norm conditions, Selmer groups) and are inflexible: the codebook dimension is fixed by the algebra. The LAST paper showed that any lattice, whether random, $E_8$, Leech, or Barnes-Wall, wrapped in the MMSE-GDFE receiver achieves the DMT. This decoupled code design from algebra: you can pick the lattice to match your finite-SNR coding-gain target (which is what Kumar and Caire 2008, below, did with $E_8$ and Leech) and still have DMT optimality for free. It also established the decoder as the load-bearing component: it is not the codebook alone but the codebook + MMSE-GDFE combination that achieves the DMT. This mindset of code-receiver co-design has been foundational to every subsequent lattice-STC paper, including IR-LAST (Ch. 14), compute-and-forward (Ch. 18), and integer-forcing linear receivers.


Theorem: LAST Codebook Realises a Nested-Lattice Code over the MIMO Channel

The LAST codebook $\mathcal{C} = \Lambda_c \cap \mathcal{V}(\Lambda_s)$ with common random dither $\mathbf{d}$, transmitted over the vectorised MIMO channel $\mathrm{vec}(\mathbf{Y}) = (\mathbf{I}_T \otimes \mathbf{H})\,\mathrm{vec}(\mathbf{X}) + \mathrm{vec}(\mathbf{w})$, is statistically equivalent to an Erez-Zamir nested-lattice code transmitted over an AWGN channel with channel matrix $\mathbf{I}_T \otimes \mathbf{H}$ and noise $\mathcal{CN}(0, \mathbf{I}_{n_r T})$, in the sense that for every realisation of the dither $\mathbf{d}$, the modulo-lattice operation $[\cdot] \bmod \Lambda_s$ converts the space-time channel into the same equivalent AWGN channel that Erez-Zamir analyse.

The point is that dithering + modulo reduction removes all codeword structure from the transmitted signal: what the channel sees is a uniform distribution on $\mathcal{V}(\Lambda_s)$, which is isotropic in $\mathbb{R}^{2 n_t T}$ when $\Lambda_s$ is Voronoi-rounded. This is the lattice analog of Gaussian-random-code achievability: by randomising the codebook across dither realisations, we get ensemble-averaged error probabilities that match the AWGN setting.
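The uniformity claim can be checked empirically in one dimension, with a cubic shaping lattice $M\mathbb{Z}$ standing in for the Voronoi-rounded $\Lambda_s$ (a simplification for illustration; the constant `c` plays the role of an arbitrary codeword coordinate):

```python
import numpy as np

rng = np.random.default_rng(3)
M = 2.0                      # 1-D cubic shaping lattice M*Z (illustration)
c = 0.7                      # an arbitrary fixed codeword coordinate

mod_M = lambda t: t - M * np.round(t / M)

d = rng.uniform(-M / 2, M / 2, size=200_000)   # dither, uniform on the Voronoi cell
x = mod_M(c + d)                               # dithered, mod-reduced transmit signal

# x is again uniform on [-M/2, M/2), independently of c
assert abs(x.mean()) < 0.01                    # uniform mean is 0
assert abs(x.var() - M**2 / 12) < 0.01         # uniform variance is M^2/12
```

This is exactly why the error analysis can average over the dither: the channel input statistics carry no information about the codeword.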


Example: Computing Rate and Dimension for a $2 \times 2$ LAST Code

Consider a $2 \times 2$ MIMO channel ($n_t = n_r = 2$) with block length $T = 4$. Let $\Lambda_c$ be a random lattice in $\mathbb{R}^{16}$ (after complex-to-real conversion) with fundamental volume $V(\Lambda_c) = 1$, and let $\Lambda_s = 2^k \mathbb{Z}^{16}$ for some integer $k$. Compute: (a) the cardinality $|\mathcal{C}|$ of the LAST codebook in terms of $k$; (b) the per-channel-use rate $R$ in bits/channel use; (c) the number of information bits per block; (d) the multiplexing gain $r$ as a function of the SNR-scaling of $k$.
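A worked sketch of parts (a)-(c), under the stated assumptions ($V(\Lambda_c) = 1$, $\Lambda_s = 2^k\mathbb{Z}^{16}$): $|\mathcal{C}| = V(\Lambda_s)/V(\Lambda_c) = 2^{16k}$, so $R = 16k/T = 4k$ bits per channel use and $16k$ bits per block; for (d), $R \approx r \log_2 \mathrm{SNR}$ requires $k$ to grow as $(r/4)\log_2 \mathrm{SNR}$.

```python
import numpy as np

n_t = n_r = 2
T = 4
dim = 2 * n_t * T            # real lattice dimension: 16

def last_code_params(k):
    """Cardinality, rate, and bits/block for Lambda_s = 2^k Z^16, V(Lambda_c) = 1."""
    V_s = (2.0 ** k) ** dim  # fundamental volume of the shaping lattice: 2^(16k)
    V_c = 1.0                # fundamental volume of the fine lattice
    card = V_s / V_c         # (a) |C| = V(Lambda_s)/V(Lambda_c) = 2^(16k)
    R = np.log2(card) / T    # (b) rate: 16k/4 = 4k bits per channel use
    bits = np.log2(card)     # (c) 16k information bits per block
    return card, R, bits

card, R, bits = last_code_params(k=2)
assert card == 2.0 ** 32 and R == 8.0 and bits == 32.0
# (d) multiplexing gain r: R = r * log2(SNR) requires k ~ (r/4) * log2(SNR)
```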

LAST vs. CDA: Two Routes to the Same DMT Peak

It is worth pausing to place LAST codes next to the CDA-NVD codes of Ch. 13. Both constructions achieve the Zheng-Tse DMT curve, but they do so through radically different mathematical machinery.

  • CDA-NVD codes (Ch. 13). The proof uses the non-vanishing-determinant property of cyclic division algebras, a number-theoretic statement about the algebraic integers of a cyclic Galois extension. The code is explicit (a CDA generator matrix) and deterministic (no random dither needed). Coding gain is an algebraic constant depending on the choice of non-norm element $\gamma$ and the class of the ideal.

  • LAST codes (this chapter). The proof uses Minkowski-Hlawka averaging over random lattices, an information-theoretic statement about the generic dense lattice. The code is non-explicit (an existence proof) and randomised (common random dither). Coding gain is a constant depending on the second moment of the shaping Voronoi region and the density of the random fine lattice.

The two constructions are complementary, not competing. CDA codes are the best explicit codes for moderate $n_t$ (say, $n_t \le 4$), where the number-theoretic machinery is manageable and the resulting sphere decoder is tractable. LAST codes are the information-theoretic baseline that tells us why MIMO lattice codes can achieve the DMT; crucially, LAST also gives us the freedom to substitute any well-known dense lattice ($E_8$, Leech, $BW_{16}$) for the fine lattice, which is the path Kumar and Caire (2008) took in §4. For $n_t \ge 5$ or large $n_t T$ products, structured LAST is competitive with or exceeds CDA in finite-SNR performance.

The underlying lesson is the same one we have seen repeatedly: if you have two proofs of the same theorem, each proof offers a different design freedom. Use both.


Common Mistake: Random LAST Is an Existence Proof, Not a Deployable Code

Mistake:

A reading of the 2004 LAST paper might suggest that any random lattice, once wrapped in MMSE-GDFE, gives a DMT-optimal, practical space-time code. One might then proceed to sample a "random lattice from the Minkowski-Hlawka ensemble" and try to implement it.

Correction:

The Minkowski-Hlawka existence theorem (Ch. 15) is non-constructive: it averages over the space of lattices with a Haar-like measure and says that a lattice achieving the density bound exists, but it does not tell us which lattice does. Sampling from the ensemble requires high-precision arithmetic, is non-reproducible, and does not yield the best finite-SNR coding gain. For practical deployment one must use an explicit dense lattice, which is why the 2008 Kumar-Caire paper (§4) proposed the $E_8$, Leech, and Barnes-Wall lattices. The 2004 existence result is valuable because it tells us what the information-theoretic baseline is; the 2008 constructive result tells us how to deploy a practical code.
