Ferkans — Interactive Telecom Tutor

Refining the DMT: When the Theorem's Hypotheses Break

The Zheng-Tse theorem assumes two things we have glossed over: (i) block length $L \ge n_t + n_r - 1$ (so that the error-matrix product $\boldsymbol{\Delta}\boldsymbol{\Delta}^H$ is full-rank under Gaussian random coding), and (ii) i.i.d. Rayleigh fading (so that $\mathbf{H}\mathbf{H}^{H}$ is an uncorrelated Wishart matrix with the standard joint eigenvalue density). In real systems both assumptions often fail:

Block length. A 5G NR slot is $14$ OFDM symbols with normal cyclic prefix, at sub-6 GHz and mmWave numerologies alike. On a fast-fading channel the coherence time may be as short as a handful of OFDM symbols, so typical STC block lengths are $L = 2$ – $8$ per fading block, often below the $n_t + n_r - 1$ threshold for even moderate antenna counts ( $n_t + n_r - 1 = 7$ for $4 \times 4$ ).
Correlation. Real MIMO channels have substantial spatial correlation — between transmit antennas due to proximity, between receive antennas due to angular spread limits, and cross-correlation through scatterer geometry. A typical 5G Tx correlation coefficient $\rho$ is $0.3$ – $0.7$ for antenna spacing $\le \lambda/2$ .

This section states and proves two refinements:

Block-length truncation. For $L < n_t + n_r - 1$ , the DMT curve is truncated at $r = L$ : beyond that, no code can achieve positive diversity.
Correlation invariance. For spatially correlated Rayleigh fading with full-rank correlation matrix, the DMT exponent is unchanged: $d^*(r)$ remains the Zheng-Tse curve. Correlation affects only the coding gain (multiplicative constant), not the exponent.

The second result is striking: correlation is a coding-gain penalty, not a DMT penalty. A code that is DMT-optimal on i.i.d. Rayleigh remains DMT-optimal on correlated Rayleigh (as long as $\det \mathbf{R}_t > 0$ ), losing only a constant coding-gain factor $\det(\mathbf{R}_t)^{-1/n_t}$ : the error curve shifts in SNR but keeps its slope.


Theorem: DMT Truncation for Short Block Length

Consider an $n_t \times n_r$ i.i.d. Rayleigh MIMO channel with coherent detection and space-time codeword block length $L$ . If $L \ge n_t + n_r - 1$ , the DMT curve is the full Zheng-Tse curve of Thm. Zheng-Tse Diversity-Multiplexing Tradeoff. If $L < n_t + n_r - 1$ , the DMT is truncated beyond the $L$ -th corner: with $r_L = \min(L, n_t, n_r)$ , $d^*(r) \;=\; \begin{cases} (n_t - r)(n_r - r) & r \in [0, r_L] \\ 0 & r > r_L \end{cases}$ (the product form holds at integer corners, with piecewise-linear interpolation between them as before). In particular, for $L < n_t + n_r - 1$ , multiplexing gains $r > L$ are not achievable with positive diversity.

The practical consequence: short block lengths (= short coherence time) impose a multiplexing ceiling $r \le L$ , independent of antenna count. A $4 \times 4$ channel with $L = 2$ has $r_{\max}^{\rm eff} = 2$ — the channel cannot support more than $2$ streams reliably per block, even though the antenna counts support $4$ .
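The truncated curve in the theorem statement can be evaluated directly. The following is a minimal sketch, not part of the original lesson: the function name `dmt_truncated` is hypothetical, and it simply encodes the statement above (integer corners $(k, (n_t - k)(n_r - k))$ , linear interpolation, and zero diversity beyond $r = L$ when $L < n_t + n_r - 1$ ).

```python
import math


def dmt_truncated(n_t, n_r, L, r):
    """Diversity exponent d*(r) as stated in the truncation theorem.

    Corners are (k, (n_t - k)(n_r - k)) for integer k, joined by straight
    lines. If L < n_t + n_r - 1 the curve is cut to zero for r > L.
    """
    if r < 0:
        raise ValueError("multiplexing gain r must be non-negative")
    m = min(n_t, n_r)
    # Largest multiplexing gain that still has a defined corner.
    r_cap = m if L >= n_t + n_r - 1 else min(L, m)
    if r > r_cap:
        return 0.0
    lo = math.floor(r)
    hi = min(lo + 1, m)
    frac = r - lo

    def corner(k):
        return (n_t - k) * (n_r - k)

    return (1.0 - frac) * corner(lo) + frac * corner(hi)


if __name__ == "__main__":
    for L in (1, 2, 7):
        row = [dmt_truncated(4, 4, L, r) for r in (0, 1, 2, 2.5, 3, 4)]
        print(f"L={L}: {row}")
```

For the $4 \times 4$ , $L = 2$ case discussed above, the function returns $4$ at $r = 2$ and $0$ for any $r > 2$ , which is the multiplexing ceiling the theorem describes.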

The Zheng-Tse proof uses Gaussian random codewords of length $L$ . For the error matrix $\boldsymbol{\Delta} = \mathbf{X} - \hat{\mathbf{X}}$ to be full-rank (rank $n_t$ ), we need $L \ge n_t$ , but to fully exploit the $n_r \times n_t$ Wishart eigenvalue pool we actually need $L \ge n_t + n_r - 1$ , which ensures that the joint rate function in the proof's LP is unconstrained. For shorter $L$ the LP acquires an extra constraint $\sum_i (1 - \alpha_i)^+ \le L$ , which caps the achievable multiplexing gain at $L$ .

Hint

Rank of $\boldsymbol{\Delta}\boldsymbol{\Delta}^H$ is $\min(n_t, L)$ .

For $L < n_t$ , the Zheng-Tse LP has an active constraint and the optimum is cut off at $r = L$ .

Zheng-Tse Thm. 3 formalises this via a careful analysis of the rank-deficient codeword ensemble.

Proof

Rank constraint on codewords

A space-time codeword matrix is $\mathbf{X} \in \mathbb{C}^{n_t \times L}$ . If $L < n_t$ , then $\mathrm{rank}(\mathbf{X}) \le L < n_t$ , so the error matrix $\boldsymbol{\Delta}$ can have rank at most $L$ . The rank- $L$ constraint limits the number of independent streams the code can transmit per block to $L$ .

LP with block-length constraint

In the Zheng-Tse eigenvalue-exponent LP of Thm. Zheng-Tse Diversity-Multiplexing Tradeoff, the relevant eigenvalues are those of $\boldsymbol{\Delta}\boldsymbol{\Delta}^H$ , which has rank $\min (n_t, L)$ . For $L < n_t$ , only the $L$ smallest-exponent eigenvalues contribute to the outage event; the extra $n_t - L$ directions cannot carry any rate. The LP rate-function becomes $\inf_{\boldsymbol{\alpha}: \sum_i (1 - \alpha_i)^+ < r,\, \alpha_{L+1} = \cdots = \alpha_{n_t} = 0} \sum_i (2i - 1 + n_r - n_t) \alpha_i,$ and the constraint $\sum_{i=1}^L (1 - \alpha_i)^+ < r$ caps $r$ at $L$ .

Truncation formula

For $r \in [0, L]$ , the LP optimum is the same as in the unconstrained case (Zheng-Tse): $d^*(r) = (n_t - r)(n_r - r)$ at integer corners. For $r > L$ , the LP is infeasible (cannot achieve positive diversity with rate above $L$ per block). The DMT is therefore truncated at $r = L$ .
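To see why the unconstrained LP reproduces $(n_t - r)(n_r - r)$ , it helps to solve it numerically. The sketch below is illustrative only: the function name is hypothetical, and it assumes the standard form of the unconstrained problem, minimising $\sum_i (2i - 1 + |n_r - n_t|)\alpha_i$ over $0 \le \alpha_i \le 1$ subject to $\sum_i (1 - \alpha_i) \le r$ . Because the cost weights increase with $i$ , the optimum fills the cheapest exponents first (a fractional-knapsack argument), so no LP solver is needed.

```python
def zheng_tse_lp_exponent(n_t, n_r, r):
    """Solve the unconstrained outage-exponent LP greedily.

    Minimise sum_i c_i * alpha_i with c_i = 2i - 1 + |n_r - n_t|,
    0 <= alpha_i <= 1, subject to sum_i (1 - alpha_i) <= r.
    The constraint is equivalent to sum_i alpha_i >= m - r, m = min(n_t, n_r).
    """
    m = min(n_t, n_r)
    if r >= m:
        return 0.0
    costs = [2 * i - 1 + abs(n_r - n_t) for i in range(1, m + 1)]
    need = float(m - r)  # total alpha "mass" that must be placed
    total = 0.0
    for c in costs:  # costs are already in ascending order
        take = min(1.0, need)
        total += c * take
        need -= take
        if need <= 0:
            break
    return total


if __name__ == "__main__":
    for r in (0, 1, 2, 2.5, 3, 4):
        print(r, zheng_tse_lp_exponent(4, 4, r))
```

At integer $r$ the greedy solution sets $\alpha_1 = \cdots = \alpha_{m - r} = 1$ and the rest to zero, and the cost sums to $(n_t - r)(n_r - r)$ ; at non-integer $r$ one exponent is fractional, which gives the linear segments between corners.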

Special case $L = 1$ . Single-channel-use codes (no time dimension) can support a multiplexing gain of at most $r = 1$ . The DMT curve is $d^*(r) = (n_t - r)(n_r - r)$ for $r \in [0, 1]$ and $0$ otherwise: a two-corner curve regardless of the number of antennas.

Special case $L \ge n_t + n_r - 1$ . The full Zheng-Tse curve is recovered. This is the "thick codeword" regime — Chapter 13's CDA codes need $L = n_t$ (just enough to support rank- $n_t$ error matrices with the algebraic structure), and the extra "wiggle room" up to $n_t + n_r - 1$ is for Gaussian random codes to achieve the exponent with probability $1 - o(1)$ . $\blacksquare$


DMT Truncation as Block Length $L$ Varies

The DMT curve $d^*(r)$ for a $4 \times 4$ channel as block length $L$ varies from $1$ to $n_t + n_r - 1 = 7$ . At $L = 7$ the full Zheng-Tse curve is recovered; at $L = 1$ the curve is truncated to the segment $r \in [0, 1]$ . Intermediate $L$ values give curves truncated at $r = L$ . Practical coherence-time constraints ( $L = 2$ – $8$ for 5G NR subcarriers, or $L = 1$ for non-coherent STC) can be read directly off this plot.

Parameters: $n_t = 4$ , $n_r = 4$ , block length $L = 4$ .

Theorem: DMT Invariance under Full-Rank Tx Correlation

Consider an $n_t \times n_r$ correlated Rayleigh MIMO channel $\mathbf{H} = \mathbf{H}_{w} \mathbf{R}_t^{1/2}$ , where $\mathbf{H}_{w}$ has i.i.d. $\mathcal{CN}(0, 1)$ entries and $\mathbf{R}_t$ is the Hermitian positive-definite transmit correlation matrix with $\det \mathbf{R}_t > 0$ . Assume $L \ge n_t + n_r - 1$ for full-rank codeword ensembles.

The DMT curve of the correlated channel is identical to the i.i.d. case: $d^*_{\rm corr}(r) \;=\; (n_t - r)(n_r - r) \quad \text{at integer } r \in \{0, 1, \ldots, \min(n_t, n_r)\},$ with piecewise-linear interpolation. Full-rank Tx correlation does not change the DMT exponent; only the coding gain is affected by the factor $\det(\mathbf{R}_t)^{-1/n_t}$ (rate- $r$ outage probability multiplicatively scaled, invisibly to $\doteq$ ).

$\det(\mathbf{R}_t) > 0$ means $\mathbf{R}_t$ is non-degenerate — every transmit direction carries some fading. The eigenvalues of $\mathbf{H}\mathbf{H}^{H} = \mathbf{H}_{w} \mathbf{R}_t \mathbf{H}_{w}^{H}$ are those of $\mathbf{R}_t^{1/2} \mathbf{H}_{w}^{H} \mathbf{H}_{w} \mathbf{R}_t^{1/2}$ , which is a scaled Wishart. Scaling the eigenvalues by the fixed factors (eigenvalues of $\mathbf{R}_t$ ) changes the outage threshold multiplicatively but doesn't change the exponent: the rate-function in the large-deviations computation picks up only a constant shift, which is lost under $\doteq$ .

Hint

Reduce the correlated channel to the i.i.d. channel via the eigendecomposition of $\mathbf{R}_t$ : $\mathbf{H} = \mathbf{H}_{w} \mathbf{R}_t^{1/2}$ , and the eigenvalues of $\mathbf{H}\mathbf{H}^{H}$ are those of $\mathbf{H}_{w} \mathbf{R}_t \mathbf{H}_{w}^{H}$ .

Since $\mathbf{R}_t$ is full-rank, its eigenvalues are bounded above and below by positive constants — which disappear under $\doteq$ .

The LP from the Zheng-Tse proof is invariant under multiplicative scaling of the eigenvalues, hence under full-rank correlation.

Proof

Eigenvalue scaling lemma

Let $\mathbf{R}_t = \mathbf{U}_R \mathbf{D}_R \mathbf{U}_R^H$ be the eigendecomposition with diagonal $\mathbf{D}_R = \mathrm{diag}(\rho_1, \ldots, \rho_{n_t})$ and $\rho_i > 0$ (full-rank). Since $\mathbf{H}_{w} \mathbf{U}_R$ has the same distribution as $\mathbf{H}_{w}$ , we may absorb $\mathbf{U}_R$ into $\mathbf{H}_{w}$ ; then $\mathbf{H}\mathbf{H}^{H} = \mathbf{H}_{w} \mathbf{R}_t \mathbf{H}_{w}^{H}$ has the same nonzero eigenvalues as $\mathbf{D}_R^{1/2} \mathbf{H}_{w}^{H} \mathbf{H}_{w} \mathbf{D}_R^{1/2}$ . Call these eigenvalues $\tilde\lambda_i$ ; they are bounded: $\rho_{\min}(\mathbf{R}_t) \lambda_i(\mathbf{H}_{w}\mathbf{H}_{w}^{H}) \le \tilde\lambda_i \le \rho_{\max}(\mathbf{R}_t) \lambda_i(\mathbf{H}_{w}\mathbf{H}_{w}^{H}).$
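The two-sided eigenvalue bound can be checked numerically on a small case. The sketch below is an illustration under stated assumptions, not the lesson's own code: it fixes $n_t = n_r = 2$ and $\mathbf{R}_t = [[1, \rho], [\rho, 1]]$ (eigenvalues $1 \pm \rho$ ), uses the closed-form eigenvalues of a $2 \times 2$ Hermitian matrix, and all function names are hypothetical.

```python
import math
import random


def herm2_eigs(a, b, d):
    """Ascending eigenvalues of the Hermitian matrix [[a, b], [conj(b), d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, abs(b))
    return mean - radius, mean + radius


def channel_gram_eigs(h, rho):
    """Eigenvalues of H R_t H^H for 2x2 complex H and R_t = [[1, rho], [rho, 1]]."""
    # m = H R_t, computed row by row.
    m = [[row[0] + rho * row[1], rho * row[0] + row[1]] for row in h]

    def entry(i, j):
        # (H R_t H^H)[i][j] = sum_k m[i][k] * conj(h[j][k])
        return sum(m[i][k] * h[j][k].conjugate() for k in range(2))

    return herm2_eigs(entry(0, 0).real, entry(0, 1), entry(1, 1).real)


def draw_iid_rayleigh(rng):
    """2x2 matrix with i.i.d. CN(0, 1) entries."""
    s = math.sqrt(0.5)
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(2)]
            for _ in range(2)]


def bounds_hold(rho, trials=1000, seed=0):
    """Check rho_min * lambda_i <= lambda_tilde_i <= rho_max * lambda_i."""
    rng = random.Random(seed)
    for _ in range(trials):
        h = draw_iid_rayleigh(rng)
        iid = channel_gram_eigs(h, 0.0)   # eigenvalues of H_w H_w^H
        corr = channel_gram_eigs(h, rho)  # eigenvalues of H_w R_t H_w^H
        for lam, lam_t in zip(iid, corr):
            tol = 1e-9 * (1.0 + lam_t)    # slack for floating-point rounding
            if not ((1 - rho) * lam - tol <= lam_t <= (1 + rho) * lam + tol):
                return False
    return True


if __name__ == "__main__":
    print(bounds_hold(0.7))
```

The check compares ordered eigenvalues pairwise (smallest with smallest, largest with largest), which is exactly the form of the bound used in the next step of the proof.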

Exponent invariance

Taking $\tilde\lambda_i = \text{SNR}^{-\tilde\alpha_i}$ , the above gives $\tilde\alpha_i = \alpha_i^{\rm iid} + O(1/\log\text{SNR})$ : the exponents differ by an additive $O(1/\log\text{SNR})$ term, which vanishes in the limit defining $\doteq$ . The outage event $\sum_i (1 - \tilde\alpha_i)^+ < r$ reduces to the same LP constraint as in the i.i.d. case, to leading exponential order.
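The size of the exponent perturbation follows from taking logarithms of the eigenvalue bound: with $\tilde\alpha_i = -\log\tilde\lambda_i / \log\text{SNR}$ , the difference $\tilde\alpha_i - \alpha_i^{\rm iid}$ lies between $-\log\rho_{\max}/\log\text{SNR}$ and $-\log\rho_{\min}/\log\text{SNR}$ . The short sketch below evaluates that interval; the function name is hypothetical and the example values ( $\rho_{\min} = 0.3$ , $\rho_{\max} = 1.7$ , i.e. a $2 \times 2$ matrix with $\rho = 0.7$ ) are chosen only for illustration.

```python
import math


def exponent_shift_bounds(rho_min, rho_max, snr_db):
    """Interval containing alpha_tilde - alpha_iid at a given SNR.

    Follows from rho_min * lambda <= lambda_tilde <= rho_max * lambda
    and alpha = -log(lambda) / log(SNR).
    """
    log_snr = math.log(10.0) * snr_db / 10.0
    return -math.log(rho_max) / log_snr, -math.log(rho_min) / log_snr


if __name__ == "__main__":
    for snr_db in (10, 30, 60, 120):
        lo, hi = exponent_shift_bounds(0.3, 1.7, snr_db)
        print(f"{snr_db:>4} dB: shift in [{lo:+.4f}, {hi:+.4f}]")
```

The interval width is inversely proportional to the SNR in dB, so it halves each time the SNR in dB doubles; this is the $O(1/\log\text{SNR})$ term that disappears in the limit.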

Outage exponent unchanged

The Zheng-Tse LP computation gives the same exponent $d^*(r) = (n_t - r)(n_r - r)$ at integer corners. The coding gain (multiplicative constant in front of $\text{SNR}^{-d^*(r)}$ ) picks up a factor $(\det \mathbf{R}_t)^{-1/n_t}$ , which is $\ge 1$ (worse than i.i.d.): for a unit-diagonal correlation matrix, Hadamard's inequality gives $\det \mathbf{R}_t \le 1$ , with equality iff $\mathbf{R}_t = \mathbf{I}$ . This constant gap is invisible to $\doteq$ . $\blacksquare$


DMT under Tx Spatial Correlation

The DMT curve $d^*(r)$ for a $2 \times 2$ MIMO channel with exponential transmit correlation matrix $\mathbf{R}_t = [[1, \rho], [\rho, 1]]$ , as $\rho$ varies from $0$ (i.i.d.) to $0.95$ (nearly rank-deficient). The curve is invariant for any $\rho \in [0, 1)$ (full-rank regime): full-rank correlation is a coding-gain penalty, not a DMT penalty. The outage probability at a fixed SNR is plotted in a secondary panel to show the coding-gain degradation as a function of $\rho$ — a constant offset that depends on $\det(\mathbf{R}_t) = 1 - \rho^2$ . At $\rho = 1$ the correlation becomes rank-deficient and the DMT does degrade (not shown, since the slider stops at $0.95$ ).

Parameters: $n_t = 2$ , $n_r = 2$ , Tx correlation $\rho = 0.7$ .

⚠️Engineering Note

Coherence Time and DMT Block Length in 5G NR

The "block length" $L$ in the DMT theorem is the number of channel uses over which the fading is constant — i.e., the coherence time measured in channel uses. For 5G NR OFDM systems:

Sub-6 GHz, pedestrian users ( $3$ km/h at $3.5$ GHz carrier): coherence time $\sim 30$ ms, which at 14 OFDM symbols per 1-ms slot is $\sim 400$ OFDM symbols. Across the 273 allocated subcarriers this is enormous — effectively infinite $L$ . DMT fully applies.
Sub-6 GHz, vehicular users ( $100$ km/h at $3.5$ GHz): coherence time $\sim 1$ ms, so $\sim 14$ OFDM symbols. Still $L \ge 7 = n_t + n_r - 1$ for $4 \times 4$ ; DMT applies.
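mmWave (28 GHz), vehicular users ( $100$ km/h at $28$ GHz): coherence time $\sim 100$ μs, which is $\sim 11$ OFDM symbols at the $8.9$ μs symbol duration of the 120 kHz numerology (and only $\sim 1.4$ symbols when measured in $71.4$ μs, 15 kHz symbols). That clears the $4 \times 4$ threshold of $7$ only narrowly and falls short of the $8 \times 8$ threshold of $15$ ; once training overhead is subtracted, the truncation theorem bites.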
Wi-Fi 7 indoor ( $1$ m/s at $5$ GHz): coherence time $\sim 30$ ms, $\sim 1500$ OFDM symbols. DMT fully applies.

Operational implication. For mmWave high-mobility scenarios (V2X, drones), the effective $r_{\max}$ is capped by the coherence-time $L$ , not by the antenna count. This is one driver for non-coherent space-time codes in high-mobility mmWave links — when $L$ is so short that even training overhead would kill the DMT, the code itself must work without channel estimation.
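The conversion from coherence time to block length is simple arithmetic. The sketch below is illustrative: the helper names are hypothetical, the coherence times are the approximate figures quoted in this note, and the symbol durations ( $71.4$ μs at 15 kHz, $8.9$ μs at 120 kHz) are the 5G NR values listed under Practical Constraints.

```python
def block_length_in_symbols(coherence_time_s, symbol_duration_s):
    """Whole OFDM symbols that fit in one coherence interval."""
    return int(coherence_time_s // symbol_duration_s)


def supports_full_dmt(n_t, n_r, block_length):
    """True if L >= n_t + n_r - 1, the hypothesis of the full DMT theorem."""
    return block_length >= n_t + n_r - 1


if __name__ == "__main__":
    scenarios = [
        # (label, coherence time [s], OFDM symbol duration [s])
        ("3.5 GHz pedestrian, 15 kHz SCS", 30e-3, 71.4e-6),
        ("3.5 GHz vehicular, 15 kHz SCS", 1e-3, 71.4e-6),
        ("28 GHz vehicular, 120 kHz SCS", 100e-6, 8.9e-6),
        ("28 GHz vehicular, 15 kHz SCS", 100e-6, 71.4e-6),
    ]
    for label, t_c, t_sym in scenarios:
        L = block_length_in_symbols(t_c, t_sym)
        print(f"{label}: L = {L}, "
              f"4x4 ok: {supports_full_dmt(4, 4, L)}, "
              f"8x8 ok: {supports_full_dmt(8, 8, L)}")
```

The result depends strongly on which subcarrier spacing defines a "channel use", so it is worth stating the numerology whenever a block length is quoted in OFDM symbols.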

Practical Constraints

• 5G NR OFDM symbol duration: $71.4$ μs (15 kHz sub-6 GHz), $8.9$ μs (120 kHz mmWave).
• Coherence time at 100 km/h: $\sim 1$ ms at 3.5 GHz, $\sim 100$ μs at 28 GHz.
• $n_t + n_r - 1$ threshold: $7$ for $4 \times 4$ , $15$ for $8 \times 8$ ; the latter sometimes exceeds the mmWave coherence time.

📋 Ref: 3GPP TS 38.211 (physical channels and modulation), §4.3 (frame structure)

🔧Engineering Note

Measured Tx Correlation in LTE/5G: $\rho \in [0.3, 0.7]$

Measured Tx correlation coefficients in 3GPP urban microcell scenarios (UMi) range from $\rho = 0.3$ – $0.7$ for typical $\lambda/2$ -spaced antenna arrays at the base station. At the UE, smaller form factors and mutual coupling yield $\rho = 0.5$ – $0.9$ on handset devices.

What the DMT theorem says about this. For any $\rho < 1$ (full-rank correlation), the DMT exponent is unchanged from the i.i.d. Rayleigh case. This is reassuring: real channels that are not i.i.d. do not lose their fundamental multiplexing structure. The coding gain, however, does degrade by $(\det \mathbf{R}_t)^{-1/n_t}$ . For $\mathbf{R}_t = \mathrm{Toeplitz}([1, \rho, \rho^2, \ldots])$ the determinant is $(1 - \rho^2)^{n_t - 1}$ , so $\rho = 0.5$ gives $\det^{-1/n_t} \approx 1.15$ on $2 \times 2$ (a $\sim 0.6$ dB loss) and $\approx 1.24$ on $4 \times 4$ ( $\sim 0.9$ dB); at $\rho = 0.7$ the $4 \times 4$ loss grows to $\sim 2.2$ dB.

Design implication. A 5G base station with closely-spaced antennas should expect $1$ – $3$ dB coding-gain loss vs i.i.d. Rayleigh, but the DMT exponent is robust. Spatial separation (e.g., polarisation diversity, distributed antennas) reduces $\rho$ and recovers the coding gain — which is why massive MIMO arrays use non-adjacent element positioning and mixed polarisations.
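The coding-gain factor $(\det \mathbf{R}_t)^{-1/n_t}$ is easy to tabulate for the exponential (Toeplitz) correlation model. The sketch below is illustrative and uses hypothetical helper names; it builds $\mathbf{R}_t$ with entries $\rho^{|i - j|}$ , computes the determinant by Gaussian elimination, and reports the factor in dB. Treating the factor as an SNR-offset in dB via $10\log_{10}(\cdot)$ is the convention assumed here.

```python
import math


def exp_corr(n, rho):
    """n x n exponential correlation matrix with entries rho**|i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return result


def coding_gain_penalty_db(n_t, rho):
    """10*log10 of det(R_t)**(-1/n_t) for the exponential model."""
    factor = det(exp_corr(n_t, rho)) ** (-1.0 / n_t)
    return 10.0 * math.log10(factor)


if __name__ == "__main__":
    for n_t in (2, 4):
        for rho in (0.3, 0.5, 0.7, 0.9):
            print(f"n_t={n_t}, rho={rho}: "
                  f"{coding_gain_penalty_db(n_t, rho):.2f} dB")
```

The penalty grows with both $\rho$ and $n_t$ because $\det \mathbf{R}_t = (1 - \rho^2)^{n_t - 1}$ for this model, so larger arrays pay more for the same adjacent-element correlation.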

Practical Constraints

• 3GPP UMi high-correlation scenario: $\rho = 0.6$ to $0.9$ per 38.901 spatial channel model.
• Coding-gain penalty from full-rank correlation: $\sim 1$ – $3$ dB.
• DMT exponent unchanged for any $\det(\mathbf{R}_t) > 0$ .

📋 Ref: 3GPP TR 38.901 (channel model), §7.5

Common Mistake: Rank-Deficient Correlation Does Break DMT

Mistake:

Concluding from Thm. DMT Invariance under Full-Rank Tx Correlation that "correlation doesn't matter for DMT", in particular for rank-deficient correlation matrices ( $\det \mathbf{R}_t = 0$ ).

Correction:

The theorem requires $\det \mathbf{R}_t > 0$ (full-rank). For rank-deficient correlation, the DMT does degrade. Specifically, if $\mathrm{rank}(\mathbf{R}_t) = k < n_t$ , then only $k$ transmit directions are active and the effective MIMO channel is $k \times n_r$ : the DMT reduces to $d^*_{\rm eff}(r) = (k - r)(n_r - r)$ with $r_{\max} = \min(k, n_r)$ .

When this matters. Rank-deficient correlation arises in:

Line-of-sight (LOS) channels with few scatterers — the channel has only a handful of dominant directions.
Analog-beamforming / hybrid-precoding architectures, where the RF phase shifters constrain the effective transmit subspace to a small rank.
Keyhole channels (classical, rarely encountered) with rank- $1$ spatial correlation.

In all these cases the "effective $n_t$ " is smaller than the physical $n_t$ , and the DMT is governed by the reduced dimensionality. This connects Chapter 12 to Chapter 10's discussion of the keyhole effect and to Chapter 21's treatment of high-mobility correlation.
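The reduced curve for rank-deficient correlation can be evaluated by substituting the rank $k$ for $n_t$ . The sketch below is illustrative and the function name is hypothetical; it assumes the same integer-corner, piecewise-linear form as the full-rank curve, applied to the effective $k \times n_r$ channel described in the correction above.

```python
def dmt_rank_deficient(n_t, n_r, k, r):
    """d*_eff(r) for Tx correlation of rank k: the DMT of a k x n_r channel."""
    if not 0 < k <= n_t:
        raise ValueError("rank k must satisfy 0 < k <= n_t")
    if r < 0:
        raise ValueError("multiplexing gain r must be non-negative")
    r_max = min(k, n_r)
    if r >= r_max:
        return 0.0
    lo = int(r)      # integer corner at or below r
    frac = r - lo

    def corner(j):
        return (k - j) * (n_r - j)

    return (1.0 - frac) * corner(lo) + frac * corner(lo + 1)


if __name__ == "__main__":
    # 4 x 4 array whose Tx correlation has rank 2 versus full rank 4.
    for r in (0, 0.5, 1, 2, 3):
        print(r, dmt_rank_deficient(4, 4, 2, r), dmt_rank_deficient(4, 4, 4, r))
```

For a $4 \times 4$ array with $k = 2$ , the maximum diversity drops from $16$ to $8$ and the maximum multiplexing gain from $4$ to $2$ , which is a change of exponent rather than a constant offset.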

Quick Check

On a $4 \times 4$ i.i.d. Rayleigh channel with block length $L = 2$ , what is $r_{\max}^{\rm eff}$ ?

$r_{\max}^{\rm eff} = 4$ (the full spatial degrees of freedom)

$r_{\max}^{\rm eff} = 2$

$r_{\max}^{\rm eff} = 7$ (= $n_t + n_r - 1$ )

$r_{\max}^{\rm eff} = 0$ (no multiplexing possible)

Correction:

$r_{\max}^{\rm eff} = 2$

Thm. DMT Truncation for Short Block Length: for $L < n_t + n_r - 1$ , the DMT is truncated at $r = L$ . At $L = 2$ , the achievable multiplexing gain is capped at $r = 2$ ; beyond that, no positive diversity is possible.

Quick Check

A $4 \times 4$ channel has Tx correlation matrix $\mathbf{R}_t$ with $\det \mathbf{R}_t = 0.5 > 0$ (full rank). At $r = 2$ multiplexing, the DMT exponent is:

$d^*(2) = (4 - 2)(4 - 2) = 4$

$d^*(2) = 4 \cdot 0.5 = 2$ (diversity scales with $\det \mathbf{R}_t$ )

$d^*(2) = 0$ (any correlation kills the DMT)

Cannot determine without knowing the correlation eigenvalues

Correction:

$d^*(2) = (4 - 2)(4 - 2) = 4$

Thm. DMT Invariance under Full-Rank Tx Correlation: full-rank correlation preserves the DMT exponent. $d^*(2)$ is the same as on i.i.d. Rayleigh: $(n_t - 2)(n_r - 2) = 4$ . The coding gain is degraded by the factor $(\det \mathbf{R}_t)^{-1/n_t} = 0.5^{-1/4} \approx 1.19$ (a $\sim 0.75$ dB loss), but this is invisible to the DMT exponent.

Why This Matters: Next: ARQ-DMT and Cooperative Diversity

Chapter 14 extends the DMT framework to ARQ-based MIMO systems, where the transmitter can retransmit after receiving a NACK. Each retransmission adds both diversity (the new fading realization is statistically independent) and rate flexibility (the effective code rate adapts to the channel). The ARQ-DMT of El Gamal-Caire-Damen 2006 gives a closed-form tradeoff curve that strictly exceeds the Zheng-Tse DMT, at the cost of feedback latency. 5G NR HARQ is a practical implementation of the ARQ-DMT.

Chapter 22 discusses the open problems: non-coherent DMT (when $L$ is too short for training), short-packet DMT (when asymptotic analysis breaks at finite blocklength), and finite-alphabet DMT (when Gaussian codebooks must be replaced with QAM/PSK for implementability).

Historical Note: CommIT Group Contributions to DMT Theory

2004–2006

The CommIT research group — led by Giuseppe Caire across appointments at Eurecom (1998–2010) and USC/TU Berlin (2010–) — made three foundational contributions to post-Zheng-Tse DMT theory:

Lattice codes achieve the DMT. El Gamal, Caire, and Damen (2004 IEEE Trans. IT) proved that LAST codes — lattice space-time codes built from dense lattice packings — achieve the entire Zheng-Tse DMT curve. This was the first explicit (albeit high-complexity) construction of DMT-optimal codes for arbitrary $n_t \times n_r$ channels. The proof uses the MMSE-GDFE receiver as the lattice-domain counterpart of the MMSE-SIC receiver for Gaussian codes. Chapter 17 builds on this.
Explicit DMT-optimal CDA codes. Elia, Kumar, Pawar, Kumar, Lu, and Caire (2006 IEEE Trans. IT) constructed explicit DMT-optimal space-time codes from cyclic division algebras (CDA) over number fields. These are shorter and simpler than the lattice codes of El Gamal-Caire-Damen 2004, with bounded-complexity sphere-decoder reception. The Golden code of Belfiore-Rekaya-Viterbo (2005) is the $2 \times 2$ instance; Elia et al. proved the general construction achieves the full DMT. Chapter 13 covers the CDA construction.
ARQ-DMT. El Gamal, Caire, and Damen (2006 IEEE Trans. IT) extended the DMT framework to incremental-redundancy HARQ systems, computing the tradeoff gains from each retransmission. The ARQ-DMT is the information-theoretic foundation of 5G NR HARQ. Chapter 14 covers this.

Collectively these three papers closed the circle: from Zheng-Tse's existence proof (Gaussian random codes) to explicit constructions (CDA, LAST) to practical ARQ extensions. The CDA / Perfect codes of Chapter 13 are the most mature of these and underlie the 3GPP Release 8 LTE-Advanced "tall" MIMO codes.


Key Takeaway

Diversity and multiplexing trade off along a precise curve. For an $n_t \times n_r$ i.i.d. Rayleigh MIMO channel with block length $L \ge n_t + n_r - 1$ , the Zheng-Tse DMT is the piecewise-linear interpolation of $(k, (n_t - k)(n_r - k))$ for $k = 0, 1, \ldots, \min(n_t, n_r)$ . Full-rank Tx correlation affects coding gain but not the DMT exponent. Short block length $L < n_t + n_r - 1$ truncates the curve at $r = L$ . Alamouti achieves full diversity only at $r = 0$ ; V-BLAST achieves full multiplexing only at $r = r_{\max}$ ; CDA / Golden codes (Chapter 13) achieve the entire curve.
