The ARQ-DMT Theorem

Zheng-Tse, One Dimension Up: The Delay Axis

Zheng and Tse (Ch. 12) gave us a two-dimensional tradeoff: diversity dd versus multiplexing rr, on a single block. Their theorem assumes one transmission attempt, and the curve dβˆ—(r)d^{*}(r) captures everything about that one attempt. The operational question of this chapter is: what is the right generalisation when the system retransmits?

El Gamal, Caire, and Damen answered this question in 2006 with a third axis β€” the delay axis LL, defined as the maximum number of ARQ rounds allowed. The resulting object is a three-dimensional tradeoff surface dARQ(r,L)d_\mathrm{ARQ}(r, L), and its shape is remarkably clean: dARQ(r,L)β€…β€Š=β€…β€ŠLβ‹…dβˆ—(r/L)\boxed{\quad d_\mathrm{ARQ}(r, L) \;=\; L \cdot d^{*}(r/L) \quad} where dβˆ—(β‹…)d^{*}(\cdot) is the static Zheng-Tse DMT curve. The formula has two operational readings:

  1. Linear-in-LL gain at fixed rr. For any fixed long-term effective rate rr, each additional allowed round multiplies the diversity exponent by the factor dβˆ—(r/L)/dβˆ—(r/(Lβˆ’1))d^{*}(r/L) / d^{*}(r/(L-1)) β€” and since dβˆ—d^{*} is non-increasing, this factor is β‰₯1\ge 1. At high rr (steep part of dβˆ—d^{*}), each round earns a lot; at low rr (flat part of dβˆ—d^{*} near r=0r = 0), each round earns less.
  2. Rate-stretching at fixed LL. For any fixed LL, the ARQ-DMT "stretches" the static curve horizontally by a factor of LL: dARQ(r,L)d_\mathrm{ARQ}(r, L) at effective multiplexing rr is LL copies of dβˆ—d^{*} evaluated at r/Lr/L. Another way to say it: the ARQ channel has LL times the spatial degrees of freedom of a single-shot channel, in the DMT sense.

The proof uses the same pattern as Zheng-Tse: outage-based converse plus Gaussian random-coding achievability. The Wishart large-deviations machinery of Ch. 12 lifts mechanically to the LL-round setting; the only change is that we track the cumulative mutual information Iβ„“=βˆ‘k≀ℓI(Hk)I_\ell = \sum_{k \le \ell} I(\mathbf{H}_{k}) instead of a single round's mutual information. The proof pattern recurs in ARQ-DMT variants for correlated fading, partial CSIT, and multi-user channels β€” it is a pattern-aware proof in the sense of Ch. 12.

This section states the theorem, gives the full proof, and unpacks its operational implications.

,

Definition:

ARQ Diversity-Multiplexing-Delay Tradeoff dARQ(r,L)d_\mathrm{ARQ}(r, L)

Consider an LL-round ARQ protocol on an ntΓ—nrn_t \times n_r MIMO channel with i.i.d. realisations H1,…,HL\mathbf{H}_{1}, \ldots, \mathbf{H}_{L}. Let R(SNR)R(\text{SNR}) be the per-round target rate and RΛ‰(SNR)\bar R( \text{SNR}) be the long-term effective rate (bits per channel use averaged across successful and unsuccessful rounds weighted by probability). Define

  • Effective multiplexing gain: rβ€…β€Š=β€…β€Šlim⁑SNRβ†’βˆžRΛ‰(SNR)log⁑2SNR.r \;=\; \lim_{\text{SNR}\to\infty} \frac{\bar R(\text{SNR})}{\log_2 \text{SNR}}.
  • Diversity gain: dβ€…β€Š=β€…β€Šβˆ’lim⁑SNRβ†’βˆžlog⁑Pe(SNR)log⁑SNR,d \;=\; -\lim_{\text{SNR}\to\infty} \frac{\log P_e(\text{SNR})}{\log \text{SNR}}, where PeP_e is the probability that the protocol ends in a block error (NACK on the LL-th round).

The ARQ diversity-multiplexing-delay tradeoff curve is dARQ(r,L)β€…β€Š=β€…β€Šsup⁑{dβ€…β€Š:β€…β€ŠβˆƒΒ ARQΒ protocolΒ ofΒ roundΒ budgetΒ LΒ withΒ effectiveΒ multiplexingΒ rΒ andΒ diversityΒ d}.d_\mathrm{ARQ}(r, L) \;=\; \sup\{d \;:\; \exists \text{ ARQ protocol of round budget } L \text{ with effective multiplexing } r \text{ and diversity } d\}. Compared to the static DMT dβˆ—(r)d^{*}(r) of Ch. 12, the ARQ-DMT has an extra parameter LL encoding the protocol's delay budget.

Three definitional subtleties are worth flagging.

(i) The rate rr is the long-term effective rate, not the per-round rate. If the per-round rate is RR and the average number of rounds used is Lˉ\bar L, then the long-term rate is Rˉ=R⋅L/Lˉ\bar R = R \cdot L / \bar L. At high SNR with IR-HARQ, Lˉ→1\bar L \to 1 (the first round almost always succeeds), so Rˉ→LR\bar R \to L R — i.e., the per-round rate is r/Lr / L in the limit.

(ii) The diversity dd is the slope of block error probability after all LL rounds (not per-round). After the LL-th NACK, the protocol has no more budget and the block is declared in error.

(iii) In the L=1L = 1 limit (no ARQ), dARQ(r,1)=dβˆ—(r)d_\mathrm{ARQ}(r, 1) = d^{*}(r) exactly β€” the ARQ-DMT reduces to the static Zheng-Tse tradeoff. This is the boundary condition the general formula must satisfy.

Theorem: ARQ-DMT (El Gamal-Caire-Damen 2006)

On an ntΓ—nrn_t \times n_r i.i.d. Rayleigh block-fading MIMO channel with block length Nβ‰₯nt+nrβˆ’1N \ge n_t + n_r - 1 per round and at most LL ARQ rounds, the optimal ARQ diversity-multiplexing-delay tradeoff is dARQ(r,L)β€…β€Š=β€…β€ŠLβ‹…dβˆ—(r/L),r∈[0, Lβ‹…min⁑(nt,nr)],d_\mathrm{ARQ}(r, L) \;=\; L \cdot d^{*}(r/L), \qquad r \in [0,\, L \cdot \min(n_t, n_r)], where dβˆ—(β‹…)d^{*}(\cdot) is the Zheng-Tse static DMT curve of Chapter 12. The tradeoff is achieved by a random Gaussian codebook of mother- code rate rlog⁑2SNRr \log_2 \text{SNR} operated via incremental redundancy, with joint ML decoding across the accumulated rounds.

Think of the LL-round ARQ protocol as a single code over a "virtual" (Lnt)Γ—(Lnr)(L n_t) \times (L n_r) block-diagonal channel diag(H1,…,HL)\mathrm{diag}(\mathbf{H}_{1}, \ldots, \mathbf{H}_{L}). This virtual channel has Lβ‹…ntnrL \cdot n_t n_r independent fading coefficients β€” hence LL times the diversity budget of a single round.

The catch is that the virtual channel has a very special block-diagonal structure: the β„“\ell-th block sees only codeword fragment Xβ„“\mathbf{X}_\ell, not X1\mathbf{X}_1 or XL\mathbf{X}_L. For incremental redundancy with a mother code of rate rr, this structure is no obstacle: the IR decoder sees a concatenated codeword of total rate rr over Lβ‹…NntL \cdot N n_t channel uses, i.e., effective per-virtual-antenna rate r/Lr / L. Operating at this lowered per-round rate and collecting LL independent fading draws gives the exponent Lβ‹…dβˆ—(r/L)L \cdot d^{*}(r/L).

One way to remember the formula: the ARQ system multiplies the DMT curve of Ch. 12 by LL in both axes. The static DMT goes from (0,ntnr)(0, n_t n_r) to (min⁑(nt,nr),0)(\min(n_t, n_r), 0); the ARQ-DMT goes from (0,Lntnr)(0, L n_t n_r) to (Lmin⁑(nt,nr),0)(L \min(n_t, n_r), 0). Same shape, stretched by LL in both dimensions.

,

The ARQ-DMT Curve dARQ(r,L)d_\mathrm{ARQ}(r, L)

Explore the ARQ-DMT as a family of curves indexed by LL. The L=1L = 1 curve is the static Zheng-Tse DMT from Ch. 12. For Lβ‰₯2L \ge 2, each additional round stretches the curve horizontally by a factor of LL and scales it vertically by LL as well β€” the resulting surface is strictly above the static curve for all r>0r > 0. Use the slider to walk up LL and watch the curve grow.

Parameters
2
2
3

Example: ARQ-DMT Corners for 2Γ—22\times 2 MIMO at L=2,3,4L = 2, 3, 4

For a 2Γ—22 \times 2 i.i.d. Rayleigh channel with per-round block length Nβ‰₯3N \ge 3, list the ARQ-DMT corner points for L=2,3,4L = 2, 3, 4. Use dβˆ—(r)=(2βˆ’r)2d^{*}(r) = (2 - r)^2 at integer corners of the static curve with linear interpolation.

The Lβ†’βˆžL \to \infty Limit: Ergodic Capacity Regime

What happens as Lβ†’βˆžL \to \infty? The ARQ-DMT formula dARQ(r,L)=Lβ‹…dβˆ—(r/L)d_\mathrm{ARQ}(r, L) = L \cdot d^{*}(r/L) has a finite limit at the origin of the static DMT: dβˆ—(r/L)β†’dβˆ—(0)=ntnrd^{*}(r/L) \to d^{*}(0) = n_t n_r as Lβ†’βˆžL \to \infty for any fixed rr. Hence lim⁑Lβ†’βˆždARQ(r,L)β€…β€Š=β€…β€Šβˆž.\lim_{L \to \infty} d_\mathrm{ARQ}(r, L) \;=\; \infty. The diversity grows without bound β€” there is no saturation, because each additional round always adds ntnrn_t n_r to the exponent.

But the practical interpretation of "diversity =∞= \infty" is that the error probability decays faster than any polynomial in SNR β€” i.e., it decays like eβˆ’cβ‹…SNRe^{-c \cdot \text{SNR}} or faster, the regime characteristic of ergodic capacity. When the number of independent fading realisations is unbounded, the law of large numbers kicks in, the per-round mutual information averages to its expectation Elog⁑det⁑(I+SNRntHHH)\mathbb{E}\log\det(\mathbf{I} + \tfrac{\text{SNR}}{n_t}\mathbf{H}\mathbf{H}^{H}), and the outage probability at any rate below the ergodic capacity tends to zero faster than any polynomial.

This recovers the Telatar ergodic-capacity picture of Ch. 10: with enough independent looks, the channel acts as an AWGN channel with gain Elog⁑det⁑(⋯ )\mathbb{E}\log\det(\cdots). The ARQ-DMT formula interpolates smoothly between Zheng-Tse's single-shot DMT (L=1L = 1) and Telatar's ergodic capacity (L=∞L = \infty) β€” a lovely conceptual unification.

,

Common Mistake: The Rate Convention: Per-Round vs Long-Term

Mistake:

Confusing the per-round rate RR with the long-term effective rate RΛ‰\bar R, and ending up with Lβ‹…dβˆ—(r)L \cdot d^{*}(r) instead of Lβ‹…dβˆ—(r/L)L \cdot d^{*}(r/L) or vice versa.

Correction:

Be explicit about the rate convention.

  • Per-round multiplexing rprr_\mathrm{pr}: the per-round rate R=rprlog⁑2SNRR = r_\mathrm{pr} \log_2 \text{SNR} divided by log⁑2SNR\log_2 \text{SNR}. With this convention the ARQ-DMT is Lβ‹…dβˆ—(rpr)L \cdot d^{*}(r_\mathrm{pr}).
  • Long-term effective multiplexing rr: the cumulative rate LR=Lrprlog⁑2SNRL R = L r_\mathrm{pr} \log_2 \text{SNR} divided by Llog⁑2SNRL \log_2 \text{SNR}. That is, r=rprr = r_\mathrm{pr} if we measure "rate per channel use averaged across the whole LL-round block." With this convention the ARQ-DMT is Lβ‹…dβˆ—(r/L)L \cdot d^{*}(r / L)... wait.

Actually the standard convention in El Gamal-Caire-Damen is: rr is the first-round multiplexing gain, i.e., the per-round rate expressed in units of log⁑2SNR\log_2 \text{SNR}. Their formula Lβ‹…dβˆ—(r/L)L \cdot d^{*}(r/L) has rr being the first-round rate times LL (equivalently: the total number of information bits divided by the first-round channel-use count times log⁑2SNR\log_2 \text{SNR}).

The simplest way to avoid confusion is to fix the total information bits as LR0L R_0 where R0R_0 is the "baseline" per-round rate. The outage is {IL<LR0}\{I_L < L R_0\} i.e., \{(\text{sum ofLi.i.d. channel MIs}) < L R_0\}. The exponent of this is Lβ‹…dβˆ—(R0/log⁑2SNR)L \cdot d^{*}( R_0 / \log_2 \text{SNR}) in the high-SNR limit, where R0/log⁑2SNRβ†’r0R_0 / \log_2 \text{SNR} \to r_0 is the long-term effective multiplexing divided by LL (since the long-term effective rate seen by the application is LR0/LN=R0/NL R_0 / L N = R_0 / N per channel use). The ARQ-DMT formula is consistent; the notation is just a minefield. When in doubt, compute both conventions and verify limiting cases: L=1β‡’dβˆ—(r)L = 1 \Rightarrow d^{*}(r); Lβ†’βˆž,rL \to \infty, r fixed β‡’βˆž\Rightarrow \infty.

Quick Check

On a 4Γ—44 \times 4 i.i.d. Rayleigh MIMO channel with L=2L = 2 ARQ rounds and long-term effective multiplexing gain r=2r = 2, the ARQ-DMT diversity dARQ(2,2)d_\mathrm{ARQ}(2, 2) is (use dβˆ—(r)=(4βˆ’r)(4βˆ’r)d^{*}(r) = (4-r)(4-r) at integer corners with piecewise-linear interpolation)

dARQ(2,2)=2β‹…dβˆ—(1)=2β‹…9=18d_\mathrm{ARQ}(2, 2) = 2 \cdot d^{*}(1) = 2 \cdot 9 = 18

dARQ(2,2)=2β‹…dβˆ—(2)=2β‹…4=8d_\mathrm{ARQ}(2, 2) = 2 \cdot d^{*}(2) = 2 \cdot 4 = 8

dARQ(2,2)=dβˆ—(2)=4d_\mathrm{ARQ}(2, 2) = d^{*}(2) = 4

dARQ(2,2)=Lβ‹…dβˆ—(rβ‹…L)=2β‹…dβˆ—(4)=0d_\mathrm{ARQ}(2, 2) = L \cdot d^{*}(r \cdot L) = 2 \cdot d^{*}(4) = 0

Quick Check

As Lβ†’βˆžL \to \infty with the long-term effective rate rr held fixed, the ARQ-DMT diversity dARQ(r,L)d_\mathrm{ARQ}(r, L)

Diverges to ∞\infty β€” the regime is asymptotically ergodic

Saturates at ntnrn_t n_r β€” the single-round diversity maximum

Reduces to dβˆ—(r)d^{*}(r) β€” the static DMT

Saturates at Lβ‹…rL \cdot r β€” the Shannon limit

Historical Note: El Gamal-Caire-Damen 2006: The ARQ Dimension of DMT

2006

The August 2006 IEEE Trans. Inform. Theory paper by Hesham El Gamal, Giuseppe Caire, and Mohamed Oussama Damen, "The MIMO ARQ Channel: Diversity-Multiplexing-Delay Tradeoff" (vol. 52, no. 8, pp. 3601–3621), is the paper that put the delay dimension on the DMT map. At the time, Zheng-Tse was three years old and the community had absorbed the diversity-multiplexing duality, but everyone's intuition about ARQ came from the rate-flat Wozencraft-Jacobs era: ARQ trades throughput for reliability. The El Gamal-Caire-Damen paper reframed the question on a fading MIMO channel: how does ARQ change the fundamental tradeoff curve, not just the operating point?

Their answer β€” dARQ(r,L)=Lβ‹…dβˆ—(r/L)d_\mathrm{ARQ}(r, L) = L \cdot d^{*}(r/L) β€” is remarkably clean, and it inaugurated the study of "diversity- multiplexing-delay" as a three-dimensional tradeoff. The paper also gave the first explicit code construction achieving the ARQ-DMT: incremental-redundancy lattice space-time codes (IR-LAST), built from nested cyclic division algebra codes with random dithering. The code-construction half of the paper is a tour de force that required prior CDA machinery (Elia-Kumar-Pawar-Kumar-Lu-Caire 2006, Ch. 13) and prior LAST-code theory (El Gamal-Caire-Damen 2004, forward ref Ch. 17). We discuss IR-LAST in detail in Β§3.

The ARQ-DMT theorem has since become the information-theoretic foundation of HARQ design in cellular standards. Every 3GPP link-level simulation that reports BLER-vs-SNR curves for HARQ-IR is implicitly validating the ARQ-DMT scaling prediction. The paper's operational reach far exceeds its citation count β€” which is already substantial.

ARQ-DMT (Diversity-Multiplexing-Delay Tradeoff)

The three-dimensional generalisation of the Zheng-Tse DMT to ARQ channels. For an ntΓ—nrn_t \times n_r i.i.d. Rayleigh channel with at most LL ARQ rounds, the optimal diversity-multiplexing-delay tradeoff is dARQ(r,L)=Lβ‹…dβˆ—(r/L)d_\mathrm{ARQ}(r, L) = L \cdot d^{*}(r/L), where dβˆ—d^{*} is the static DMT of Ch. 12 (El Gamal-Caire-Damen 2006). Achieved by incremental-redundancy Gaussian random codes and explicitly by IR-LAST codes.

Related: Diversity-Multiplexing Tradeoff Curve dβˆ—(r)d^*(r), 5G NR HARQ Process, Incremental Redundancy (IR-HARQ), Incremental-Redundancy Lattice Space-Time (IR-LAST) Codes

Long-Term Effective Multiplexing Gain

The limiting rate slope r=lim⁑SNRβ†’βˆžRΛ‰(SNR)/log⁑2SNRr = \lim_{\text{SNR}\to\infty} \bar R( \text{SNR}) / \log_2 \text{SNR}, where RΛ‰\bar R is the time-average rate delivered by the ARQ protocol, accounting for NACKs and retransmissions. Distinguished from the per-round multiplexing gain rprr_\mathrm{pr} by a factor of LL; see ⚠The Rate Convention: Per-Round vs Long-Term.

Related: ARQ Diversity-Multiplexing-Delay Tradeoff dARQ(r,L)d_\mathrm{ARQ}(r, L), Multiplexing Gain rr, Rate Matching