The mod-$\Lambda$ Scheme: Erez–Zamir Capacity

Why Lattices Should Achieve AWGN Capacity

Shannon's 1948 proof that a Gaussian codebook achieves the AWGN capacity $\tfrac12 \log_2(1 + \text{SNR})$ is famously non-constructive. The codewords are drawn i.i.d. from a Gaussian distribution, so the code has no algebraic structure whatsoever — no linearity, no symmetry, no efficient encoder or decoder. For half a century, the open question was: can a structured codebook (one with algebraic linearity, so that encoder and decoder are as cheap as matrix-vector products) achieve the same capacity?

The point is that lattices are the natural structured codebook for a Gaussian channel. Every lattice operation is linear. Every lattice codeword is an integer combination of basis vectors, and the lattice "universe" $\mathbb{R}^n$ matches the Gaussian noise support. So lattices should work — but the naïve scheme (take a dense lattice, truncate to a power constraint, decode to the nearest lattice point) leaves a gap that nobody could close for 56 years.

Erez and Zamir, in their 2004 IEEE Trans. IT paper, finally closed it. Their scheme has four moving parts:

  1. A good fine lattice $\Lambda_c$ (provides the coding gain);
  2. A good coarse lattice $\Lambda_s$ (provides the shaping gain);
  3. An MMSE scalar $\alpha = \text{SNR}/(1 + \text{SNR})$ at the receiver (the single scalar that Shannon's proof doesn't need but lattices do);
  4. A random dither $\mathbf{d}$, uniform on $\mathcal{V}(\Lambda_s)$ and known at both ends, that converts the input to an unconstrained AWGN input via the crypto-lemma.

The "proof pattern" of inflating a bounded input to look like an unbounded Gaussian via dither + modulo is the trick that recurs throughout lattice coding theory — in multiple-access, broadcast, interference, and compute-and-forward. Master it here, and the rest of Part IV falls into place.


Definition: MMSE Coefficient $\alpha$

For the AWGN channel $\mathbf{y} = \mathbf{x} + \mathbf{w}$ with power-constrained input $\mathbb{E}[\|\mathbf{x}\|^2]/n \le P$ and noise variance $\sigma^2$ per dimension, the MMSE coefficient is
$$\alpha \;=\; \frac{P}{P + \sigma^2} \;=\; \frac{\text{SNR}}{1 + \text{SNR}}.$$
It is the unique scalar that minimises $\mathbb{E}[\|\mathbf{x} - \alpha \mathbf{y}\|^2]$ when $\mathbf{x}$ is zero-mean with second moment $P$ and uncorrelated with the noise.

$\alpha$ is strictly less than $1$: the MMSE estimator shrinks the received signal towards zero because the noise adds energy that the estimator can discount. The parameter $1 - \alpha = \sigma^2 / (P + \sigma^2)$ is the noise share of the total energy. The specific choice $\alpha = \text{SNR}/(1 + \text{SNR})$ is what makes the residual noise after scaling, $\mathbf{z} = \alpha \mathbf{y} - \mathbf{x}$, have the smallest possible variance $\alpha \sigma^2$ — a factor of $\alpha < 1$ smaller than the raw channel noise.
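As a sanity check, here is a small Monte-Carlo sketch (my own illustration, with arbitrary values $P = 4$, $\sigma^2 = 1$) confirming that $\alpha = P/(P+\sigma^2)$ beats nearby scalars — including the zero-forcing choice $\alpha = 1$ — and that the minimum MSE is $\alpha\sigma^2$:

```python
import random

# Monte-Carlo sketch (not from the text; arbitrary P, sigma2): for
# y = x + w with x ~ N(0, P) and w ~ N(0, sigma2), the scalar
# alpha = P/(P + sigma2) should minimise E[(x - a*y)^2] over a,
# with minimum value alpha * sigma2.
random.seed(0)
P, sigma2 = 4.0, 1.0
alpha = P / (P + sigma2)          # = SNR/(1+SNR) with SNR = P/sigma2 = 4

samples = [(random.gauss(0.0, P ** 0.5), random.gauss(0.0, sigma2 ** 0.5))
           for _ in range(200_000)]

def mse(a):
    """Empirical E[(x - a*(x+w))^2]."""
    return sum((x - a * (x + w)) ** 2 for x, w in samples) / len(samples)

assert mse(alpha) < min(mse(1.0), mse(alpha - 0.05), mse(alpha + 0.05))
assert abs(mse(alpha) - alpha * sigma2) < 0.02   # minimum MSE ≈ alpha*sigma2 = 0.8
```

The same check with any other $(P, \sigma^2)$ pair lands on the same conclusion, since the theoretical curve is $(1-a)^2 P + a^2 \sigma^2$, minimised exactly at $a = P/(P+\sigma^2)$.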


Definition: Dither Vector

A dither is a random vector $\mathbf{d} \in \mathbb{R}^n$ uniformly distributed on the Voronoi region $\mathcal{V}(\Lambda_s)$ of the shaping lattice, drawn independently of the message and the channel noise, and known to both transmitter and receiver (e.g., via a shared pseudo-random seed).

The dither is the ingredient that makes the transmitted signal $\mathbf{x}' = [\mathbf{x} - \mathbf{d}] \bmod \Lambda_s$ unconditionally uniform on $\mathcal{V}(\Lambda_s)$, regardless of which codeword $\mathbf{x}$ was sent.

The dither is not transmitted; it is a pre-agreed random seed. In practice the shared randomness is generated by a pseudo-random sequence seeded with a public frame index (as in scrambling codes in 3GPP and DVB). The dither's only purpose is analytic: it lets us apply the crypto-lemma and reduce the bounded-input AWGN problem to an unbounded-input lattice-decoding problem.


Theorem: Crypto-Lemma (Zamir–Feder)

Let $\mathbf{x} \in \mathbb{R}^n$ be an arbitrary (possibly deterministic) vector, and let $\mathbf{d}$ be uniform on $\mathcal{V}(\Lambda)$ and independent of $\mathbf{x}$. Then $[\mathbf{x} + \mathbf{d}] \bmod \Lambda$ is uniformly distributed on $\mathcal{V}(\Lambda)$ and independent of $\mathbf{x}$.

Adding an independent uniform dither on $\mathcal{V}(\Lambda)$ and then reducing mod-$\Lambda$ spreads any input $\mathbf{x}$ to the uniform distribution: a rigid translation of a uniform distribution on a fundamental region is uniform on the translated region, and the modulo reduction maps it back into the canonical Voronoi cell. This is the lattice analogue of the one-time pad.
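The lemma is easy to see numerically. A minimal 1-D sketch (my own toy example, with $\Lambda = 4\mathbb{Z}$, whose Voronoi cell is $[-2, 2)$): whatever fixed $x$ we pick, $[x + d] \bmod \Lambda$ has the mean and variance of $\mathrm{Uniform}[-2, 2)$:

```python
import math
import random

# 1-D illustration of the crypto-lemma (my toy example): Lambda = 4Z,
# Voronoi cell [-2, 2). For ANY fixed x, [x + d] mod Lambda with
# d ~ Uniform[-2, 2) is again Uniform[-2, 2), independent of x.
random.seed(1)
SPACING = 4.0

def mod_lattice(v):
    """Reduce v into the Voronoi cell [-2, 2) of the lattice 4Z."""
    return v - SPACING * math.floor(v / SPACING + 0.5)

def dithered_stats(x, n=200_000):
    vals = [mod_lattice(x + random.uniform(-2.0, 2.0)) for _ in range(n)]
    mean = sum(vals) / n
    var = sum((v - mean) ** 2 for v in vals) / n
    return mean, var

# Uniform[-2, 2) has mean 0 and variance 4^2/12 = 4/3, whatever x is.
for x in (0.0, 1.7, -123.456):
    mean, var = dithered_stats(x)
    assert abs(mean) < 0.02 and abs(var - 4.0 / 3.0) < 0.03
```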


Erez–Zamir Achievability: mod-$\Lambda$ vs Shannon vs Uncoded QAM

Rate curves as a function of SNR (dB) per real dimension: (i) the AWGN capacity $C = \tfrac12 \log_2(1 + \text{SNR})$; (ii) the rate achieved by the Erez–Zamir mod-$\Lambda$ scheme with optimal $\Lambda_c, \Lambda_s$ — which coincides with $C$ at every SNR; (iii) the rate of uncoded $M$-QAM at a fixed target symbol error probability of $10^{-5}$, showing the gap to capacity that classical truncated-cube constellations leave on the table. For small $M$ the gap is $\approx 9$ dB at high SNR; Erez–Zamir closes it entirely. Increasing $M$ asymptotically recovers the capacity but at exponentially growing complexity, whereas the lattice scheme closes the gap with structured codebooks of growing dimension $n$.


Erez–Zamir mod-$\Lambda$ Encoder and Decoder

Complexity: Encoder: $O(n)$ for mod-$\Lambda_s$ (if $\Lambda_s$ has a fast quantiser). Decoder: dominated by step 5 — the closest $\Lambda_c$ point, which is a CLP problem (s05).
Setup. Fine lattice $\Lambda_c$, coarse lattice $\Lambda_s \subset \Lambda_c$, both in $\mathbb{R}^n$; codebook $\mathcal{C} = \Lambda_c \cap \mathcal{V}(\Lambda_s)$.
Power $P =$ second moment of $\mathcal{V}(\Lambda_s)$.
Shared dither $\mathbf{d}$ uniform on $\mathcal{V}(\Lambda_s)$.
Encoder (given message $u$):
1. Map $u \in \{0, \ldots, |\mathcal{C}| - 1\}$ to $\mathbf{c}(u) \in \mathcal{C}$ by a fixed enumeration.
2. Transmit $\mathbf{x} \leftarrow [\mathbf{c}(u) - \mathbf{d}] \bmod \Lambda_s$.
Channel: the receiver observes $\mathbf{y} = \mathbf{x} + \mathbf{w}$, with $\mathbf{w} \sim \mathcal{N}(0, \sigma^2 \mathbf{I})$.
Decoder:
3. Compute the MMSE-scaled receive vector $\mathbf{y}' \leftarrow \alpha \mathbf{y}$, with $\alpha = P / (P + \sigma^2)$.
4. Add the dither back and reduce mod-$\Lambda_s$: $\mathbf{r} \leftarrow [\mathbf{y}' + \mathbf{d}] \bmod \Lambda_s$.
5. Decode to the nearest $\Lambda_c$ point: $\hat{\mathbf{c}} \leftarrow Q_{\Lambda_c}(\mathbf{r})$.
6. Un-enumerate: $\hat{u} \leftarrow \mathbf{c}^{-1}(\hat{\mathbf{c}})$.

Steps 3–4 together reduce the coded AWGN channel to an unconstrained lattice AWGN channel: the receiver sees a point $\mathbf{r}$ and decodes to the closest $\Lambda_c$ point without any power constraint. The Erez–Zamir theorem below shows that this reduction loses nothing — the capacity of the unconstrained lattice channel at the effective SNR equals the AWGN capacity.
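The six steps above can be sketched end to end in one dimension, using the toy nested pair $\Lambda_c = \mathbb{Z}$, $\Lambda_s = 4\mathbb{Z}$ (a scalar stand-in, not the high-dimensional lattices the theorem requires; the noise level is my own choice):

```python
import math
import random

# End-to-end sketch of steps 1-6 in one dimension, with the toy nested pair
# Lambda_c = Z, Lambda_s = 4Z (codebook {-2,-1,0,1} = Z intersected with [-2,2)).
# A scalar stand-in for illustration only.
random.seed(2)
SPACING = 4.0
P = SPACING ** 2 / 12            # second moment of the Voronoi cell [-2, 2)
sigma2 = 0.05                    # channel noise variance per dimension (my choice)
alpha = P / (P + sigma2)         # MMSE coefficient

def mod_s(v):                    # mod-Lambda_s, into [-2, 2)
    return v - SPACING * math.floor(v / SPACING + 0.5)

def encode(u, d):                # steps 1-2: enumerate, subtract dither, reduce
    codeword = float(u) - 2.0    # u in {0,1,2,3} -> codeword in {-2,-1,0,1}
    return codeword, mod_s(codeword - d)

def decode(y, d):                # steps 3-5: MMSE scale, add dither, reduce, round
    r = mod_s(alpha * y + d)
    return mod_s(math.floor(r + 0.5))   # nearest Z point, folded back into the cell

errors = 0
trials = 20_000
for _ in range(trials):
    u = random.randrange(4)
    d = random.uniform(-2.0, 2.0)                 # shared dither on V(Lambda_s)
    c, x = encode(u, d)
    y = x + random.gauss(0.0, sigma2 ** 0.5)      # AWGN channel
    errors += (decode(y, d) != c)
print("symbol error rate:", errors / trials)      # small but nonzero: Z is a weak fine lattice
```

The error rate is far from vanishing because $\mathbb{Z}$ has no coding gain; the theorem's point is that a good high-dimensional $\Lambda_c$ drives it to zero at rates up to capacity.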

Theorem: Erez–Zamir (2004): Lattice Codes Achieve AWGN Capacity

For every $R < C = \tfrac12 \log_2(1 + \text{SNR})$ (bits per real dimension) and every $\varepsilon > 0$, there exist nested lattices $\Lambda_s \subset \Lambda_c \subset \mathbb{R}^n$ (for some $n < \infty$) and a dither $\mathbf{d}$ such that the mod-$\Lambda_s$ scheme (Algorithm above) with MMSE coefficient $\alpha = \text{SNR}/(1 + \text{SNR})$ achieves rate $R$ with average error probability $P_e < \varepsilon$ under a power constraint $P$.

The proof has three moves, each earning a "good" lattice. First, the dither + mod-$\Lambda_s$ + MMSE scaling reduce the bounded-input AWGN channel to an unbounded lattice channel with effective noise $\mathbf{z} = -(1-\alpha)\mathbf{x}' + \alpha \mathbf{w}$ — a mixture of the (uniform) transmit signal and scaled Gaussian noise, whose per-dimension second moment is exactly $\alpha \sigma^2$ (the MMSE error power). Second, by Loeliger's random-lattice averaging, there exists a fine lattice $\Lambda_c$ whose decoding error probability against this effective noise vanishes at all rates up to $\tfrac12 \log_2\!\bigl(P / (G(\Lambda_s) \cdot 2 \pi e \cdot \alpha \sigma^2)\bigr)$. Third, for a Voronoi-shaping lattice $\Lambda_s$ with $G(\Lambda_s) \to 1/(2 \pi e)$ (Rogers; Poltyrev), this rate tends to $\tfrac12 \log_2\bigl(P/(\alpha\sigma^2)\bigr) = \tfrac12 \log_2(1 + \text{SNR})$. The three moves compose to close the full capacity gap.
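Both identities the proof leans on are easy to verify numerically (my own spot check, over a few arbitrary $(P, \sigma^2)$ pairs):

```python
import math

# Spot check (my own, arbitrary (P, sigma2) pairs) of the two identities the
# proof leans on: the effective-noise power (1-a)^2*P + a^2*sigma2 equals
# a*sigma2 at a = P/(P+sigma2), and (1/2)log2(P/(a*sigma2)) equals the
# AWGN capacity (1/2)log2(1 + SNR).
for P, sigma2 in [(1.0, 1.0), (10.0, 0.5), (0.3, 2.0)]:
    snr = P / sigma2
    alpha = P / (P + sigma2)
    eff = (1 - alpha) ** 2 * P + alpha ** 2 * sigma2   # mixture second moment
    assert math.isclose(eff, alpha * sigma2)           # = MMSE error power
    assert math.isclose(0.5 * math.log2(P / eff), 0.5 * math.log2(1 + snr))
```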


Erez–Zamir mod-$\Lambda$ Scheme: Achieving AWGN Capacity with Lattices

Animation of the Erez–Zamir mod-$\Lambda$ encoder and decoder, step by step: the input $u$ maps to a fine-lattice codeword $\mathbf{x}$ inside $\mathcal{V}(\Lambda_s)$; a pseudo-random dither $\mathbf{d}$ is subtracted and the result is reduced mod-$\Lambda_s$; the transmit signal passes through the AWGN channel; the receiver MMSE-scales by $\alpha$, adds back the dither, and takes a mod-$\Lambda_s$ reduction, collapsing the lattice-wrapped Gaussian noise into a single Voronoi cell; finally, a closest-$\Lambda_c$ search recovers $\hat{\mathbf{x}}$ and the message $\hat{u}$. The visualisation highlights the role of the dither (spreading the input to a uniform distribution on $\mathcal{V}(\Lambda_s)$) and the MMSE scaling (concentrating the effective noise into the smallest-variance Gaussian-like wrap-around).
Four stages of the mod-$\Lambda$ scheme: encoder (dither subtract + mod-$\Lambda_s$), AWGN channel, decoder (MMSE $\alpha$ + dither add + mod-$\Lambda_s$), and $\Lambda_c$ closest-point lookup. The scheme achieves $\tfrac12 \log_2(1 + \text{SNR})$ exactly.

Example: mod-$\Lambda$ at Rate $R = 2$ bits per Real Dimension

For a nested lattice code with $\Lambda_c = \mathbb{Z}^n$ and $\Lambda_s = 4 \mathbb{Z}^n$, the rate is $R = \tfrac{1}{n} \log_2 |\Lambda_c / \Lambda_s| = \log_2 4 = 2$ bits per real dimension. By the Erez–Zamir theorem, what is the minimum SNR at which this rate is achievable in the limit of large $n$? Compare to the SNR required by an uncoded 16-QAM (rate $2$) constellation at error probability $10^{-5}$, and to a cubic-shaping ($\Lambda_s =$ scaled cube) lattice code.
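A rough numerical companion to the first two parts of the example (my own calculation; the 16-QAM figure treats 16-QAM as two independent 4-PAM dimensions, uses the standard 4-PAM SER formula, and applies the $10^{-5}$ target per real dimension, so it is indicative rather than exact):

```python
import math

# Capacity threshold for R = 2 bits/dim: solve (1/2)log2(1+SNR) = 2.
R = 2
snr_min = 2 ** (2 * R) - 1                    # = 15
snr_min_db = 10 * math.log10(snr_min)         # about 11.76 dB

def Q(x):                                     # Gaussian tail function
    return 0.5 * math.erfc(x / math.sqrt(2))

# Uncoded 16-QAM = two 4-PAM dimensions; per-dimension SER for 4-PAM is
# approximately 1.5 * Q(sqrt(SNR/5)). Bisect for SER = 1e-5.
lo, hi = 1.0, 1000.0
for _ in range(200):
    mid = (lo + hi) / 2
    if 1.5 * Q(math.sqrt(mid / 5)) > 1e-5:
        lo = mid
    else:
        hi = mid
qam_db = 10 * math.log10(lo)

print(f"capacity threshold {snr_min_db:.2f} dB, uncoded 16-QAM {qam_db:.2f} dB, "
      f"gap {qam_db - snr_min_db:.2f} dB")
```

The gap it reports (roughly the coding-plus-shaping loss of a truncated-cube constellation) is what the Erez–Zamir scheme eliminates.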

⚠️ Engineering Note

Why the MMSE Coefficient Matters: 56 Years of Progress

From Shannon's 1948 paper until Erez–Zamir's 2004 paper, the best known rate for un-scaled lattice decoding (the so-called "inflated lattice" argument of de Buda, 1989, and Urbanke–Rimoldi, 1998) was $\tfrac12 \log_2(\text{SNR})$ — the "+1" inside the logarithm was missing, a shortfall that is negligible at high SNR but fatal at low SNR. The gap was the absence of the MMSE scalar $\alpha$: without it, the effective noise after mod-$\Lambda_s$ has variance $\sigma^2$, not $\alpha \sigma^2$, so the achievable rate is $\tfrac12 \log_2(P/\sigma^2)$ rather than $\tfrac12 \log_2(P/(\alpha\sigma^2)) = \tfrac12 \log_2(1+\text{SNR})$. On top of this, cubic shaping leaves the separate factor $2\pi e/12$ — about $1.53$ dB — as a residual loss.

The Erez–Zamir insight was subtle but decisive: scaling by $\alpha < 1$ before the modulo operation is the receiver's analogue of the MMSE-DFE, and it converts the "un-inflated" lattice channel into a capacity-achieving one. The MMSE factor is a fixed scalar determined by the SNR; in practical receivers it requires only AGC-level calibration.

Practical Constraints
  • Receiver must know the SNR to compute $\alpha = \text{SNR}/(1 + \text{SNR})$

  • Transmitter and receiver must share the dither (pseudo-random seed)

  • Sensitive to SNR mismatch at low SNR: a $1$ dB error in the $\text{SNR}$ estimate can cost $\sim 0.1$ dB
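The last bullet can be spot-checked (my own numbers: true SNR of $10$ dB, receiver estimate off by $+1$ dB). Plugging the mismatched $\alpha$ into the effective-noise expression shows a rate loss of a few thousandths of a bit per dimension at this operating point:

```python
import math

# Mismatch spot check (my own numbers): true SNR = 10 dB; the receiver's SNR
# estimate is 1 dB high when it computes alpha. Rate proxy:
# (1/2) log2(P / effective noise) with effective noise (1-a)^2*P + a^2*sigma2.
def rate(P, sigma2, a):
    return 0.5 * math.log2(P / ((1 - a) ** 2 * P + a ** 2 * sigma2))

P, sigma2 = 10.0, 1.0                       # true SNR = 10 (i.e. 10 dB)
alpha_true = P / (P + sigma2)
snr_est = 10 ** (11 / 10)                   # receiver believes 11 dB
alpha_off = snr_est / (snr_est + 1)

loss_bits = rate(P, sigma2, alpha_true) - rate(P, sigma2, alpha_off)
print(f"rate loss from +1 dB SNR mismatch: {loss_bits:.4f} bits/dim")
```

Repeating the calculation at low SNR (say $0$ dB) shows a proportionally larger loss, which is the sense in which the scheme is more sensitive there.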

📋 Ref: Erez–Zamir (2004); Zamir 2014 ch. 7

Common Mistake: MMSE Scaling $\alpha$ Is Not the Same as Zero-Forcing

Mistake:

Confusing the Erez–Zamir MMSE scalar $\alpha = \text{SNR}/(1 + \text{SNR})$ with a zero-forcing or matched-filter receiver. A zero-forcing receiver would use $\alpha = 1$ (no scaling); a matched-filter receiver would use $\alpha =$ a signal-to-total-power ratio, depending on how "total" is defined. Neither choice achieves the AWGN capacity with lattice codes.

Correction:

The MMSE scalar $\alpha < 1$ is the exact scalar that minimises the per-dimension expected squared error $\mathbb{E}[(x - \alpha y)^2]$ when $x \sim \mathcal{N}(0, P)$ and $y = x + w$ with $w \sim \mathcal{N}(0, \sigma^2)$. It is the orthogonality-principle solution. Using $\alpha = 1$ leaves the effective noise after modulo reduction at variance $\sigma^2$ rather than $\alpha \sigma^2$, so the achievable rate drops from $\tfrac12 \log_2(1 + \text{SNR})$ to $\tfrac12 \log_2(\text{SNR})$ — the "+1" inside the logarithm is lost, a gap that vanishes at high SNR but is fatal at low SNR. The MMSE $\alpha$ is precisely what restores it.

Capacity Gap = Coding Gap + Shaping Gap (Additive in dB)

Once the Erez–Zamir theorem is in hand, it sharpens an earlier intuition of Forney–Ungerboeck into a quantitative statement. For any finite-dimensional lattice code $(\Lambda_c, \Lambda_s)$, write the rate gap to capacity as
$$C - R \;=\; \underbrace{\Delta_c}_{\text{coding-gain shortfall}} + \underbrace{\Delta_s}_{\text{shaping-gain shortfall}}.$$
These two terms are additive (in dB) and independent: $\Delta_c$ depends only on $\Lambda_c$, $\Delta_s$ only on $\Lambda_s$. For the designer this means: pick any good fine lattice you like (for its algebraic niceness and decoder complexity), pick any good coarse lattice you like (for its second-moment closeness to a ball), and the overall gap is the sum of the two. This is the operational payoff of the Erez–Zamir proof — it transforms "achieving capacity" into two decoupled engineering problems, neither of which is easy but both of which are well-understood.
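The ceiling on the shaping term is the $1.53$ dB figure quoted throughout: the ratio of the cube's normalised second moment $G = 1/12$ to the sphere's $n \to \infty$ limit $1/(2\pi e)$. A one-line check (my own):

```python
import math

# The cube's normalised second moment is 1/12; the n -> infinity ball limit
# is 1/(2*pi*e). Their ratio in dB is the shaping-gap ceiling.
G_cube = 1.0 / 12.0
G_ball_limit = 1.0 / (2 * math.pi * math.e)
shaping_gap_db = 10 * math.log10(G_cube / G_ball_limit)   # = 10*log10(pi*e/6)
print(f"{shaping_gap_db:.2f} dB")
```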


Quick Check

In the Erez–Zamir mod-$\Lambda$ scheme with MMSE coefficient $\alpha = \text{SNR}/(1+\text{SNR})$, the effective noise after modulo reduction has per-dimension second moment equal to:

$\sigma^2$ (the raw channel noise variance)

$\alpha \, \sigma^2$

$(1 - \alpha) \, \sigma^2$

$P$ (the transmit power)

Key Takeaway

The Erez–Zamir mod-$\Lambda$ scheme achieves the AWGN capacity $\tfrac12 \log_2(1 + \text{SNR})$ with a structured codebook, by combining a good fine lattice $\Lambda_c$ (via Loeliger's random lattices), a good coarse lattice $\Lambda_s$ (with normalised second moment $G(\Lambda_s) \to 1/(2 \pi e)$), an MMSE scalar $\alpha = \text{SNR}/(1 + \text{SNR})$, and a shared dither $\mathbf{d}$ uniform on $\mathcal{V}(\Lambda_s)$. The crypto-lemma turns the bounded-input channel into an unconstrained lattice channel; the MMSE factor shrinks the effective noise by $\alpha$; and optimising $\Lambda_s$ kills the $\pi e / 6 \approx 1.53$ dB shaping gap. The capacity gap of any finite lattice code decomposes additively (in dB) into a coding gap and a shaping gap — two independent knobs for the designer.