Ferkans — Interactive Telecom Tutor

Writing on Dirty Paper: Costa's Surprise

Imagine you want to send a message across a noisy channel whose output is corrupted by a large interference $\mathbf{s}$ that the transmitter knows before it encodes: $\mathbf{y} \;=\; \mathbf{x} + \mathbf{s} + \mathbf{w}.$ If $\mathbf{s}$ were random and unknown to both ends, it would act as extra noise of per-dimension power $S$, and with signal power $P$ and noise variance $\sigma^2$ the rate would drop from $\tfrac12 \log_2(1 + P/\sigma^2)$ to $\tfrac12 \log_2(1 + P/(S + \sigma^2))$: interference reduces the effective SNR in the obvious way. If $\mathbf{s}$ were known at the receiver, the receiver could simply subtract it and recover the clean channel. A transmitter that knows $\mathbf{s}$ can attempt the same thing by pre-subtraction: $\mathbf{x} := \mathbf{x}_0 - \mathbf{s}$ yields $\mathbf{y} = \mathbf{x}_0 + \mathbf{w}$ , the clean channel, but at a power cost (since $\mathbf{x}$ now has a higher second moment).

The question Costa asked in 1983 was the intermediate case: what if $\mathbf{s}$ is known only at the transmitter, non-causally (the whole sequence in advance), and the transmitter has a power constraint $\mathbb{E}[\|\mathbf{x}\|^2]/n \le P$ (independent of $\mathbf{s}$ )? The naïve "subtract it" trick fails because then $\mathbf{x} = \mathbf{x}_0 - \mathbf{s}$ has power $P + \|\mathbf{s}\|^2/n$ (for $\mathbf{x}_0$ of power $P$ uncorrelated with $\mathbf{s}$ ), blowing the power budget.
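The power blow-up of naïve pre-subtraction is easy to see numerically. The following is a minimal sketch, assuming Gaussian $\mathbf{x}_0$ and $\mathbf{s}$ drawn independently; the values `P = 1` and `S = 10` are illustrative choices, not taken from the text.

```python
import random

def mean_power(vec):
    """Per-dimension second moment ||vec||^2 / n."""
    return sum(v * v for v in vec) / len(vec)

random.seed(0)
n = 20000
P, S = 1.0, 10.0  # assumed power budget and interference power (illustrative)

x0 = [random.gauss(0.0, P ** 0.5) for _ in range(n)]  # intended signal, power P
s = [random.gauss(0.0, S ** 0.5) for _ in range(n)]   # interference, power S
x = [a - b for a, b in zip(x0, s)]                    # naive pre-subtraction

naive_power = mean_power(x)
print(f"budget P = {P}, naive transmit power = {naive_power:.2f} (about P + S)")
```

With these assumed values the transmitted power lands near $P + S$, an order of magnitude above the budget.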

Costa proved — to the surprise of the information-theory community — that the capacity is unchanged from the clean channel: $C_{\text{DPC}} \;=\; \tfrac12 \log_2(1 + \text{SNR}) \quad \text{regardless of } \|\mathbf{s}\|^2.$ Interference can be cancelled for free, in the rate sense, if it is known non-causally at the transmitter. Costa called this "writing on dirty paper": the dirt (interference) does not affect the amount of information that can be written.

Costa's proof used Gaussian random coding with a clever auxiliary variable $U = \mathbf{x} + \alpha \mathbf{s}$ . The lattice implementation, due to Erez, Shamai, and Zamir (2005), replaces Costa's Gaussian random codebook with a nested lattice scheme almost identical to the one of s02. The MMSE scalar $\alpha$ reappears, this time absorbing the interference via a pre-modulo at the transmitter. The proof pattern ("crypto-lemma + MMSE + mod- $\Lambda$ ") is the same pattern from s02, now applied to cancel $\mathbf{s}$ rather than to achieve capacity. One trick, three applications: a recurring motif in Zamir's theory of lattice networks.


Definition: Dirty-Paper Channel

The dirty-paper (DPC) channel is $\mathbf{y} \;=\; \mathbf{x} + \mathbf{s} + \mathbf{w},$ where $\mathbf{x}$ is the transmit signal with power constraint $\mathbb{E}[\|\mathbf{x}\|^2]/n \le P$ , $\mathbf{s} \in \mathbb{R}^n$ is the interference (known non-causally at the transmitter but not at the receiver), and $\mathbf{w} \sim \mathcal{N}(0, \sigma^2 \mathbf{I})$ is the AWGN.

The capacity is $C_{\text{DPC}}(\text{SNR}, \|\mathbf{s}\|^2) \;=\; \tfrac12 \log_2(1 + \text{SNR}),$ where $\text{SNR} = P/\sigma^2$ — independent of $\|\mathbf{s}\|^2$ , for every $\mathbf{s}$ .

The i.i.d.-Gaussian version of this channel ( $\mathbf{s}$ drawn i.i.d. $\mathcal{N}(0, S)$ ) is Costa's original model. A deterministic (adversarial or arbitrary) $\mathbf{s}$ gives the same capacity, as shown by Erez–Shamai–Zamir via the lattice construction below. Both "known Gaussian $\mathbf{s}$ " and "known arbitrary $\mathbf{s}$ " are subsumed by the same rate.


Lattice DPC Encoder and Decoder (Erez–Shamai–Zamir 2005)

Complexity: Identical to mod- $\Lambda$ (s02) plus one mod- $\Lambda_s$ reduction to absorb the interference. Encoder $O(n)$ (given a fast $\Lambda_s$ quantiser); decoder dominated by the $\Lambda_c$ closest-point search.

Setup. Nested lattice pair $\Lambda_s \subset \Lambda_c$ with Voronoi-shaped $\Lambda_s$ of second moment $P$ , MMSE coefficient $\alpha = P/(P + \sigma^2) = \text{SNR}/(1 + \text{SNR})$ , shared dither $\mathbf{d}$ uniform on $\mathcal{V}(\Lambda_s)$ .

Encoder (given message $u$ and known interference $\mathbf{s}$ ):

1. $\mathbf{c}(u) \leftarrow$ fine-lattice codeword of $u$ .

2. $\mathbf{v} \leftarrow [\mathbf{c}(u) - \alpha \mathbf{s} - \mathbf{d}] \bmod \Lambda_s$ .

3. Transmit $\mathbf{x} \leftarrow \mathbf{v}$ .

Channel: $\mathbf{y} = \mathbf{x} + \mathbf{s} + \mathbf{w}$ .

Decoder:

4. $\mathbf{y}' \leftarrow [\alpha \mathbf{y} + \mathbf{d}] \bmod \Lambda_s$ .

5. $\hat{\mathbf{c}} \leftarrow Q_{\Lambda_c}(\mathbf{y}')$ ; $\hat{u} \leftarrow \mathbf{c}^{-1}(\hat{\mathbf{c}})$ .

The transmitter's key trick is step 2: subtract $\alpha \mathbf{s}$ from the intended codeword before the mod- $\Lambda_s$ reduction. The modulo then adds whichever point of $\Lambda_s$ folds $\mathbf{c}(u) - \alpha \mathbf{s} - \mathbf{d}$ back into the Voronoi region, leaving a transmit signal whose second moment depends only on $\mathcal{V}(\Lambda_s)$ , not on $\|\mathbf{s}\|^2$ . The interference has been pre-cancelled in a way that respects the power constraint.
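The encoder and decoder steps can be tried out in one dimension. This is a toy sketch under stated assumptions, not the Erez–Shamai–Zamir construction itself: the coarse lattice is $L\mathbb{Z}$ with $L^2/12 = P$, the fine lattice is $(L/M)\mathbb{Z}$, and the names `mod_lattice`, `encode`, `decode` and all parameter values are hypothetical.

```python
import random

def mod_lattice(t, L):
    """Reduce t modulo the 1-D lattice L*Z into the interval [-L/2, L/2)."""
    return (t + L / 2) % L - L / 2

def encode(u, s, d, alpha, L, M):
    c = u * L / M                              # step 1: fine-lattice codeword
    return mod_lattice(c - alpha * s - d, L)   # steps 2-3: pre-cancel, then fold

def decode(y, d, alpha, L, M):
    y_prime = mod_lattice(alpha * y + d, L)    # step 4: MMSE scale, add dither, fold
    return round(y_prime * M / L) % M          # step 5: nearest fine point -> message

random.seed(1)
P, sigma2, M = 1.0, 0.001, 4       # assumed power, noise variance, messages per symbol
L = (12 * P) ** 0.5                # interval of length L has second moment L^2/12 = P
alpha = P / (P + sigma2)           # MMSE coefficient

errors, peak, trials = 0, 0.0, 2000
for _ in range(trials):
    u = random.randrange(M)
    s = random.uniform(-1000.0, 1000.0)   # interference far stronger than P
    d = random.uniform(-L / 2, L / 2)     # shared dither
    w = random.gauss(0.0, sigma2 ** 0.5)
    x = encode(u, s, d, alpha, L, M)
    peak = max(peak, abs(x))
    errors += (decode(x + s + w, d, alpha, L, M) != u)

print(f"errors: {errors}/{trials}, largest |x|: {peak:.3f}, L/2: {L / 2:.3f}")
```

The transmit sample never leaves the interval of half-width $L/2$ however large the interference is, and the decoder never looks at `s`.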

Theorem: Lattice DPC Achieves the Clean-Channel Capacity

For every $R < \tfrac12 \log_2(1 + \text{SNR})$ and every $\varepsilon > 0$ , there exist nested lattices $(\Lambda_c, \Lambda_s)$ and a dither $\mathbf{d}$ such that the lattice-DPC scheme above achieves rate $R$ on the dirty-paper channel $\mathbf{y} = \mathbf{x} + \mathbf{s} + \mathbf{w}$ with average error probability $P_e < \varepsilon$ , for every interference $\mathbf{s}$ (deterministic or random, i.i.d. Gaussian or arbitrary) and every power budget $P$ for the transmit signal, independently of $\|\mathbf{s}\|^2$ .

The proof is the s02 proof, with one extra line: the mod- $\Lambda_s$ at the transmitter kills the interference. The receiver's MMSE scaling and mod- $\Lambda_s$ are identical. Crucially, the shared quantity between encoder and decoder is the dither $\mathbf{d}$ , not the interference $\mathbf{s}$ ; the receiver does not need to know $\mathbf{s}$ , only to perform the MMSE scaling by $\alpha$ and the mod- $\Lambda_s$ — two operations that do not depend on $\mathbf{s}$ .

Hint

Show that after step 2, $\mathbf{v} = [\mathbf{c}(u) - \alpha \mathbf{s} - \mathbf{d}] \bmod \Lambda_s$ is uniform on $\mathcal{V}(\Lambda_s)$ and independent of $\mathbf{s}$ (by the crypto-lemma).

Compute the receiver's processed signal $[\alpha \mathbf{y} + \mathbf{d}] \bmod \Lambda_s$ and show it equals $[\mathbf{c}(u) - (1-\alpha)\mathbf{x} + \alpha \mathbf{w}] \bmod \Lambda_s$ , which has the same distribution as $[\mathbf{c}(u) + (1-\alpha)(-\mathbf{d}) + \alpha \mathbf{w}] \bmod \Lambda_s$ , i.e., the same effective signal as in s02, with $\mathbf{s}$ absent!

Apply the Erez–Zamir achievability argument to conclude that rate $R < \tfrac12 \log_2(1 + \text{SNR})$ is achievable.

Proof

Step 1: transmit signal is uniform

The transmit signal $\mathbf{v} = [\mathbf{c}(u) - \alpha \mathbf{s} - \mathbf{d}] \bmod \Lambda_s$ . By the crypto-lemma, shifting the dither $\mathbf{d}$ (which is uniform on $\mathcal{V}(\Lambda_s)$ ) by $\mathbf{c}(u) - \alpha \mathbf{s}$ and reducing mod- $\Lambda_s$ yields a uniform distribution on $\mathcal{V}(\Lambda_s)$ , independent of both $\mathbf{c}(u)$ and $\mathbf{s}$ . Its per-dimension second moment is $P$ — the power constraint is satisfied with equality, for every $\mathbf{s}$ .
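The claim that the transmit power does not depend on $\mathbf{s}$ can be checked by Monte Carlo in one dimension. A minimal sketch, assuming the coarse lattice $L\mathbb{Z}$ with $L^2/12 = P$; the codeword value `c`, the coefficient `alpha = 0.9`, and the interference values are arbitrary illustrative choices.

```python
import random

def mod_lattice(t, L):
    """Reduce t modulo the 1-D lattice L*Z into [-L/2, L/2)."""
    return (t + L / 2) % L - L / 2

def transmit_power(s, alpha, L, c=0.3, trials=50000, seed=2):
    """Average v^2 for v = [c - alpha*s - d] mod L*Z with a uniform dither d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        d = rng.uniform(-L / 2, L / 2)
        v = mod_lattice(c - alpha * s - d, L)
        total += v * v
    return total / trials

P = 1.0
L = (12 * P) ** 0.5   # second moment of the interval [-L/2, L/2) is L^2/12 = P
alpha = 0.9           # assumed value; the power claim does not depend on it

powers = {s: transmit_power(s, alpha, L) for s in (0.0, 3.0, 100.0, -1e4)}
for s, p in powers.items():
    print(f"s = {s:>9}: measured transmit power = {p:.3f}")
```

Every measured power sits close to $P$, whatever shift the interference introduces before the modulo.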

Step 2: receiver processing cancels $\mathbf{s}$

The receiver computes $[\alpha \mathbf{y} + \mathbf{d}] \bmod \Lambda_s$ . Substituting term by term does not work directly: writing $\mathbf{x} = \mathbf{c}(u) - \alpha \mathbf{s} - \mathbf{d} + \boldsymbol{\lambda}$ for some $\boldsymbol{\lambda} \in \Lambda_s$ (the mod- $\Lambda_s$ representation of $\mathbf{v}$ ) and multiplying by $\alpha$ leaves a residual $\alpha(1 - \alpha) \mathbf{s}$ and a term $\alpha \boldsymbol{\lambda}$ , where $\alpha \boldsymbol{\lambda} \notin \Lambda_s$ in general. The fix is to split $\alpha \mathbf{x} = \mathbf{x} - (1 - \alpha) \mathbf{x}$ first: $\alpha \mathbf{y} + \mathbf{d} \;=\; \alpha(\mathbf{x} + \mathbf{s} + \mathbf{w}) + \mathbf{d} \;=\; (\mathbf{x} + \alpha \mathbf{s} + \mathbf{d}) - (1 - \alpha) \mathbf{x} + \alpha \mathbf{w}.$ By the mod- $\Lambda_s$ representation, $\mathbf{x} + \alpha \mathbf{s} + \mathbf{d} = \mathbf{c}(u) + \boldsymbol{\lambda}$ , so $\alpha \mathbf{y} + \mathbf{d} \;=\; \mathbf{c}(u) + \boldsymbol{\lambda} - (1 - \alpha) \mathbf{x} + \alpha \mathbf{w}.$ The lattice point $\boldsymbol{\lambda}$ now appears unscaled and is removed by the modulo: $[\alpha \mathbf{y} + \mathbf{d}] \bmod \Lambda_s \;=\; [\mathbf{c}(u) - (1 - \alpha) \mathbf{x} + \alpha \mathbf{w}] \bmod \Lambda_s.$ The interference has cancelled exactly. By Step 1, $\mathbf{x}$ is uniform on $\mathcal{V}(\Lambda_s)$ and independent of $\mathbf{c}(u)$ and $\mathbf{s}$ , so it behaves like a fresh dither and $-(1 - \alpha) \mathbf{x} + \alpha \mathbf{w}$ has the same distribution as $(1 - \alpha)(-\mathbf{d}) + \alpha \mathbf{w}$ . In this distributional sense (see Erez–Shamai–Zamir §III): $[\alpha \mathbf{y} + \mathbf{d}] \bmod \Lambda_s \;=\; [\mathbf{c}(u) + (1 - \alpha)(-\mathbf{d}) + \alpha \mathbf{w}] \bmod \Lambda_s.$

Step 3: reduce to s02 effective channel

The right-hand side is exactly the effective signal of the mod- $\Lambda$ scheme (s02 Step 2 of the Erez–Zamir proof), with the same effective noise $\mathbf{z} = (1-\alpha)(-\mathbf{d}) + \alpha \mathbf{w}$ of per-dimension variance $\alpha \sigma^2$ . The interference $\mathbf{s}$ does not appear. So any rate $R < \tfrac12 \log_2(1 + \text{SNR})$ achievable on the clean channel is also achievable on the dirty-paper channel, by the same random-lattice $\Lambda_c$ argument as s02.
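The variance $\alpha \sigma^2$ quoted above follows from a short computation, written out here for completeness. It assumes the dither has per-dimension second moment $P$ and is independent of the noise, and it assumes that s02 expresses the achievable rate as $\tfrac12 \log_2$ of the ratio of $P$ to the effective noise variance.

```latex
\begin{aligned}
\tfrac1n\,\mathbb{E}\|\mathbf{z}\|^2
  &= (1-\alpha)^2 P + \alpha^2 \sigma^2
  && \text{(dither and noise independent)} \\
  &= \frac{\sigma^4 P + P^2 \sigma^2}{(P+\sigma^2)^2}
  && \text{(insert } \alpha = P/(P+\sigma^2)\text{)} \\
  &= \frac{P\,\sigma^2}{P+\sigma^2} \;=\; \alpha\,\sigma^2, \\[4pt]
\tfrac12 \log_2 \frac{P}{\alpha\,\sigma^2}
  &= \tfrac12 \log_2 \frac{P+\sigma^2}{\sigma^2}
   \;=\; \tfrac12 \log_2(1+\text{SNR}).
\end{aligned}
```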

Step 4: conclusion

The capacity of the dirty-paper channel is at least the clean-channel capacity. Conversely, revealing $\mathbf{s}$ to the receiver as well cannot reduce capacity, and a receiver that knows $\mathbf{s}$ simply subtracts it to obtain the clean channel, so the capacity cannot exceed the clean-channel capacity either. Hence equality: $C_{\text{DPC}} = \tfrac12 \log_2(1 + \text{SNR}),$ achievable by the lattice-DPC scheme. $\blacksquare$


DPC Capacity vs Naive vs Clean: The Costa Surprise

Three rate curves as a function of per-dimension interference power $S = \|\mathbf{s}\|^2/n$ (dB), with signal power $P$ and noise variance $\sigma^2$ fixed: (i) the clean-channel capacity $\tfrac12 \log_2(1 + P/\sigma^2)$ , independent of $S$ ; (ii) the DPC capacity, which by Costa's theorem EQUALS the clean capacity regardless of $S$ ; (iii) the naïve rate, which treats $\mathbf{s}$ as additional noise and drops to $\tfrac12 \log_2(1 + P/(S + \sigma^2)) \to 0$ as $S \to \infty$ . The gap between curves (ii) and (iii) is the benefit of knowing the interference at the transmitter: in the high-interference regime it approaches the entire clean-channel capacity.
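The three curves can be tabulated directly from the formulas in the caption. A minimal sketch; the function names are hypothetical, and $P = 1$, $\sigma^2 = 0.1$ and the sweep range are illustrative choices.

```python
import math

def clean_capacity(P, sigma2):
    """Clean AWGN capacity in bits per real dimension."""
    return 0.5 * math.log2(1 + P / sigma2)

def dpc_capacity(P, sigma2, S):
    """Costa: same as the clean capacity; S is accepted but deliberately unused."""
    return clean_capacity(P, sigma2)

def naive_rate(P, sigma2, S):
    """Rate when the interference is simply treated as extra noise."""
    return 0.5 * math.log2(1 + P / (S + sigma2))

def db_to_linear(value_db):
    return 10 ** (value_db / 10)

P, sigma2 = 1.0, 0.1   # assumed signal power and noise variance
rows = []
for S_db in range(-20, 41, 10):
    S = db_to_linear(S_db)
    rows.append((S_db, clean_capacity(P, sigma2),
                 dpc_capacity(P, sigma2, S), naive_rate(P, sigma2, S)))

print(" S[dB]   clean     DPC   naive")
for S_db, clean, dpc, naive in rows:
    print(f"{S_db:6d}  {clean:6.3f}  {dpc:6.3f}  {naive:6.3f}")
```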

Plot parameter: interference power [dB] (default 10)

Example: DPC with $\mathbf{s}$ at $\pm 10$ dB

Consider the dirty-paper channel $\mathbf{y} = \mathbf{x} + \mathbf{s} + \mathbf{w}$ with signal power $P = 1$ , noise variance $\sigma^2 = 0.1$ (so $\text{SNR} = 10$ dB). Compare the rates achievable under the three strategies — clean channel (no interference), DPC (interference $\mathbf{s}$ known to Tx), and naïve (interference treated as extra noise) — for $\|\mathbf{s}\|^2/n = S = 10$ dB (moderate interference) and $S = -10$ dB (weak interference).

Solution

Case $S = 10$ dB ($S = 10$)

Clean: $R = \tfrac12 \log_2(1 + 10) = \tfrac12 \log_2 11 \approx 1.73$ bits/real dim. DPC: $R_{\text{DPC}} = R_{\text{clean}} \approx 1.73$ bits/real dim, exactly the same despite the interference! Naïve: $R_{\text{naïve}} = \tfrac12 \log_2(1 + P/(S + \sigma^2)) = \tfrac12 \log_2(1 + 1/(10 + 0.1)) \approx 0.07$ bits/real dim.

The interference costs about $1.66$ bits/real dim (i.e., $\approx 96\%$ ) under the naïve strategy; DPC loses nothing.

Case $S = -10$ dB ($S = 0.1$)

Clean: $R = 1.73$ bits/real dim. DPC: $R_{\text{DPC}} = 1.73$ bits/real dim — same again. Naïve: $R_{\text{naïve}} = \tfrac12 \log_2(1 + 1/(0.1 + 0.1)) = \tfrac12 \log_2 6 \approx 1.29$ bits/real dim.

At low interference, the naïve strategy is within $0.44$ bits/real dim of the DPC rate. The two strategies agree in the limit $S \to 0$ ; as $S \to \infty$ , the naïve rate falls to zero while the DPC rate stays at $1.73$ bits/real dim.

Operational interpretation

The key takeaway: DPC's benefit over the naïve strategy is exactly $\tfrac12 \log_2(1 + P/\sigma^2) - \tfrac12 \log_2(1 + P/(S + \sigma^2)) = \tfrac12 \log_2\!\big(1 + \tfrac{P S}{\sigma^2 (P + S + \sigma^2)}\big)$ , the rate lost to interference that the receiver alone cannot remove. At high $S$ , the benefit approaches the entire clean-channel capacity, which explains why multi-user MIMO schemes (which are interference-limited at high $\text{SNR}$ ) rely critically on DPC-like precoding. $\blacksquare$
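The size of the DPC advantage can be computed as the difference of the clean and naïve rate expressions. The sketch below does that and compares it with a single-logarithm form obtained by combining the two terms; that simplification is my own algebra and is checked numerically here rather than taken from a reference. Function names are hypothetical.

```python
import math

def rate_gap(P, sigma2, S):
    """Clean (= DPC) rate minus naive rate, in bits per real dimension."""
    clean = 0.5 * math.log2(1 + P / sigma2)
    naive = 0.5 * math.log2(1 + P / (S + sigma2))
    return clean - naive

def rate_gap_single_log(P, sigma2, S):
    """Same gap written as one logarithm (algebraic simplification, assumed)."""
    return 0.5 * math.log2(1 + P * S / (sigma2 * (P + S + sigma2)))

P, sigma2 = 1.0, 0.1
clean = 0.5 * math.log2(1 + P / sigma2)
gaps = {S: rate_gap(P, sigma2, S) for S in (0.1, 1.0, 10.0, 1e3, 1e9)}
for S, g in gaps.items():
    print(f"S = {S:>12}: gap = {g:.4f} bits "
          f"(single-log form {rate_gap_single_log(P, sigma2, S):.4f}, clean {clean:.4f})")
```

The gap increases with $S$ but saturates at the clean-channel capacity, since the naïve rate cannot drop below zero.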

🚨Critical Engineering Note

Tomlinson–Harashima Precoding: The Scalar Cousin of DPC

The scalar version of lattice DPC is Tomlinson–Harashima precoding (THP), developed independently by Tomlinson (UK, 1971) and Harashima–Miyakawa (Japan, 1972) for channels with known ISI. The THP encoder, for a channel with impulse response $h[k]$ and known past symbols, computes $\tilde{x}_k \;=\; [c_k - \textstyle\sum_{\ell > 0} h[\ell] \tilde{x}_{k - \ell}] \bmod 2M,$ where $c_k$ is the intended PAM/QAM symbol, $2M$ is the constellation extent, and the mod is a per-sample mod over the $1$ D lattice $2 M \mathbb{Z}$ . THP achieves (approximately) the same-rate cancellation of ISI as the Erez–Shamai–Zamir lattice DPC, at the cost of the $\approx 1.53$ dB "precoding shaping loss" that is recovered only in the true lattice scheme with a non-cubic $\Lambda_s$ .
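The THP recursion is short enough to run by hand. The following is a noise-free sketch under stated assumptions: a monic channel ( $h[0] = 1$ ), 4-PAM symbols in $\{\pm 1, \pm 3\}$ so that $M = 4$, and made-up post-cursor taps; `thp_encode`, `channel` and `mod_2m` are hypothetical helper names.

```python
def mod_2m(t, M):
    """Reduce t into [-M, M), i.e. modulo the 1-D lattice 2M*Z."""
    return (t + M) % (2 * M) - M

def post_cursor_isi(x, k, h_tail):
    """Sum over l >= 1 of h[l] * x[k-l], with h_tail[l-1] = h[l]."""
    return sum(h_tail[l - 1] * x[k - l]
               for l in range(1, len(h_tail) + 1) if k - l >= 0)

def thp_encode(symbols, h_tail, M):
    x = []
    for k, c in enumerate(symbols):
        # Pre-subtract the ISI from already-sent samples, then fold into [-M, M).
        x.append(mod_2m(c - post_cursor_isi(x, k, h_tail), M))
    return x

def channel(x, h_tail):
    """Noise-free ISI channel with h[0] = 1."""
    return [x[k] + post_cursor_isi(x, k, h_tail) for k in range(len(x))]

M = 4                                    # 4-PAM symbols, constellation extent 2M = 8
symbols = [-3, 1, 3, -1, 3, 3, -3, 1]    # illustrative data
h_tail = [0.8, -0.5]                     # assumed post-cursor taps h[1], h[2]

x = thp_encode(symbols, h_tail, M)
y = channel(x, h_tail)
recovered = [mod_2m(v, M) for v in y]    # receiver: one modulo per sample

print("transmitted:", [round(v, 3) for v in x])
print("recovered:  ", [round(v, 3) for v in recovered])
```

Each channel output equals the intended symbol plus a multiple of $2M$, so the receiver's modulo recovers it, while every transmitted sample stays inside $[-M, M)$.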

THP and closely related precoders have been fielded in a range of wireline modem and DSL standards (V.34, V.90, V.92, ADSL, VDSL, G.fast), in $100$ -Gigabit-Ethernet backhaul transceivers, and in MU-MIMO base-station vector precoding (see below). The scalar THP is a 1D special case of the lattice DPC of this section; the THP-lattice connection was made explicit only after Erez–Shamai–Zamir reframed it in 2005.

Practical Constraints

• THP requires perfect knowledge of the ISI channel at the transmitter, in practice acquired via channel training or feedback
• THP shaping loss is $\approx 1.53$ dB (cubic shaping); can be recovered via non-cubic trellis shaping (V.34)
• For MU-MIMO, vector THP must coordinate precoding across all users' data streams, at higher complexity than per-user MMSE

📋 Ref: ITU V.34; ANSI T1.413 (ADSL); G.993 (VDSL); G.9701 (G.fast)


⚠️Engineering Note

MU-MIMO Vector Perturbation: Lattice DPC in the Base Station

A multi-user MIMO base station transmitting to $K$ users has a problem structurally identical to DPC: when serving user $k$ , the signals destined for users $1, \ldots, k-1$ are known interference at the transmitter (the base station knows all users' data) but unknown at user $k$ 's receiver. Dirty-paper precoding, applied sequentially across users, achieves the full sum-capacity of the MU-MIMO broadcast channel (Weingarten–Steinberg–Shamai 2006).

The practical implementation is vector perturbation precoding (Hochwald–Peel–Swindlehurst 2005): for each user, add a lattice perturbation from $\Lambda_s = \tau \mathbb{Z}^{2 K}$ to minimise the transmit power under the interference-cancellation constraint. The per-symbol search for the optimal perturbation is a CLP problem (s05), solved by a sphere decoder. 5G NR's MU-MIMO precoding in FR1 currently uses linear precoding (much cheaper but suboptimal by up to $3$ – $5$ dB at high sum-rates); the jump to vector perturbation or THP for MU-MIMO is an active area of both research and standards discussion (3GPP Rel-18+).
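The perturbation search can be illustrated on a toy problem. The sketch below is not a sphere decoder: it swaps in a brute-force search over a small box of integer perturbations, for a real-valued two-user channel with channel-inversion precoding. The matrix `H`, the data `u`, the spacing `tau = 4` and the search radius are all made-up illustrative values.

```python
import itertools

def solve_2x2(H, b):
    """Return x with H x = b for a 2x2 real matrix H (Cramer's rule)."""
    (h11, h12), (h21, h22) = H
    det = h11 * h22 - h12 * h21
    return [(h22 * b[0] - h12 * b[1]) / det,
            (h11 * b[1] - h21 * b[0]) / det]

def matvec(H, x):
    return [H[0][0] * x[0] + H[0][1] * x[1],
            H[1][0] * x[0] + H[1][1] * x[1]]

def mod_tau(t, tau):
    """Reduce t into [-tau/2, tau/2), i.e. modulo the lattice tau*Z."""
    return (t + tau / 2) % tau - tau / 2

def perturb(H, u, tau, radius=2):
    """Brute-force search for the integer offset l minimising ||H^{-1}(u + tau*l)||^2."""
    best = None
    for l in itertools.product(range(-radius, radius + 1), repeat=len(u)):
        x = solve_2x2(H, [ui + tau * li for ui, li in zip(u, l)])
        power = sum(v * v for v in x)
        if best is None or power < best[0]:
            best = (power, x, l)
    return best

H = [[1.0, 0.5], [2.0, 1.1]]   # assumed, poorly conditioned channel
u = [1.0, 1.0]                 # one data symbol per user
tau = 4.0                      # assumed perturbation spacing

power, x, l = perturb(H, u, tau)
plain_power = sum(v * v for v in solve_2x2(H, u))    # channel inversion, no perturbation
received = [mod_tau(y, tau) for y in matvec(H, x)]   # each user applies its own modulo

print(f"power without perturbation: {plain_power:.1f}")
print(f"power with perturbation {l}: {power:.1f}")
print("received after modulo:", [round(r, 6) for r in received])
```

In this toy instance the perturbation lowers the transmit power while each user still recovers its symbol with a single modulo, which is the same mechanism as step 2 of the lattice-DPC encoder.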

Practical Constraints

• Sphere decoder at the base station scales poorly beyond roughly $K = 16$ simultaneous users
• Channel feedback overhead grows linearly in $K$ , with quantisation error directly contributing to the shaping loss
• Lower-complexity approximations (Zero-Forcing + THP hybrid, lattice-reduction aided) are the standards-ready alternatives

📋 Ref: 3GPP TR 38.821 (NR MU-MIMO study); Hochwald–Peel–Swindlehurst 2005

Historical Note: Costa (1983): 'Writing on Dirty Paper'


Max H. M. Costa's 1983 paper is one of the shortest famous results in information theory: three pages in IEEE Trans. IT, with a single theorem and a proof that fits on one page. Costa was then a Bell Labs post-doc, working on an internal open problem posed by the founder of network information theory, Te Sun Han.

Costa's proof used a Gel'fand–Pinsker auxiliary variable $U = \mathbf{x} + \alpha \mathbf{s}$ and showed that the mutual information $I(U; \mathbf{y}) - I(U; \mathbf{s})$ is maximised over the choice of $\alpha$ at exactly the clean-channel capacity. The MMSE coefficient $\alpha = P/(P + \sigma^2)$ that appeared in Costa's optimisation is the same coefficient that reappears in Erez–Zamir (2004) and Erez–Shamai–Zamir (2005) for the lattice realisation, a connection recognised only 22 years later.
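Costa's optimisation over $\alpha$ can be reproduced numerically. The sketch below uses the closed form for $I(U; \mathbf{y}) - I(U; \mathbf{s})$ with Gaussian $\mathbf{x}$ and $\mathbf{s}$ as I recall it from Costa's paper; treat that expression as an assumption to be checked against the original. Here `Q` denotes the interference power and `N` the noise variance, and the numeric values are illustrative.

```python
import math

def costa_rate(alpha, P, Q, N):
    """I(U;Y) - I(U;S) in bits for U = X + alpha*S (assumed closed form)."""
    numerator = P * (P + Q + N)
    denominator = P * Q * (1 - alpha) ** 2 + N * (P + alpha ** 2 * Q)
    return 0.5 * math.log2(numerator / denominator)

P, Q, N = 1.0, 10.0, 0.1                 # assumed signal, interference, noise powers
grid = [i / 1000 for i in range(1001)]   # candidate values of alpha in [0, 1]
best_alpha = max(grid, key=lambda a: costa_rate(a, P, Q, N))

mmse_alpha = P / (P + N)
clean = 0.5 * math.log2(1 + P / N)
print(f"grid maximiser alpha = {best_alpha:.3f}, MMSE alpha = {mmse_alpha:.3f}")
print(f"rate at MMSE alpha = {costa_rate(mmse_alpha, P, Q, N):.4f}, clean capacity = {clean:.4f}")
print(f"rate at alpha = 0 (interference as noise) = {costa_rate(0.0, P, Q, N):.4f}")
```

Setting $\alpha = 0$ reduces the expression to the naïve rate, while the MMSE value recovers the clean-channel capacity.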

Costa originally titled the paper "On the capacity of the AWGN channel with side information" — the editors insisted on a more evocative title. "Writing on Dirty Paper" became the name of the result and, by extension, the entire sub-field. The paper was cited fewer than $50$ times in its first decade; in the wireless era it has become one of the most-cited information-theory results of all time (7000+ citations as of 2025).

Common Mistake: DPC Requires NON-CAUSAL Interference Knowledge

Mistake:

Assuming that causal knowledge of the interference ("I know today's $\mathbf{s}$ before I transmit today's $\mathbf{x}$ ") is enough for the DPC capacity. It is not: Costa's theorem requires non-causal knowledge — the transmitter must know the entire interference sequence $\mathbf{s}$ (across all future time) before encoding the block.

Correction:

For causal side information (" $\mathbf{s}_k$ known at encoder before $\mathbf{x}_k$ ", but no future samples), the capacity is lower than Costa's: it is the "DPC with causal side information" rate, which is strictly below $\tfrac12 \log_2(1+\text{SNR})$ when $\mathbf{s}$ is Gaussian and is characterised by Shannon's strategy-based formula for causal state information rather than by the Gel'fand–Pinsker formula. In practical wireless systems, the "non-causal" assumption holds per coded block: the base station knows the complete interference vector for a block before encoding the whole block, and the DPC rate applies to that block. For streaming (symbol-by-symbol) applications, only causal DPC is available and the rate is lower.

Why This Matters: From DPC to 5G MU-MIMO

The connection between lattice DPC and modern MU-MIMO precoding runs through the broadcast-channel capacity theorem: a base station with $M$ antennas serving $K$ users achieves sum-capacity via DPC over the sequential encoding order of the users. For each user, the signals of earlier-encoded users are known interference — the DPC channel of this section. The sum-capacity was proved achievable by sequential DPC by Weingarten–Steinberg–Shamai (2006), and the practical implementation uses lattice-DPC-like vector-perturbation precoders (Hochwald–Peel–Swindlehurst 2005).

Forward-link to Ch. 17 (LAST codes, MIMO book): the LAST codes of El Gamal, Caire, and Damen (2004) use a matrix-valued MMSE-GDFE that generalises the scalar $\alpha$ of DPC to a full matrix. The Erez–Shamai–Zamir dirty-paper construction is the scalar special case; LAST codes are the matrix version.

Dirty-paper coding (DPC)

Coding for the channel $\mathbf{y} = \mathbf{x} + \mathbf{s} + \mathbf{w}$ with non-causal transmitter knowledge of the interference $\mathbf{s}$ . Costa (1983) proved the capacity equals $\tfrac12 \log_2(1 + \text{SNR})$ , independent of $\|\mathbf{s}\|^2$ . Practical realisation via nested lattice codes (Erez–Shamai–Zamir 2005).

Tomlinson–Harashima precoding

Scalar ( $1$ D) version of lattice DPC for known ISI: pre-cancel the interference modulo a scaled integer lattice $2 M \mathbb{Z}$ . Incurs $\approx 1.53$ dB shaping loss relative to the full lattice scheme. Deployed in V.34, ADSL, VDSL, G.fast, MU-MIMO.

Quick Check

The dirty-paper channel $\mathbf{y} = \mathbf{x} + \mathbf{s} + \mathbf{w}$ has transmitter knowledge of $\mathbf{s}$ but not receiver knowledge. The capacity is:

$\tfrac12 \log_2(1 + P/(S + \sigma^2))$ , i.e., interference counted as extra noise

$\tfrac12 \log_2(1 + \text{SNR})$ , independent of $\|\mathbf{s}\|^2$

$\tfrac12 \log_2(1 + (P + S)/\sigma^2)$ , i.e., combined signal+interference power

$0$ when $\|\mathbf{s}\|^2 \to \infty$

Answer: $\tfrac12 \log_2(1 + \text{SNR})$ , independent of $\|\mathbf{s}\|^2$

Correct. Costa (1983) proved that with non-causal knowledge of $\mathbf{s}$ at the transmitter (and power constraint $P$ on $\mathbf{x}$ , unrelated to $\|\mathbf{s}\|^2$ ), the capacity equals the clean-channel $\tfrac12 \log_2(1 + P/\sigma^2)$ . The interference can be cancelled for free in the rate sense.

Key Takeaway

Costa (1983) proved that a channel with interference $\mathbf{s}$ known non-causally at the transmitter has the same capacity as the clean channel — $\tfrac12 \log_2(1 + \text{SNR})$ , for any interference power. The lattice realisation of Erez, Shamai, and Zamir (2005) achieves this capacity with a single extra mod- $\Lambda_s$ at the transmitter (to pre-cancel $\alpha \mathbf{s}$ within the Voronoi power region), reusing the Erez–Zamir MMSE scalar $\alpha$ and crypto-lemma of s02 verbatim. This "proof pattern" (crypto-lemma + MMSE + mod- $\Lambda$ ) is the backbone of DPC, MU-MIMO vector precoding, and — in matrix form — the LAST codes of Ch. 17. The scalar realisation has been fielded as Tomlinson–Harashima precoding in every major DSL and xDSL-derived standard since 1994.
