Pairwise Error Probability and the Coding-Gain Criterion
From Distance to Design Criterion
We have seen that the gap to capacity in the bandwidth-limited regime is mostly a coding deficit. The question is: how do we design the constellation points to recover it?
The answer, at high SNR, comes from the pairwise error probability: the probability that the receiver chooses a wrong codeword $\mathbf{x}'$ when the true one is $\mathbf{x}$. This probability depends only on the Euclidean distance $d = \|\mathbf{x} - \mathbf{x}'\|$ and the noise variance $\sigma^2$, not on the structure of the constellation. Summing over all wrong codewords via the union bound then translates a geometric property of the point set into an asymptotic error exponent. The design criterion that falls out is the starting point for Ungerboeck's TCM in Chapter 2 and for the space-time coding criteria in Chapter 10.
Definition: Pairwise Error Probability on AWGN
Let $\mathbf{x}, \mathbf{x}' \in \mathbb{R}^n$ be two candidate codewords and let $\mathbf{y} = \mathbf{x} + \mathbf{n}$ with $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma^2 I_n)$. The pairwise error probability (PEP) is the probability that the maximum-likelihood receiver, given the choice between $\mathbf{x}$ and $\mathbf{x}'$, prefers $\mathbf{x}'$:
$$P(\mathbf{x} \to \mathbf{x}') = \Pr\{\|\mathbf{y} - \mathbf{x}'\| < \|\mathbf{y} - \mathbf{x}\| \mid \mathbf{x} \text{ sent}\},$$
where $\mathbf{e} = \mathbf{x}' - \mathbf{x}$ is the error vector and $d = \|\mathbf{e}\|$. Equivalently,
$$P(\mathbf{x} \to \mathbf{x}') = Q\!\left(\frac{d}{2\sigma}\right).$$
The PEP depends on $\mathbf{x}$ and $\mathbf{x}'$ only through $d = \|\mathbf{x} - \mathbf{x}'\|$. This is a distinctive feature of the AWGN channel: the direction of the error vector is irrelevant, and only its length matters. Over fading channels this is no longer true, and the PEP will acquire additional structure (see Chapter 10).
Theorem: Pairwise Error Probability on the AWGN Channel
For $\mathbf{x}, \mathbf{x}'$ at Euclidean distance $d = \|\mathbf{x} - \mathbf{x}'\|$ transmitted over the AWGN channel with noise variance $\sigma^2$ per real dimension, the ML pairwise error probability is
$$P(\mathbf{x} \to \mathbf{x}') = Q\!\left(\frac{d}{2\sigma}\right) \le \exp\!\left(-\frac{d^2}{8\sigma^2}\right),$$
where the upper bound uses the Chernoff inequality $Q(x) \le e^{-x^2/2}$ for $x \ge 0$.
Project the noise onto the unit vector $\mathbf{e}/d$: the component along $\mathbf{e}$ is zero-mean Gaussian with variance $\sigma^2$. The ML detector picks $\mathbf{x}'$ when this component exceeds $d/2$, a one-dimensional Gaussian tail event. This is why only $d$ enters the PEP: the orthogonal noise components cannot cause the pair error.
Project the received vector onto the direction $\mathbf{e}/d$ of the error vector.
Show that the ML decision between $\mathbf{x}$ and $\mathbf{x}'$ depends only on this projection.
Use the fact that the projection of a spherically symmetric Gaussian onto any unit vector is a scalar Gaussian of the same variance.
Define the decision statistic $T = \|\mathbf{y} - \mathbf{x}\|^2 - \|\mathbf{y} - \mathbf{x}'\|^2$.
The ML detector chooses $\mathbf{x}'$ over $\mathbf{x}$ iff $\|\mathbf{y} - \mathbf{x}'\|^2 < \|\mathbf{y} - \mathbf{x}\|^2$, i.e. iff $T > 0$. Expanding both sides:
$$\|\mathbf{y}\|^2 - 2\,\mathbf{y}^\top\mathbf{x}' + \|\mathbf{x}'\|^2 < \|\mathbf{y}\|^2 - 2\,\mathbf{y}^\top\mathbf{x} + \|\mathbf{x}\|^2,$$
which simplifies to $2\,\mathbf{y}^\top(\mathbf{x}' - \mathbf{x}) > \|\mathbf{x}'\|^2 - \|\mathbf{x}\|^2$, or $\mathbf{y}^\top\mathbf{e} > \tfrac{1}{2}\left(\|\mathbf{x}'\|^2 - \|\mathbf{x}\|^2\right)$.
Substitute $\mathbf{y} = \mathbf{x} + \mathbf{n}$
Plugging in $\mathbf{y} = \mathbf{x} + \mathbf{n}$ and using $\tfrac{1}{2}(\|\mathbf{x}'\|^2 - \|\mathbf{x}\|^2) - \mathbf{x}^\top\mathbf{e} = \tfrac{1}{2}\|\mathbf{x}' - \mathbf{x}\|^2 = \tfrac{d^2}{2}$ (after expanding $\|\mathbf{x}' - \mathbf{x}\|^2 = \|\mathbf{x}'\|^2 - 2\,\mathbf{x}^\top\mathbf{x}' + \|\mathbf{x}\|^2$), the condition becomes
$$\mathbf{n}^\top\mathbf{e} > \frac{d^2}{2},$$
i.e., $\mathbf{n}^\top(\mathbf{e}/d) > d/2$, or equivalently: the projection of the noise onto the unit vector $\mathbf{e}/d$ exceeds $d/2$.
Evaluate the Gaussian tail
The scalar $Z = \mathbf{n}^\top\mathbf{e}/d$ is a linear combination of independent $\mathcal{N}(0, \sigma^2)$ variables with unit-norm weight vector $\mathbf{e}/d$, hence is itself $\mathcal{N}(0, \sigma^2)$. Therefore
$$P(\mathbf{x} \to \mathbf{x}') = \Pr\{Z > d/2\} = Q\!\left(\frac{d}{2\sigma}\right).$$
Applying the Chernoff bound gives the stated exponential form.
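The projection argument can be checked numerically. The following Monte Carlo sketch (illustrative parameters $d = 3$, $\sigma = 1$, not from the text) estimates the ML pair-error rate directly and compares it with $Q(d/2\sigma)$ and the Chernoff bound:

```python
import numpy as np
from math import erfc, sqrt, exp

def qfunc(x):
    """Gaussian tail Q(x) = P(N(0,1) > x)."""
    return 0.5 * erfc(x / sqrt(2))

rng = np.random.default_rng(0)
n, sigma = 4, 1.0
x  = np.zeros(n)                      # transmitted codeword
xp = np.array([3.0, 0.0, 0.0, 0.0])   # competitor at distance d = 3
d  = np.linalg.norm(xp - x)

# Monte Carlo: ML prefers xp whenever ||y - xp|| < ||y - x||
trials = 200_000
y = x + rng.normal(0.0, sigma, size=(trials, n))
errors = np.sum(np.linalg.norm(y - xp, axis=1) < np.linalg.norm(y - x, axis=1))

pep_mc = errors / trials
pep_theory = qfunc(d / (2 * sigma))          # exact: Q(d/2σ) = Q(1.5) ≈ 0.067
pep_chernoff = exp(-d**2 / (8 * sigma**2))   # Chernoff upper bound
print(pep_mc, pep_theory, pep_chernoff)
```

The empirical rate matches $Q(1.5)$ and sits well below the (loose) Chernoff exponential, as the theorem predicts.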
Pairwise Geometry: Decision Boundary and Noise Circle
Two constellation points at Euclidean distance $d$ in $\mathbb{R}^2$, with a noise cloud of standard deviation $\sigma$ centered on the transmitted point. The decision boundary is the perpendicular bisector; a pair error occurs whenever the noise pushes $\mathbf{y}$ across it. As the distance grows, the boundary moves away and the tail probability shrinks exponentially.
Definition: Minimum Distance and Asymptotic Coding Gain
Let $\mathcal{C} \subset \mathbb{R}^n$ be a finite constellation of equiprobable signal vectors of average energy $E_s$ per 2D dimension. Its minimum Euclidean distance is
$$d_{\min}(\mathcal{C}) = \min_{\substack{\mathbf{x}, \mathbf{x}' \in \mathcal{C} \\ \mathbf{x} \ne \mathbf{x}'}} \|\mathbf{x} - \mathbf{x}'\|.$$
The normalized minimum distance is $d_{\min}^2/E_s$, which is invariant under scaling of the constellation. The asymptotic coding gain of $\mathcal{C}$ over a baseline constellation $\mathcal{C}_0$ of the same spectral efficiency is
$$\gamma_c = \frac{d_{\min}^2(\mathcal{C})/E_s(\mathcal{C})}{d_{\min}^2(\mathcal{C}_0)/E_s(\mathcal{C}_0)}.$$
The asymptotic coding gain depends on the constellation only through the ratio $d_{\min}^2/E_s$. Rescaling the constellation does not change it; this is a sensible invariance: if you double every point, you double the distances and quadruple the energy, leaving the ratio $d_{\min}^2/E_s$ alone.
Theorem: Union Bound on Error Probability
For a constellation $\mathcal{C}$ transmitted over AWGN with ML decoding, the codeword-error probability satisfies the union bound
$$P_e \le \frac{1}{|\mathcal{C}|} \sum_{\mathbf{x} \in \mathcal{C}} \sum_{\mathbf{x}' \ne \mathbf{x}} Q\!\left(\frac{\|\mathbf{x} - \mathbf{x}'\|}{2\sigma}\right).$$
At high SNR the sum is dominated by pairs at minimum distance. Let $N_{\min}$ be the average multiplicity of nearest neighbors (per codeword). Then
$$P_e \lesssim N_{\min}\, Q\!\left(\frac{d_{\min}}{2\sigma}\right),$$
which makes $d_{\min}$ the dominant design parameter at high SNR.
Every wrong codeword contributes a term in the union bound, but at high SNR the smallest distances dominate because $Q(d/2\sigma)$ is exponentially decreasing in $d$. The minimum-distance term enters with the largest value, weighted by the multiplicity $N_{\min}$ per codeword, and everything else is exponentially smaller. At low SNR the full distance spectrum matters, and the union bound can be quite loose, a recurring theme in code performance analysis.
For each transmitted codeword, bound the probability that the ML detector picks any incorrect codeword by the union bound over all incorrect choices.
Average over the uniform choice of transmitted codeword.
At high SNR, argue that the dominant terms are those with $\|\mathbf{x} - \mathbf{x}'\| = d_{\min}$ and identify $N_{\min}$.
Per-codeword union bound
Conditioned on $\mathbf{x}$ transmitted, a codeword error occurs iff the ML detector picks some $\mathbf{x}' \ne \mathbf{x}$. The union bound gives
$$P_e(\mathbf{x}) \le \sum_{\mathbf{x}' \ne \mathbf{x}} Q\!\left(\frac{\|\mathbf{x} - \mathbf{x}'\|}{2\sigma}\right).$$
Average over equiprobable codewords
Averaging over $\mathbf{x}$ uniform on $\mathcal{C}$ gives
$$P_e \le \frac{1}{|\mathcal{C}|} \sum_{\mathbf{x} \in \mathcal{C}} \sum_{\mathbf{x}' \ne \mathbf{x}} Q\!\left(\frac{\|\mathbf{x} - \mathbf{x}'\|}{2\sigma}\right).$$
High-SNR behavior
Split the double sum into contributions from pairs at distance $d_{\min}$ (call this count $N$, so that the per-codeword average is $N_{\min} = N/|\mathcal{C}|$) and pairs at larger distances. The ratio of a distance-$d$ term to the $d_{\min}$-term under the Chernoff bound is $\exp\!\left(-\tfrac{d^2 - d_{\min}^2}{8\sigma^2}\right)$, which vanishes exponentially as $\sigma \to 0$. Therefore at high SNR
$$P_e \lesssim N_{\min}\, Q\!\left(\frac{d_{\min}}{2\sigma}\right),$$
which is the stated asymptotic.
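The dominance of the minimum-distance term is easy to see numerically. The sketch below (an illustrative choice, not from the text: 16-QAM on the grid $\{-3,-1,1,3\}^2$ with $\sigma = 0.4$) evaluates the full union bound and the nearest-neighbor approximation $N_{\min} Q(d_{\min}/2\sigma)$:

```python
import itertools, math

def qfunc(x):
    """Gaussian tail Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

sigma = 0.4
pts = [(i, q) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]  # 16-QAM
M = len(pts)

# Full union bound: average over codewords, sum over wrong codewords
union = sum(qfunc(math.dist(p, q) / (2 * sigma))
            for p in pts for q in pts if p != q) / M

# Dominant term: N_min * Q(d_min / 2σ)
dmin = min(math.dist(p, q) for p, q in itertools.combinations(pts, 2))
nmin = sum(1 for p in pts for q in pts
           if p != q and abs(math.dist(p, q) - dmin) < 1e-9) / M
approx = nmin * qfunc(dmin / (2 * sigma))

print(union, approx, nmin)  # N_min = 3 for 16-QAM
```

At this SNR the next distance class ($d = 2\sqrt{2}$) already contributes well under 1% of the bound, so the single-term approximation is essentially exact.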
Example: BPSK and QPSK Have the Same Asymptotic Coding Gain
Compute $d_{\min}^2/E$ for BPSK (points $\pm a$ on the real line) and QPSK (points $(\pm a, \pm a)$). Interpret the result.
BPSK
BPSK has two points at $\pm a$, so $d_{\min} = 2a$ and $E = a^2$. Hence $d_{\min}^2/E = 4$.
QPSK
QPSK has four points at $(\pm a, \pm a)$. The minimum distance is between adjacent points (e.g., $(a, a)$ and $(a, -a)$), which is $2a$, so $d_{\min}^2 = 4a^2$. The energy is $E = 2a^2$ (both coordinates squared and summed). Hence $d_{\min}^2/E = 2$.
Normalize by dimension
BPSK uses one real dimension per symbol with $d_{\min}^2/E = 4$. QPSK uses two real dimensions with $d_{\min}^2/E = 2$, which is $d_{\min}^2/(E/2) = 4$ per real dimension. Normalizing by dimension, both schemes have the same coding gain per real dimension, which is why the $P_b$-versus-$E_b/N_0$ curves of BPSK and QPSK coincide at high SNR. QPSK just packs two BPSK signals orthogonally onto one complex dimension; it is the same code geometry, doubled up.
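The per-dimension normalization can be verified in a few lines (a sketch with $a = 1$; the ratio is scale-invariant anyway):

```python
import numpy as np

bpsk = np.array([[1.0], [-1.0]])                                  # points ±a, a = 1
qpsk = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)  # (±a, ±a)

def dmin2_per_dim_energy(pts):
    """d_min^2 divided by average energy per real dimension."""
    diffs = pts[:, None, :] - pts[None, :, :]
    d2 = np.sum(diffs**2, axis=-1)
    dmin2 = d2[d2 > 0].min()
    energy_per_dim = np.mean(np.sum(pts**2, axis=1)) / pts.shape[1]
    return dmin2 / energy_per_dim

fig_bpsk = dmin2_per_dim_energy(bpsk)
fig_qpsk = dmin2_per_dim_energy(qpsk)
print(fig_bpsk, fig_qpsk)  # both equal 4
```

Both figures come out to 4, confirming that the two schemes share the same geometry per real dimension.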
Brute-Force Minimum Distance Computation
Complexity: $O(M^2)$ for a constellation of $M$ points in $\mathbb{R}^n$. For small constellations this is fine. For larger constellations, structured codes (lattices, trellis codes) admit decomposition-based algorithms that bypass the quadratic cost.
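A minimal sketch of the quadratic pair scan, using 16-QAM as an illustrative test constellation (the function name and test points are ours, not from the text):

```python
import itertools, math

def min_distance(points):
    """Brute-force minimum Euclidean distance: O(M^2) scan over all pairs."""
    best = math.inf
    for p, q in itertools.combinations(points, 2):
        d = math.dist(p, q)
        if d < best:
            best = d
    return best

# 16-QAM on the grid {-3, -1, 1, 3}^2 has d_min = 2 (adjacent grid points)
qam16 = [(i, q) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]
print(min_distance(qam16))  # 2.0
```

For a lattice code, by contrast, $d_{\min}$ equals the shortest nonzero lattice vector and needs no pair scan at all, which is one form of the decomposition shortcut mentioned above.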
Hamming Distance vs. Euclidean Distance
Binary coding is dominated by the Hamming distance of the code: the minimum number of coordinate positions in which two codewords differ. Coded modulation is dominated by the Euclidean distance of the transmitted constellation: the minimum distance between two transmitted signal vectors.
These are not the same thing. Two binary codewords differing in many positions may map to two QAM sequences that happen to be close in signal space, if the QAM labeling is poorly chosen. The Ungerboeck insight (Chapter 2) is precisely that the code must be designed to maximize the Euclidean $d_{\min}$ of the transmitted signal, not the Hamming distance of the binary label, and that this requires a coordinated choice of code and labeling.
Common Mistake: Union bound is loose at low SNR
Mistake:
Assuming that the union-bound expression $N_{\min}\, Q(d_{\min}/2\sigma)$ is accurate at all SNRs.
Correction:
The union bound is asymptotically tight as SNR $\to \infty$. At low SNR (where the error probability is large), the bound can exceed 1, and the full distance spectrum contributes meaningfully. Improved bounds (Gallager, Poltyrev, Divsalar) exist but are more intricate; in practice the first-order union-bound expression guides design but must be validated by simulation at the operating SNR of interest.
The Role of $d_{\min}$ in Modern Standards
Design tables for DVB-S2X, 5G NR LDPC codes, and Wi-Fi 6/7 explicitly report the minimum Euclidean distance between coded QAM sequences, not just the minimum Hamming distance of the binary LDPC code. For high-rate QAM constellations, the effective Euclidean distance of the code-plus-labeling combination is the quantity that sets the operating SNR. Gray labeling achieves close to the best effective $d_{\min}$ among binary labelings; set-partitioning labeling (Ungerboeck) does better when coupled with a matched code.
- 5G NR uses QPSK / 16-QAM / 64-QAM / 256-QAM with Gray labels in the base tables; 1024-QAM was added in Rel-17 for IAB
- DVB-S2X uses APSK with specially-designed labels that are approximately Gray on the amplitude bits and rotationally structured on the phase bits
- LDPC designers tune the degree distribution to match a given QAM order so as to maximize the effective $d_{\min}$ at the target error rate
Quick Check
If you double the minimum Euclidean distance of a constellation at fixed average energy, by roughly how many dB does the high-SNR error probability improve, asymptotically?
3 dB
6 dB
10 dB
It depends on the multiplicity $N_{\min}$.
Pairwise error probability is $Q(d_{\min}/2\sigma)$. Doubling $d_{\min}$ quadruples its square, which is a 6 dB improvement in the effective SNR. This is why every factor of $\sqrt{2}$ in minimum distance is called "3 dB of coding gain" and every factor of 2 is "6 dB."
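The dB arithmetic in one line: a factor-$g$ increase in $d_{\min}$ at fixed energy scales the effective SNR by $g^2$, i.e. a gain of $20\log_{10} g$ dB.

```python
import math

# Coding gain in dB for a factor-g increase in d_min at fixed average energy:
# gain_dB = 10 * log10(g^2) = 20 * log10(g)
gain_sqrt2 = 20 * math.log10(math.sqrt(2))  # factor sqrt(2) -> ~3.01 dB
gain_2     = 20 * math.log10(2.0)           # factor 2       -> ~6.02 dB
print(round(gain_sqrt2, 2), round(gain_2, 2))
```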
Key Takeaway
At high SNR, the minimum Euclidean distance $d_{\min}$ is the design parameter. The pairwise error probability depends only on $d = \|\mathbf{x} - \mathbf{x}'\|$, and the union bound shows that $P_e \lesssim N_{\min}\, Q(d_{\min}/2\sigma)$. The coding-gain design criterion on AWGN is therefore: maximize $d_{\min}^2/E_s$ at the target spectral efficiency. This is the criterion that Ungerboeck's TCM optimizes; it will generalize to a determinant-and-rank criterion on fading MIMO channels (Chapter 10).
Pairwise error probability (PEP)
The probability that the ML detector, given a binary choice between two candidate codewords, picks the wrong one under AWGN: $P(\mathbf{x} \to \mathbf{x}') = Q(\|\mathbf{e}\|/2\sigma)$, where $\mathbf{e} = \mathbf{x}' - \mathbf{x}$ is the error vector.
Related: Union Bound, Minimum Euclidean Distance, ML Detection
Minimum Euclidean distance
The smallest distance between any two distinct signal vectors in a finite constellation. Denoted $d_{\min}$; the normalized quantity $d_{\min}^2/E_s$ determines asymptotic coding gain.
Historical Note: Massey's 1974 Zürich Paper, The Seeds of Coded Modulation
At the 1974 Zürich Seminar on Digital Communications, James Massey gave a talk that proved seminal: he argued that the conventional separation of coding (over a binary channel) from modulation (over a Gaussian channel) was responsible for a large part of the gap to capacity in bandwidth-limited systems, and that a joint design of code and modulator could close much of that gap. The argument rested on the observation that the Hamming-distance criterion of classical coding theory was the wrong criterion when the downstream channel was AWGN with a higher-order constellation. This insight took eight years to become an engineering reality (Ungerboeck's TCM paper in 1982), but Massey's talk is the moment the coded-modulation program was first announced.
Why This Matters: On AWGN We Optimize $d_{\min}$; On Fading We Will Optimize Rank and Determinant
On AWGN, the PEP is $Q(d/2\sigma)$: one number, the minimum distance. On a Rayleigh fading MIMO channel, the PEP depends on the full singular-value spectrum of the error matrix $\mathbf{E} = \mathbf{X} - \mathbf{X}'$, and the analog of $d_{\min}$ is a pair of numbers: the rank of $\mathbf{E}$ (diversity order) and the product of the nonzero eigenvalues of $\mathbf{E}\mathbf{E}^H$ (coding gain). Chapter 10 will rederive the fading PEP from scratch; the structural parallel with what we have just done is worth keeping in mind.