Hypothesis Testing and MAP/ML Detection

Why Detection Theory?

Chapter 8 introduced the ML detector as the minimum-distance rule. Now we develop the rigorous statistical framework behind that result. Detection theory answers the question: given a noisy observation $y$, which of $M$ possible messages was transmitted? The answer depends on whether we have prior knowledge (MAP) or not (ML), and the mathematical structure of the optimal detector reveals when and why the minimum-distance rule is optimal.

Definition: Binary Hypothesis Test

A binary hypothesis test is a decision between two hypotheses based on an observation $y$:

$$H_0: y = s_0 + w \qquad \text{(null hypothesis)}$$
$$H_1: y = s_1 + w \qquad \text{(alternative hypothesis)}$$

where $s_0$ and $s_1$ are known signals and $w$ is noise with known distribution (typically $w \sim \mathcal{N}(0, \sigma^2)$).

A decision rule $\delta(y)$ partitions the observation space into two regions:

  • $\mathcal{R}_0$: decide $H_0$ (the acceptance region)
  • $\mathcal{R}_1$: decide $H_1$ (the rejection region)

Two types of errors arise:

  • Type I error (false alarm): deciding $H_1$ when $H_0$ is true, with probability $P_{\text{FA}} = P(\delta = H_1 \mid H_0)$

  • Type II error (miss): deciding $H_0$ when $H_1$ is true, with probability $P_{\text{M}} = P(\delta = H_0 \mid H_1)$

The detection probability is $P_{\text{D}} = 1 - P_{\text{M}}$.

In digital communications, $H_0$ and $H_1$ correspond to the two transmitted symbols. The false alarm and miss probabilities are the conditional error probabilities $P(e \mid s_0)$ and $P(e \mid s_1)$.
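For Gaussian noise, both error probabilities have closed forms in terms of the Q-function. A minimal sketch, assuming illustrative antipodal signal values, noise level, and threshold (none of these numbers come from the text):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Illustrative antipodal test: H0 sends s0, H1 sends s1, noise ~ N(0, sigma^2)
s0, s1, sigma = -1.0, 1.0, 0.5
gamma = 0.0  # decision threshold: decide H1 when y > gamma

P_FA = Q((gamma - s0) / sigma)  # Type I error:  P(y > gamma | H0)
P_M = Q((s1 - gamma) / sigma)   # Type II error: P(y < gamma | H1)
P_D = 1 - P_M                   # detection probability
print(P_FA, P_M, P_D)
```

With a symmetric threshold the two conditional error probabilities coincide, which is the equal-priors case developed below.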


Definition: Likelihood Ratio Test (LRT)

The likelihood ratio for observation $y$ is

$$\Lambda(y) = \frac{p(y \mid H_1)}{p(y \mid H_0)}$$

A likelihood ratio test (LRT) takes the form

$$\Lambda(y) \underset{H_0}{\overset{H_1}{\gtrless}} \eta$$

where $\eta$ is a threshold that depends on the optimality criterion. Equivalently, using the log-likelihood ratio:

$$\ln \Lambda(y) = \ln p(y \mid H_1) - \ln p(y \mid H_0) \underset{H_0}{\overset{H_1}{\gtrless}} \ln \eta$$

The LRT is the most general form of an optimal binary detector; every optimal decision rule can be expressed as an LRT with an appropriate choice of threshold $\eta$.

Theorem: Neyman-Pearson Lemma

Among all decision rules with false alarm probability $P_{\text{FA}} \leq \alpha$ (for a given $\alpha \in (0, 1)$), the likelihood ratio test

$$\Lambda(y) = \frac{p(y \mid H_1)}{p(y \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta$$

with threshold $\eta$ chosen so that $P_{\text{FA}} = \alpha$ maximises the detection probability $P_{\text{D}}$.

That is, the LRT is the most powerful test at level $\alpha$: no other test with $P_{\text{FA}} \leq \alpha$ can achieve a higher $P_{\text{D}}$.

The LRT decides $H_1$ for observations where $H_1$ is much more likely than $H_0$ (large $\Lambda$). The threshold $\eta$ controls the trade-off between false alarms and detection: lowering $\eta$ increases both $P_{\text{D}}$ and $P_{\text{FA}}$.
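For the scalar Gaussian shift problem, the Neyman-Pearson threshold follows directly from the inverse normal CDF available in Python's standard library. A sketch, with the means, noise level, and level $\alpha$ assumed for illustration:

```python
from statistics import NormalDist

# Gaussian shift problem (assumed numbers): H0: y ~ N(s0, sigma^2),
# H1: y ~ N(s1, sigma^2). The LRT reduces to a scalar threshold on y.
s0, s1, sigma, alpha = 0.0, 1.0, 1.0, 0.05
std = NormalDist()  # standard normal

# Choose gamma so that P_FA = P(y > gamma | H0) = alpha
gamma = s0 + sigma * std.inv_cdf(1 - alpha)

# Resulting detection probability P_D = P(y > gamma | H1)
P_D = 1 - NormalDist(s1, sigma).cdf(gamma)
print(gamma, P_D)
```

Raising $\alpha$ lowers the threshold and raises $P_{\text{D}}$, tracing out the receiver operating characteristic.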


Definition: MAP Detection Rule

The maximum a posteriori (MAP) detection rule minimises the total probability of error by choosing the hypothesis with the largest posterior probability:

$$\hat{H} = \arg\max_{i \in \{0, 1\}} P(H_i \mid y)$$

By Bayes' theorem, $P(H_i \mid y) \propto p(y \mid H_i)\, P(H_i)$, so the MAP rule is equivalent to the LRT with threshold

$$\eta_{\text{MAP}} = \frac{P(H_0)}{P(H_1)}$$

That is:

$$\Lambda(y) = \frac{p(y \mid H_1)}{p(y \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{P(H_0)}{P(H_1)}$$

For the M-ary case with $M$ hypotheses:

$$\hat{m} = \arg\max_{m \in \{1, \ldots, M\}} p(y \mid H_m)\, P(H_m)$$

The MAP rule uses the prior probabilities $P(H_i)$ as side information. When the priors are unequal, the decision boundary shifts toward the less likely hypothesis, reducing the overall error probability.
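For antipodal signals in Gaussian noise, the MAP rule reduces to comparing $y$ with a shifted scalar threshold: solving $\ln\Lambda(y) = \ln(P(H_0)/P(H_1))$ gives $\gamma_{\text{MAP}} = \frac{s_0+s_1}{2} + \frac{\sigma^2}{s_1-s_0}\ln\frac{P(H_0)}{P(H_1)}$. A sketch with assumed numbers:

```python
import math

# MAP threshold for antipodal signals in Gaussian noise (assumed values).
# Decide H1 when y exceeds gamma_MAP; with equal priors this reduces to
# the ML midpoint (s0 + s1) / 2.
s0, s1, sigma2 = -1.0, 1.0, 0.5
P0, P1 = 0.7, 0.3

gamma_ML = (s0 + s1) / 2
gamma_MAP = gamma_ML + sigma2 / (s1 - s0) * math.log(P0 / P1)
print(gamma_ML, gamma_MAP)
```

Since $P(H_0) > P(H_1)$ here, the threshold moves toward $s_1$, enlarging the region where $H_0$ is decided.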


Definition: ML Detection Rule

The maximum-likelihood (ML) detection rule is the MAP rule with uniform priors $P(H_0) = P(H_1) = 1/2$:

$$\hat{H} = \arg\max_{i \in \{0, 1\}} p(y \mid H_i)$$

The ML threshold is $\eta_{\text{ML}} = 1$, so the LRT becomes

$$p(y \mid H_1) \underset{H_0}{\overset{H_1}{\gtrless}} p(y \mid H_0)$$

For the AWGN channel with $p(y \mid H_i) \propto \exp\!\left(-\|y - s_i\|^2 / N_0\right)$, the ML rule reduces to the minimum-distance rule:

$$\hat{m} = \arg\min_{m} \|y - s_m\|^2$$

which is the result derived geometrically in Chapter 8.

The ML detector is optimal (in the sense of minimising $P_e$) only when all hypotheses are equally likely. In practice, source coding typically makes all bit patterns approximately equally likely, so the ML detector is widely used.
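The minimum-distance rule is straightforward to implement for any signal set; a sketch, where the QPSK-style points and test observation are illustrative assumptions:

```python
import math

def ml_detect(y, signals):
    """Minimum-distance ML detection: return the index of the signal
    point closest to observation y (both given as coordinate tuples)."""
    def dist2(s):
        return sum((yi - si) ** 2 for yi, si in zip(y, s))
    return min(range(len(signals)), key=lambda m: dist2(signals[m]))

# Illustrative unit-energy QPSK-style points, one per quadrant
pts = [(math.cos(a), math.sin(a)) for a in
       (math.pi / 4, 3 * math.pi / 4, 5 * math.pi / 4, 7 * math.pi / 4)]
print(ml_detect((0.9, 0.8), pts))  # closest to the first-quadrant point
```

Note that only the squared distances are compared; no noise statistics enter the rule once equal priors and equal-variance Gaussian noise are assumed.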

Theorem: Union Bound on Error Probability

For M-ary detection in AWGN with signal set $\{\mathbf{s}_1, \ldots, \mathbf{s}_M\}$, the symbol error probability of the ML detector is upper bounded by

$$P_s \leq \sum_{m=1}^{M} P(H_m) \sum_{\substack{k=1 \\ k \neq m}}^{M} Q\!\left(\frac{d_{mk}}{2\sigma}\right) = \frac{1}{M} \sum_{m=1}^{M} \sum_{\substack{k=1 \\ k \neq m}}^{M} Q\!\left(\frac{d_{mk}}{\sqrt{2N_0}}\right)$$

where $d_{mk} = \|\mathbf{s}_m - \mathbf{s}_k\|$ is the Euclidean distance between signals $m$ and $k$, $\sigma^2 = N_0/2$, and the equality holds for equally likely symbols, $P(H_m) = 1/M$.

At high SNR, the bound is dominated by the nearest-neighbour terms:

$$P_s \approx N_{\min}\, Q\!\left(\frac{d_{\min}}{\sqrt{2N_0}}\right)$$

where $N_{\min}$ is the average number of nearest neighbours at minimum distance $d_{\min}$.

The probability of confusing $\mathbf{s}_m$ with $\mathbf{s}_k$ in a pairwise test is $Q(d_{mk}/\sqrt{2N_0})$. The union bound sums these pairwise error probabilities. It overestimates $P_s$ because error events can overlap (one noise realisation might cause confusion with multiple symbols), but at high SNR, the dominant error event is a single nearest-neighbour confusion.

Example: Binary Detection in AWGN (BPSK)

A BPSK system transmits $s_0 = -\sqrt{E_b}$ (for bit 0) and $s_1 = +\sqrt{E_b}$ (for bit 1) over an AWGN channel with noise variance $\sigma^2 = N_0/2$. The received signal is $r = s_m + w$ where $w \sim \mathcal{N}(0, N_0/2)$.

(a) Derive the ML decision rule.

(b) Compute the bit error probability.

(c) Find the MAP decision rule when $P(H_1) = 0.7$.
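Part (b) can be checked numerically: a Monte-Carlo sketch of the ML (sign) detector against the closed form $P_b = Q(\sqrt{2E_b/N_0})$, with the energy, noise level, and run length assumed for illustration:

```python
import math
import random

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Assumed parameters; the ML rule for antipodal BPSK is just sign(r)
random.seed(1)
Eb, N0, n_bits = 1.0, 0.5, 200_000
sigma = math.sqrt(N0 / 2)

errors = 0
for _ in range(n_bits):
    bit = random.getrandbits(1)
    s = math.sqrt(Eb) if bit else -math.sqrt(Eb)  # bit 1 -> +sqrt(Eb)
    r = s + random.gauss(0.0, sigma)              # AWGN channel
    if (r > 0) != bool(bit):                      # decide bit 1 when r > 0
        errors += 1

P_sim = errors / n_bits
P_theory = Q(math.sqrt(2 * Eb / N0))
print(P_sim, P_theory)
```

The simulated error rate should land within Monte-Carlo fluctuation of the theoretical value.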

Example: M-ary Detection (QPSK Decision Regions)

A QPSK constellation has signal points

$$\mathbf{s}_m = \sqrt{E_s}\, [\cos(2\pi m/4 + \pi/4),\; \sin(2\pi m/4 + \pi/4)]^T, \quad m = 0, 1, 2, 3$$

(a) Find the ML decision regions.

(b) Compute the symbol error probability.

(c) Show that $P_b = Q(\sqrt{2E_b/N_0})$ with Gray mapping.
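For reference, with this $\pi/4$-rotated constellation the ML regions are the four quadrants, and the standard closed forms can be evaluated directly (the SNR value below is an assumed example):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

Es_over_N0 = 4.0  # assumed symbol SNR for illustration
q = Q(math.sqrt(Es_over_N0))  # per-quadrature error probability

# Exact QPSK symbol error probability: a symbol is correct only if both
# quadrature components are detected correctly, each failing with prob. q
P_s = 2 * q - q * q

# With Gray mapping, P_b = Q(sqrt(2 Eb / N0)) where Eb = Es / 2,
# which equals the per-quadrature probability q
Eb_over_N0 = Es_over_N0 / 2
P_b = Q(math.sqrt(2 * Eb_over_N0))
print(P_s, P_b)
```

The identity $P_b = q$ reflects that each quadrature component of QPSK carries one Gray-mapped bit and behaves as an independent BPSK channel.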

ML vs MAP Decision Regions

Explore how prior probabilities shift the decision boundaries. With equal priors (ML), boundaries are perpendicular bisectors of the segment between signal points. As the prior probability of $H_0$ increases, the MAP boundary shifts toward $H_1$ (expanding the $H_0$ region). Observe how the overall error probability changes with the prior.


MAP vs ML Decision Boundary Animation

Watch how the MAP decision boundary shifts as the prior probability $P(H_0)$ sweeps from 0.1 to 0.9. When $P(H_0) > 0.5$, the boundary shifts toward $H_1$ (expanding $\mathcal{R}_0$). At $P(H_0) = 0.5$, the MAP boundary coincides with the ML boundary.
The MAP decision boundary (green) shifts relative to the fixed ML boundary (yellow) as the prior $P(H_0)$ varies.

Quick Check

A binary communication system has $P(H_0) = 0.9$ and $P(H_1) = 0.1$. Compared to the ML detector, the MAP detector will:

  • Shift the decision boundary toward $s_1$, reducing overall $P_e$
  • Keep the same decision boundary as ML
  • Shift the decision boundary toward $s_0$, reducing overall $P_e$
  • Always achieve lower $P_e$ than ML regardless of the priors

Common Mistake: Forgetting Prior Probabilities in MAP Detection

Mistake:

Using the ML decision rule (minimum distance) when the transmitted symbols have significantly unequal prior probabilities, and expecting minimum overall error probability.

Correction:

The ML detector is optimal only for uniform priors. When priors are unequal, the MAP detector shifts the decision boundary toward the less likely symbol and achieves strictly lower total $P_e$.

In practice, source coding and scrambling make the bit probabilities approximately $P(0) \approx P(1) \approx 0.5$, so ML is usually a good approximation. However, in systems with unequal symbol probabilities (e.g., probabilistic constellation shaping), using ML instead of MAP wastes the shaping gain.
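The penalty can be quantified for the antipodal Gaussian case by evaluating the total error probability at the ML and MAP thresholds; a sketch where all numbers are assumed for illustration:

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Antipodal signals with strongly unequal priors (assumed values)
s0, s1, sigma = -1.0, 1.0, 0.8
P0, P1 = 0.9, 0.1

def Pe(gamma):
    """Total error probability for threshold test: decide H1 when y > gamma."""
    return P0 * Q((gamma - s0) / sigma) + P1 * Q((s1 - gamma) / sigma)

g_ML = (s0 + s1) / 2
g_MAP = g_ML + sigma ** 2 / (s1 - s0) * math.log(P0 / P1)
print(Pe(g_ML), Pe(g_MAP))
```

With these priors the MAP threshold sits well toward $s_1$, and the total error probability drops substantially relative to the midpoint ML rule.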

Historical Note: Neyman and Pearson

1933

Jerzy Neyman and Egon Pearson published their fundamental lemma on optimal hypothesis testing in 1933, establishing that the likelihood ratio test is the most powerful test at any given significance level. Their work, originally motivated by problems in biological statistics, became the foundation of statistical decision theory and was later adopted by the radar and communications communities in the 1940s-50s. The Neyman-Pearson framework directly influenced the development of optimal receiver design by Woodward, Middleton, and Van Trees.

Historical Note: From Bayes to MAP Detection

1763-1961

Thomas Bayes' posthumously published theorem (1763) on inverse probability lay dormant for nearly two centuries before becoming central to statistical decision theory in the 1950s. The MAP detection rule — choose the hypothesis with the largest posterior probability — is a direct application of Bayes' theorem. Abraham Wald's sequential analysis (1947) and the subsequent development of Bayesian decision theory by Raiffa and Schlaifer (1961) established the framework used in modern communications.

Hypothesis Test

A statistical decision procedure that selects one of two or more hypotheses based on observed data. In digital communications, each hypothesis corresponds to a possible transmitted symbol, and the test determines the detected symbol.

Related: Likelihood Ratio Test (LRT), MAP Detection, ML Detection

Likelihood Ratio

The ratio $\Lambda(y) = p(y \mid H_1) / p(y \mid H_0)$ of the conditional densities of the observation under the two hypotheses. Every optimal detection rule can be expressed as a comparison of the likelihood ratio against a threshold.

Related: Binary Hypothesis Test, Neyman-Pearson Lemma, MAP Detection