Hypothesis Testing and MAP/ML Detection

Why Detection Theory?

Chapter 8 introduced the ML detector as the minimum-distance rule. Now we develop the rigorous statistical framework behind that result. Detection theory answers the question: given a noisy observation $y$, which of $M$ possible messages was transmitted? The answer depends on whether we have prior knowledge (MAP) or not (ML), and the mathematical structure of the optimal detector reveals when and why the minimum-distance rule is optimal.

Definition: Binary Hypothesis Test

A binary hypothesis test is a decision between two hypotheses based on an observation $y$:

$$H_0: y = s_0 + w \qquad \text{(null hypothesis)}$$
$$H_1: y = s_1 + w \qquad \text{(alternative hypothesis)}$$

where $s_0$ and $s_1$ are known signals and $w$ is noise with known distribution (typically $w \sim \mathcal{N}(0, \sigma^2)$).

A decision rule $\delta(y)$ partitions the observation space into two regions:

  • $\mathcal{R}_0$: decide $H_0$ (the acceptance region)
  • $\mathcal{R}_1$: decide $H_1$ (the rejection region)

Two types of errors arise:

  • Type I error (false alarm): deciding $H_1$ when $H_0$ is true, with probability $P_{\text{FA}} = P(\delta = H_1 \mid H_0)$

  • Type II error (miss): deciding $H_0$ when $H_1$ is true, with probability $P_{\text{M}} = P(\delta = H_0 \mid H_1)$

The detection probability is $P_{\text{D}} = 1 - P_{\text{M}}$.

In digital communications, $H_0$ and $H_1$ correspond to the two transmitted symbols. The false alarm and miss probabilities are the conditional error probabilities $P(e \mid s_0)$ and $P(e \mid s_1)$.
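For Gaussian noise, both error probabilities have closed forms in terms of the Q-function. A minimal sketch, assuming illustrative antipodal signal values, noise level, and threshold (none of these numbers come from the text):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Illustrative antipodal test: H0 sends s0, H1 sends s1, noise ~ N(0, sigma^2)
s0, s1, sigma = -1.0, 1.0, 0.5
gamma = 0.0  # decision threshold: decide H1 when y > gamma

P_FA = Q((gamma - s0) / sigma)  # Type I error:  P(y > gamma | H0)
P_M = Q((s1 - gamma) / sigma)   # Type II error: P(y < gamma | H1)
P_D = 1 - P_M                   # detection probability
print(P_FA, P_M, P_D)
```

With a symmetric threshold the two conditional error probabilities coincide, which is the equal-priors case developed below.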


Definition: Likelihood Ratio Test (LRT)

The likelihood ratio for observation $y$ is

$$\Lambda(y) = \frac{p(y \mid H_1)}{p(y \mid H_0)}$$

A likelihood ratio test (LRT) takes the form

$$\Lambda(y) \underset{H_0}{\overset{H_1}{\gtrless}} \eta$$

where $\eta$ is a threshold that depends on the optimality criterion. Equivalently, using the log-likelihood ratio:

$$\ln \Lambda(y) = \ln p(y \mid H_1) - \ln p(y \mid H_0) \underset{H_0}{\overset{H_1}{\gtrless}} \ln \eta$$

The LRT is the most general form of an optimal binary detector; every optimal decision rule can be expressed as an LRT with an appropriate choice of threshold $\eta$.

Theorem: Neyman-Pearson Lemma

Among all decision rules with false alarm probability $P_{\text{FA}} \leq \alpha$ (for a given $\alpha \in (0, 1)$), the likelihood ratio test

$$\Lambda(y) = \frac{p(y \mid H_1)}{p(y \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta$$

with threshold $\eta$ chosen so that $P_{\text{FA}} = \alpha$ maximises the detection probability $P_{\text{D}}$.

That is, the LRT is the most powerful test at level $\alpha$: no other test with $P_{\text{FA}} \leq \alpha$ can achieve a higher $P_{\text{D}}$.

The LRT decides $H_1$ for observations where $H_1$ is much more likely than $H_0$ (large $\Lambda$). The threshold $\eta$ controls the trade-off between false alarms and detection: lowering $\eta$ increases both $P_{\text{D}}$ and $P_{\text{FA}}$.
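For the scalar Gaussian shift problem, the Neyman-Pearson threshold follows directly from the inverse normal CDF available in Python's standard library. A sketch, with the means, noise level, and level $\alpha$ assumed for illustration:

```python
from statistics import NormalDist

# Gaussian shift problem (assumed numbers): H0: y ~ N(s0, sigma^2),
# H1: y ~ N(s1, sigma^2). The LRT reduces to a scalar threshold on y.
s0, s1, sigma, alpha = 0.0, 1.0, 1.0, 0.05
std = NormalDist()  # standard normal

# Choose gamma so that P_FA = P(y > gamma | H0) = alpha
gamma = s0 + sigma * std.inv_cdf(1 - alpha)

# Resulting detection probability P_D = P(y > gamma | H1)
P_D = 1 - NormalDist(s1, sigma).cdf(gamma)
print(gamma, P_D)
```

Raising $\alpha$ lowers the threshold and raises $P_{\text{D}}$, tracing out the receiver operating characteristic.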


Definition: MAP Detection Rule

The maximum a posteriori (MAP) detection rule minimises the total probability of error by choosing the hypothesis with the largest posterior probability:

$$\hat{H} = \arg\max_{i \in \{0, 1\}} P(H_i \mid y)$$

By Bayes' theorem, $P(H_i \mid y) \propto p(y \mid H_i)\, P(H_i)$, so the MAP rule is equivalent to the LRT with threshold

$$\eta_{\text{MAP}} = \frac{P(H_0)}{P(H_1)}$$

That is:

$$\Lambda(y) = \frac{p(y \mid H_1)}{p(y \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{P(H_0)}{P(H_1)}$$

For the M-ary case with $M$ hypotheses:

$$\hat{m} = \arg\max_{m \in \{1, \ldots, M\}} p(y \mid H_m)\, P(H_m)$$

The MAP rule uses the prior probabilities $P(H_i)$ as side information. When the priors are unequal, the decision boundary shifts toward the less likely hypothesis, reducing the overall error probability.
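For antipodal signals in Gaussian noise, the MAP rule reduces to comparing $y$ with a shifted scalar threshold: solving $\ln\Lambda(y) = \ln(P(H_0)/P(H_1))$ gives $\gamma_{\text{MAP}} = \frac{s_0+s_1}{2} + \frac{\sigma^2}{s_1-s_0}\ln\frac{P(H_0)}{P(H_1)}$. A sketch with assumed numbers:

```python
import math

# MAP threshold for antipodal signals in Gaussian noise (assumed values).
# Decide H1 when y exceeds gamma_MAP; with equal priors this reduces to
# the ML midpoint (s0 + s1) / 2.
s0, s1, sigma2 = -1.0, 1.0, 0.5
P0, P1 = 0.7, 0.3

gamma_ML = (s0 + s1) / 2
gamma_MAP = gamma_ML + sigma2 / (s1 - s0) * math.log(P0 / P1)
print(gamma_ML, gamma_MAP)
```

Since $P(H_0) > P(H_1)$ here, the threshold moves toward $s_1$, enlarging the region where $H_0$ is decided.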


Definition: ML Detection Rule

The maximum-likelihood (ML) detection rule is the MAP rule with uniform priors $P(H_0) = P(H_1) = 1/2$:

$$\hat{H} = \arg\max_{i \in \{0, 1\}} p(y \mid H_i)$$

The ML threshold is $\eta_{\text{ML}} = 1$, so the LRT becomes

$$p(y \mid H_1) \underset{H_0}{\overset{H_1}{\gtrless}} p(y \mid H_0)$$

For the AWGN channel with $p(y \mid H_i) \propto \exp\!\left(-\|y - s_i\|^2 / N_0\right)$, the ML rule reduces to the minimum-distance rule:

$$\hat{m} = \arg\min_{m} \|y - s_m\|^2$$

which is the result derived geometrically in Chapter 8.

The ML detector is optimal (in the sense of minimising $P_e$) only when all hypotheses are equally likely. In practice, source coding typically makes all bit patterns approximately equally likely, so the ML detector is widely used.
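The minimum-distance rule is straightforward to implement for any signal set; a sketch, where the QPSK-style points and test observation are illustrative assumptions:

```python
import math

def ml_detect(y, signals):
    """Minimum-distance ML detection: return the index of the signal
    point closest to observation y (both given as coordinate tuples)."""
    def dist2(s):
        return sum((yi - si) ** 2 for yi, si in zip(y, s))
    return min(range(len(signals)), key=lambda m: dist2(signals[m]))

# Illustrative unit-energy QPSK-style points, one per quadrant
pts = [(math.cos(a), math.sin(a)) for a in
       (math.pi / 4, 3 * math.pi / 4, 5 * math.pi / 4, 7 * math.pi / 4)]
print(ml_detect((0.9, 0.8), pts))  # closest to the first-quadrant point
```

Note that only the squared distances are compared; no noise statistics enter the rule once equal priors and equal-variance Gaussian noise are assumed.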

Theorem: Union Bound on Error Probability

For M-ary detection in AWGN with signal set $\{\mathbf{s}_1, \ldots, \mathbf{s}_M\}$, the symbol error probability of the ML detector is upper bounded by

$$P_s \leq \sum_{m=1}^{M} P(H_m) \sum_{\substack{k=1 \\ k \neq m}}^{M} Q\!\left(\frac{d_{mk}}{2\sigma}\right) = \frac{1}{M} \sum_{m=1}^{M} \sum_{\substack{k=1 \\ k \neq m}}^{M} Q\!\left(\frac{d_{mk}}{\sqrt{2N_0}}\right)$$

where $d_{mk} = \|\mathbf{s}_m - \mathbf{s}_k\|$ is the Euclidean distance between signals $m$ and $k$, $\sigma^2 = N_0/2$, and the equality holds for equally likely symbols, $P(H_m) = 1/M$.

At high SNR, the bound is dominated by the nearest-neighbour terms:

$$P_s \approx N_{\min}\, Q\!\left(\frac{d_{\min}}{\sqrt{2N_0}}\right)$$

where $N_{\min}$ is the average number of nearest neighbours at minimum distance $d_{\min}$.

The probability of confusing $\mathbf{s}_m$ with $\mathbf{s}_k$ in a pairwise test is $Q(d_{mk}/\sqrt{2N_0})$. The union bound sums these pairwise error probabilities. It overestimates $P_s$ because error events can overlap (one noise realisation might cause confusion with multiple symbols), but at high SNR, the dominant error event is a single nearest-neighbour confusion.

Example: Binary Detection in AWGN (BPSK)

A BPSK system transmits $s_0 = -\sqrt{E_b}$ (for bit 0) and $s_1 = +\sqrt{E_b}$ (for bit 1) over an AWGN channel with noise variance $\sigma^2 = N_0/2$. The received signal is $r = s_m + w$ where $w \sim \mathcal{N}(0, N_0/2)$.

(a) Derive the ML decision rule.

(b) Compute the bit error probability.

(c) Find the MAP decision rule when $P(H_1) = 0.7$.
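Part (b) can be checked numerically: a Monte-Carlo sketch of the ML (sign) detector against the closed form $P_b = Q(\sqrt{2E_b/N_0})$, with the energy, noise level, and run length assumed for illustration:

```python
import math
import random

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Assumed parameters; the ML rule for antipodal BPSK is just sign(r)
random.seed(1)
Eb, N0, n_bits = 1.0, 0.5, 200_000
sigma = math.sqrt(N0 / 2)

errors = 0
for _ in range(n_bits):
    bit = random.getrandbits(1)
    s = math.sqrt(Eb) if bit else -math.sqrt(Eb)  # bit 1 -> +sqrt(Eb)
    r = s + random.gauss(0.0, sigma)              # AWGN channel
    if (r > 0) != bool(bit):                      # decide bit 1 when r > 0
        errors += 1

P_sim = errors / n_bits
P_theory = Q(math.sqrt(2 * Eb / N0))
print(P_sim, P_theory)
```

The simulated error rate should land within Monte-Carlo fluctuation of the theoretical value.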

Example: M-ary Detection (QPSK Decision Regions)

A QPSK constellation has signal points

$$\mathbf{s}_m = \sqrt{E_s}\, [\cos(2\pi m/4 + \pi/4),\; \sin(2\pi m/4 + \pi/4)]^T, \quad m = 0, 1, 2, 3$$

(a) Find the ML decision regions.

(b) Compute the symbol error probability.

(c) Show that $P_b = Q(\sqrt{2E_b/N_0})$ with Gray mapping.
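For reference, with this $\pi/4$-rotated constellation the ML regions are the four quadrants, and the standard closed forms can be evaluated directly (the SNR value below is an assumed example):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

Es_over_N0 = 4.0  # assumed symbol SNR for illustration
q = Q(math.sqrt(Es_over_N0))  # per-quadrature error probability

# Exact QPSK symbol error probability: a symbol is correct only if both
# quadrature components are detected correctly, each failing with prob. q
P_s = 2 * q - q * q

# With Gray mapping, P_b = Q(sqrt(2 Eb / N0)) where Eb = Es / 2,
# which equals the per-quadrature probability q
Eb_over_N0 = Es_over_N0 / 2
P_b = Q(math.sqrt(2 * Eb_over_N0))
print(P_s, P_b)
```

The identity $P_b = q$ reflects that each quadrature component of QPSK carries one Gray-mapped bit and behaves as an independent BPSK channel.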

ML vs MAP Decision Regions

Explore how prior probabilities shift the decision boundaries. With equal priors (ML), boundaries are perpendicular bisectors of the segment between signal points. As the prior probability of $H_0$ increases, the MAP boundary shifts toward $H_1$ (expanding the $H_0$ region). Observe how the overall error probability changes with the prior.


MAP vs ML Decision Boundary Animation

Watch how the MAP decision boundary shifts as the prior probability $P(H_0)$ sweeps from 0.1 to 0.9. When $P(H_0) > 0.5$, the boundary shifts toward $H_1$ (expanding $\mathcal{R}_0$). At $P(H_0) = 0.5$, the MAP boundary coincides with the ML boundary.
The MAP decision boundary (green) shifts relative to the fixed ML boundary (yellow) as the prior $P(H_0)$ varies.

Quick Check

A binary communication system has $P(H_0) = 0.9$ and $P(H_1) = 0.1$. Compared to the ML detector, the MAP detector will:

  • Shift the decision boundary toward $s_1$, reducing overall $P_e$
  • Keep the same decision boundary as ML
  • Shift the decision boundary toward $s_0$, reducing overall $P_e$
  • Always achieve lower $P_e$ than ML regardless of the priors

Common Mistake: Forgetting Prior Probabilities in MAP Detection

Mistake:

Using the ML decision rule (minimum distance) when the transmitted symbols have significantly unequal prior probabilities, and expecting minimum overall error probability.

Correction:

The ML detector is optimal only for uniform priors. When priors are unequal, the MAP detector shifts the decision boundary toward the less likely symbol and achieves strictly lower total $P_e$.

In practice, source coding and scrambling make the bit probabilities approximately $P(0) \approx P(1) \approx 0.5$, so ML is usually a good approximation. However, in systems with unequal symbol probabilities (e.g., probabilistic constellation shaping), using ML instead of MAP wastes the shaping gain.
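The penalty can be quantified for the antipodal Gaussian case by evaluating the total error probability at the ML and MAP thresholds; a sketch where all numbers are assumed for illustration:

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Antipodal signals with strongly unequal priors (assumed values)
s0, s1, sigma = -1.0, 1.0, 0.8
P0, P1 = 0.9, 0.1

def Pe(gamma):
    """Total error probability for threshold test: decide H1 when y > gamma."""
    return P0 * Q((gamma - s0) / sigma) + P1 * Q((s1 - gamma) / sigma)

g_ML = (s0 + s1) / 2
g_MAP = g_ML + sigma ** 2 / (s1 - s0) * math.log(P0 / P1)
print(Pe(g_ML), Pe(g_MAP))
```

With these priors the MAP threshold sits well toward $s_1$, and the total error probability drops substantially relative to the midpoint ML rule.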

Historical Note: Neyman and Pearson

1933

Jerzy Neyman and Egon Pearson published their fundamental lemma on optimal hypothesis testing in 1933, establishing that the likelihood ratio test is the most powerful test at any given significance level. Their work, originally motivated by problems in biological statistics, became the foundation of statistical decision theory and was later adopted by the radar and communications communities in the 1940s-50s. The Neyman-Pearson framework directly influenced the development of optimal receiver design by Woodward, Middleton, and Van Trees.

Historical Note: From Bayes to MAP Detection

1763-1961

Thomas Bayes' posthumously published theorem (1763) on inverse probability lay dormant for nearly two centuries before becoming central to statistical decision theory in the 1950s. The MAP detection rule — choose the hypothesis with the largest posterior probability — is a direct application of Bayes' theorem. Abraham Wald's sequential analysis (1947) and the subsequent development of Bayesian decision theory by Raiffa and Schlaifer (1961) established the framework used in modern communications.

Hypothesis Test

A statistical decision procedure that selects one of two or more hypotheses based on observed data. In digital communications, each hypothesis corresponds to a possible transmitted symbol, and the test determines the detected symbol.

Related: Likelihood Ratio Test (LRT), MAP Detection, ML Detection

Likelihood Ratio

The ratio $\Lambda(y) = p(y \mid H_1) / p(y \mid H_0)$ of the conditional densities of the observation under the two hypotheses. Every optimal detection rule can be expressed as a comparison of the likelihood ratio against a threshold.

Related: Binary Hypothesis Test, Neyman-Pearson Lemma, MAP Detection