The Decision Problem

Why Binary Hypothesis Testing?

Every receiver --- whether it demodulates a BPSK symbol, detects a radar target, classifies a radiology image, or screens for spam --- ultimately answers a yes/no question from noisy evidence. Binary hypothesis testing is the atomic unit of statistical inference: the smallest non-trivial decision problem, and one for which we can derive, analyse, and bound the optimal detector in closed form.

Once we understand the binary case completely, $M$-ary detection, composite testing, and sequential decisions all follow as natural extensions. The likelihood ratio we introduce here will reappear --- renamed, reshaped, vectorised --- in every chapter of this book and in every receiver architecture you will ever design.

Definition:

Binary Hypothesis Testing Problem

A binary hypothesis testing problem consists of:

  1. Two hypotheses $\mathcal{H}_0$ (the null hypothesis) and $\mathcal{H}_1$ (the alternative hypothesis) about the state of nature.
  2. An observation space $\mathcal{Y}$ (typically $\mathbb{R}^n$ or a countable set) together with an observation random variable $Y \in \mathcal{Y}$.
  3. Two conditional densities (or probability mass functions) $f_0(y) = f(y \mid \mathcal{H}_0)$ and $f_1(y) = f(y \mid \mathcal{H}_1)$, specifying how $Y$ is distributed under each hypothesis.

A decision rule (or detector) is a measurable function $g\colon \mathcal{Y} \to \{0, 1\}$ that maps each observation to a decision in favour of $\mathcal{H}_0$ or $\mathcal{H}_1$.

We say the hypotheses are simple when $f_0$ and $f_1$ are completely specified. When either density depends on unknown parameters (e.g., a signal of unknown amplitude), the hypothesis is composite --- treated in Chapter 2.

Definition:

Decision Regions

Every decision rule $g$ partitions the observation space into two disjoint decision regions $\mathcal{Y}_0 = \{y \in \mathcal{Y} : g(y) = 0\}$ and $\mathcal{Y}_1 = \{y \in \mathcal{Y} : g(y) = 1\}$, with $\mathcal{Y}_0 \cup \mathcal{Y}_1 = \mathcal{Y}$ and $\mathcal{Y}_0 \cap \mathcal{Y}_1 = \emptyset$. Conversely, any such partition defines a decision rule. Designing a detector is therefore equivalent to choosing a partition of $\mathcal{Y}$.
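A small sketch of this equivalence between rules and partitions. The threshold value and the observation grid below are illustrative choices, not part of the definition:

```python
# A decision rule g: Y -> {0, 1} and the partition it induces.
# The threshold tau = 0.75 and the grid are illustrative choices.

def g(y, tau=0.75):
    """Threshold detector: decide H1 exactly when y exceeds tau."""
    return 1 if y > tau else 0

# A finite grid of observations, sorted into the decision regions
# Y0 = {y : g(y) = 0} and Y1 = {y : g(y) = 1}.
grid = [i / 10 for i in range(-30, 31)]  # y in [-3.0, 3.0]
Y0 = [y for y in grid if g(y) == 0]
Y1 = [y for y in grid if g(y) == 1]

# The regions are disjoint and exhaust the grid, as the definition requires.
assert set(Y0) | set(Y1) == set(grid)
assert not set(Y0) & set(Y1)
print(f"|Y0| = {len(Y0)}, |Y1| = {len(Y1)}")
```

Choosing a different rule `g` simply redraws the boundary between the two regions.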

Definition:

Type I and Type II Errors

For a decision rule gg with decision regions Y0,Y1\mathcal{Y}_0, \mathcal{Y}_1, define:

  • False-alarm probability (Type I error): $P_f(g) = P(g(Y) = 1 \mid \mathcal{H}_0) = \int_{\mathcal{Y}_1} f_0(y)\,dy$.
  • Miss probability (Type II error): $P_M(g) = P(g(Y) = 0 \mid \mathcal{H}_1) = \int_{\mathcal{Y}_0} f_1(y)\,dy$.
  • Detection probability (power): $P_d(g) = 1 - P_M(g) = \int_{\mathcal{Y}_1} f_1(y)\,dy$.

The names false alarm, miss, and detection come from radar, where $\mathcal{H}_1$ is the presence of a target. In statistics one speaks of size ($P_f$) and power ($P_d$).

Discrete observations use sums over $\mathcal{Y}_0, \mathcal{Y}_1$ in place of integrals. All our results apply to both cases with the obvious substitution.
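For instance, the discrete case can be evaluated by direct summation over the decision regions. The Poisson rates and threshold below are illustrative numbers, not from the text:

```python
import math

# Discrete analogue of the error-probability integrals: replace integrals
# with sums over the decision regions. Illustrative Poisson example
# (the rates lam0, lam1 and threshold tau are made-up numbers).
lam0, lam1 = 2.0, 5.0

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson random variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

tau = 3  # threshold rule: decide H1 when y > tau, so Y1 = {4, 5, ...}

# P_f = sum over Y1 of f0(y); use 1 minus the finite sum over Y0.
P_f = 1.0 - sum(poisson_pmf(k, lam0) for k in range(tau + 1))
# P_M = sum over Y0 of f1(y).
P_M = sum(poisson_pmf(k, lam1) for k in range(tau + 1))
P_d = 1.0 - P_M

print(f"P_f = {P_f:.4f}, P_M = {P_M:.4f}, P_d = {P_d:.4f}")
```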

Key Takeaway

Shrinking $P_f$ enlarges $\mathcal{Y}_0$ and therefore increases $P_M$; shrinking $P_M$ enlarges $\mathcal{Y}_1$ and increases $P_f$. The fundamental tradeoff of detection is that the two error types are coupled through the same partition of $\mathcal{Y}$ --- you cannot reduce both by tuning the detector alone. Only by acquiring more informative data can both be simultaneously shrunk.

Example: Binary Hypothesis Testing with a Gaussian Mean Shift

Let $Y \sim \mathcal{N}(0, 1)$ under $\mathcal{H}_0$ and $Y \sim \mathcal{N}(\mu, 1)$ under $\mathcal{H}_1$, with $\mu > 0$. For a threshold detector $g(y) = \mathbb{1}\{y > \tau\}$, compute $P_f$ and $P_d$ as functions of $\tau$ and $\mu$.

Overlap of Two Gaussians Under $\mathcal{H}_0$ and $\mathcal{H}_1$

Vary the mean separation $\mu$ and the threshold $\tau$. The shaded regions are the false-alarm area (blue, right of $\tau$ under $f_0$) and the miss area (red, left of $\tau$ under $f_1$).

[Interactive figure. Default parameters: mean separation $\mu = 1.5$, common standard deviation $\sigma = 1$, decision threshold $\tau = 0.75$.]
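One way to carry out the computation in the example: under $\mathcal{H}_0$ the mass of $f_0$ to the right of $\tau$ is $Q(\tau)$, and under $\mathcal{H}_1$ the mass of $f_1$ to the right of $\tau$ is $Q(\tau - \mu)$, where $Q$ is the standard-normal tail function. A sketch, with illustrative parameter values:

```python
import math

def Q(x):
    """Standard-normal tail: Q(x) = P(N(0,1) > x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def error_probs(tau, mu):
    """Closed-form P_f and P_d for g(y) = 1{y > tau} in the mean-shift model."""
    P_f = Q(tau)        # under H0, Y ~ N(0, 1)
    P_d = Q(tau - mu)   # under H1, Y ~ N(mu, 1)
    return P_f, P_d

# Sweeping tau exhibits the fundamental tradeoff: raising the threshold
# lowers P_f but also lowers P_d (i.e., raises P_M = 1 - P_d).
mu = 1.5  # illustrative separation
for tau in (0.0, 0.75, 1.5):
    P_f, P_d = error_probs(tau, mu)
    print(f"tau={tau:5.2f}  P_f={P_f:.3f}  P_d={P_d:.3f}  P_M={1 - P_d:.3f}")
```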

Historical Note: Neyman, Pearson, and the Birth of Hypothesis Testing

1920s-1930s

Modern hypothesis testing emerged from a famous collaboration between the Polish statistician Jerzy Neyman (1894-1981) and the English statistician Egon Pearson (1895-1980), son of Karl Pearson. Between 1928 and 1933, working by mail between London and Warsaw, they developed the framework of two hypotheses ($\mathcal{H}_0$ and $\mathcal{H}_1$), the distinction between Type I and Type II errors, and the optimality notion that would become the Neyman-Pearson lemma (Section 1.4).

Their approach broke from Ronald Fisher's significance-testing tradition (which considered only the null hypothesis) and supplied the operational vocabulary --- size, power, critical region --- that every radar engineer, quality inspector, and clinical trialist uses today. Neyman emigrated to Berkeley in 1938 and founded the influential Berkeley statistics department; Pearson succeeded his father at University College London.

Common Mistake: The Direction of Error Probabilities

Mistake:

Treating $P_d$ and $P_f$ symmetrically and writing "$P_d + P_f = 1$".

Correction:

The correct relation is $P_d + P_M = 1$ because $\{g=1\}$ and $\{g=0\}$ are complementary events under a fixed hypothesis. $P_d$ and $P_f$, by contrast, are computed under different hypotheses and are coupled only through the decision region. For a randomized detector that declares $\mathcal{H}_1$ with probability $p$ regardless of the data, $P_f = P_d = p$, so $P_f + P_d = 2p$ can take any value in $[0, 2]$.
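A quick numerical check of the asymmetry, using the Gaussian mean-shift model with a threshold rule (the separation and thresholds are illustrative): $P_d + P_M$ is identically 1, while $P_f + P_d$ drifts with the threshold.

```python
import math

def Q(x):
    """Standard-normal tail probability."""
    return 0.5 * math.erfc(x / math.sqrt(2))

mu = 1.5  # illustrative separation between the hypotheses
for tau in (-1.0, 0.0, 0.75, 2.0):
    P_f = Q(tau)                                    # computed under H0
    P_d = Q(tau - mu)                               # computed under H1
    P_M = 0.5 * math.erfc((mu - tau) / math.sqrt(2))  # mass of f1 left of tau
    # Complementary events under the SAME hypothesis: always sums to 1.
    assert abs(P_d + P_M - 1.0) < 1e-12
    # Probabilities under DIFFERENT hypotheses: no such constraint.
    print(f"tau={tau:5.2f}  P_d + P_M = 1.000  P_f + P_d = {P_f + P_d:.3f}")
```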

Quick Check

A detector has $P_f = 0.1$ and $P_d = 0.8$. What is its miss probability $P_M$?

$P_M = 0.2$

$P_M = 0.9$

$P_M = 0.1$

Cannot be determined without $\pi_0, \pi_1$

Type I error (false alarm)

The event of deciding $\mathcal{H}_1$ when $\mathcal{H}_0$ is true. Its probability $P_f = P(g=1 \mid \mathcal{H}_0)$ is also called the size of the test in classical statistics and the false-alarm rate in radar.

Related: Type II error (miss), significance level, ROC curve

Type II error (miss)

The event of deciding $\mathcal{H}_0$ when $\mathcal{H}_1$ is true. Its probability $P_M = P(g=0 \mid \mathcal{H}_1) = 1 - P_d$. The complementary quantity $P_d = 1 - P_M$ is called the power of the test.

Related: Type I error (false alarm), detection probability, power

Why This Matters: From Binary Hypothesis Testing to BPSK Demodulation

In a BPSK receiver, the transmitter sends either $+\sqrt{E_s}$ (bit 1) or $-\sqrt{E_s}$ (bit 0) over an AWGN channel. The receiver observes $Y = \pm\sqrt{E_s} + W$ with $W \sim \mathcal{N}(0, \sigma^2)$, and must decide which bit was sent. This is exactly the Gaussian mean-shift problem of the example above (Binary Hypothesis Testing with a Gaussian Mean Shift), with $\mu = 2\sqrt{E_s}/\sigma$ in the centred formulation. The LRT we develop in Section 1.3 collapses to threshold detection at zero, and the error probability $P_e = Q(\sqrt{2E_s/N_0})$ is the celebrated BPSK formula. Chapter 2 returns to this link with the general vector-AWGN theory.
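A Monte Carlo sanity check of the BPSK formula is straightforward (a sketch with illustrative $E_s$, $N_0$ values; for real AWGN, $\sigma^2 = N_0/2$):

```python
import math
import random

random.seed(1)  # reproducible run

def Q(x):
    """Standard-normal tail probability."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Simulate the BPSK link and compare against P_e = Q(sqrt(2 Es / N0)).
# Es and N0 are illustrative values, not from the text.
Es, N0 = 1.0, 1.0
sigma = math.sqrt(N0 / 2)  # noise std dev for real AWGN
n = 200_000

errors = 0
for _ in range(n):
    bit = random.randrange(2)                      # equiprobable bits
    s = math.sqrt(Es) if bit else -math.sqrt(Es)   # BPSK mapping
    y = s + random.gauss(0.0, sigma)               # AWGN channel
    decision = 1 if y > 0 else 0                   # threshold at zero
    errors += (decision != bit)

theory = Q(math.sqrt(2 * Es / N0))
print(f"simulated P_e = {errors / n:.4f}, theory = {theory:.4f}")
```

With these numbers the simulated error rate should land close to $Q(\sqrt{2}) \approx 0.079$.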