Chapter Summary

Key Points

  1. The decision problem. A binary hypothesis test is specified by two densities $f_0, f_1$ on an observation space $\mathcal{Y}$ and a decision rule $g\colon \mathcal{Y} \to \{0,1\}$. Performance is captured by two scalars, the false-alarm probability $P_f$ and the detection probability $P_d$ (equivalently, the miss probability $P_M = 1 - P_d$), which trade off against each other through the choice of decision region.

  2. Bayes-optimal rule. With priors $\pi_0, \pi_1$ and costs $C_{ij}$, the rule minimising Bayes risk is the LRT with threshold $\tau^\star = \pi_0(C_{10}-C_{00})/[\pi_1(C_{01}-C_{11})]$. For 0-1 costs this reduces to the MAP rule: decide the hypothesis with the larger posterior. A numerical sketch follows this list.

  3. The likelihood ratio is sufficient. $L(y) = f_1(y)/f_0(y)$ captures all information in $y$ relevant to the decision. Every Bayes-optimal rule, every Neyman-Pearson rule, and every ML rule is a threshold test on $L(Y)$ (or on its monotone transform, the LLR). Compute in the log domain for numerical stability, as illustrated after this list.

  4. Neyman-Pearson lemma. Without priors, fix $P_f \leq \alpha$ and maximise $P_d$: the solution is the (possibly randomised) LRT with the threshold chosen to make $P_f = \alpha$. The proof is a variational argument showing that any other rule wastes false-alarm budget. A threshold-setting sketch appears below.

  5. ROC geometry. Sweeping the LRT threshold traces the ROC curve in $(P_f, P_d)$ space. The ROC is monotone, lies above the diagonal, is concave, and its slope at each operating point equals the LRT threshold there. The AUC summarises separability in a single scalar; see the ROC sketch below.

  6. Bhattacharyya and Chernoff bounds. The MAP error satisfies $P_e^\star \leq \sqrt{\pi_0 \pi_1}\,\rho_B$ (Bhattacharyya) and, more generally, $P_e^\star \leq \pi_0^{1-s} \pi_1^{s} e^{-\mu(s)}$ for every $s \in [0,1]$ (Chernoff), where $\mu(s)$ is concave and vanishes at the endpoints. Both bounds are evaluated numerically below.

  7. Chernoff information is the error exponent. $C(f_0, f_1) = \max_s \mu(s)$ is the exponential rate at which $P_e^\star$ decays with the number of i.i.d. observations. For equal-variance Gaussians, $C = d^2/8$ with $d$ the normalised mean separation, and the Chernoff-optimal tilt is $s^\star = 1/2$; both facts are verified numerically below.

  8. Connection to coding and information theory. The same exponent structure drives random-coding error bounds (ITA Ch. 4) and CFAR radar detection. The LLR representation is the universal interchange format used in LDPC/turbo decoders (CC book) and in message passing (FSI Part V).
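
Numerical Sketches

The following sketches make the key points concrete. All of them assume, purely for illustration, the unit-variance Gaussian pair $H_0\colon Y \sim \mathcal{N}(0,1)$ versus $H_1\colon Y \sim \mathcal{N}(1,1)$; the priors, costs, and sample sizes are also assumptions of the sketches, not values from the chapter. First, key point 2: the Bayes threshold computed from assumed priors and costs, with the resulting rule's $P_f$ and $P_d$ estimated by Monte Carlo.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative model (an assumption of this sketch): H0: Y~N(0,1), H1: Y~N(1,1).
f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)
pi0, pi1 = 0.7, 0.3                  # assumed priors
C = np.array([[0.0, 1.0],            # C[i, j] = cost of deciding H_i
              [1.0, 0.0]])           # when H_j is true (0-1 costs -> MAP)

# Bayes-optimal LRT threshold tau* = pi0 (C10 - C00) / [pi1 (C01 - C11)].
tau = pi0 * (C[1, 0] - C[0, 0]) / (pi1 * (C[0, 1] - C[1, 1]))

def decide(y):
    """Decide H1 (return 1) iff the likelihood ratio L(y) meets the threshold."""
    return (f1.pdf(y) / f0.pdf(y) >= tau).astype(int)

# Monte Carlo estimates of the rule's false-alarm and detection probabilities.
y0 = f0.rvs(size=100_000, random_state=rng)
y1 = f1.rvs(size=100_000, random_state=rng)
print("tau* =", tau)
print("P_f ≈", decide(y0).mean(), "  P_d ≈", decide(y1).mean())
```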
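
Key point 3's advice to work in the log domain, shown on the same assumed pair: logpdf differences stay finite where the direct density ratio underflows.

```python
import numpy as np
from scipy.stats import norm

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above

def llr(y):
    """LLR log L(y) = log f1(y) - log f0(y), computed in the log domain
    so that extreme observations do not underflow the densities."""
    return f1.logpdf(y) - f0.logpdf(y)

y = np.array([-40.0, 0.0, 0.5, 40.0])
print(llr(y))                                # finite for every y
with np.errstate(divide="ignore", invalid="ignore"):
    print(f1.pdf(-40.0) / f0.pdf(-40.0))     # direct ratio: 0/0 -> nan
```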
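
Key point 4: setting the Neyman-Pearson threshold for the same assumed pair. Because $L(y)$ is increasing in $y$ here, the LRT reduces to thresholding $y$ itself, and no randomisation is needed (the densities are continuous, so $L(Y)$ has no point masses).

```python
from scipy.stats import norm

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above
alpha = 0.05                               # false-alarm budget

# Decide H1 iff y >= eta, with eta set to spend the budget exactly:
# P_f = P0(Y >= eta) = alpha.
eta = f0.isf(alpha)           # inverse survival function under H0
p_d = f1.sf(eta)              # detection probability this buys
print(f"eta = {eta:.4f}  P_f = {alpha}  P_d = {p_d:.4f}")
```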
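
Key point 5: sweeping the threshold traces the ROC; the sketch also checks numerically that the ROC slope at an operating point equals the LRT threshold there, and computes the AUC by trapezoidal quadrature.

```python
import numpy as np
from scipy.stats import norm

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above

# Sweep the threshold on y (equivalently the LRT threshold, since L(y)
# is increasing in y here) to trace the ROC.
eta = np.linspace(-6.0, 7.0, 2001)
pfa = f0.sf(eta)             # P_f at each threshold
pdet = f1.sf(eta)            # P_d at each threshold

# AUC by the trapezoidal rule, with P_f reordered to run left to right.
order = np.argsort(pfa)
auc = np.sum(0.5 * (pdet[order][1:] + pdet[order][:-1]) * np.diff(pfa[order]))
print("AUC ≈", auc)          # 0.5 = useless, 1.0 = perfectly separable

# Slope check at an interior operating point.
i = 1000
slope = (pdet[i + 1] - pdet[i - 1]) / (pfa[i + 1] - pfa[i - 1])
print("ROC slope ≈", slope, "  L(eta) =", f1.pdf(eta[i]) / f0.pdf(eta[i]))
```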
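
Key point 6: $\mu(s) = -\log \int f_0^{1-s} f_1^{s}\,dy$ evaluated by quadrature, giving the Bhattacharyya coefficient $\rho_B = e^{-\mu(1/2)}$ and the Chernoff bound for several values of $s$; equal priors are assumed.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above
pi0, pi1 = 0.5, 0.5                        # assumed equal priors

def mu(s):
    """mu(s) = -log of the integral of f0(y)^(1-s) f1(y)^s over y."""
    val, _ = quad(lambda y: f0.pdf(y) ** (1 - s) * f1.pdf(y) ** s,
                  -np.inf, np.inf)
    return -np.log(val)

# Bhattacharyya: rho_B = exp(-mu(1/2)); bound P_e* <= sqrt(pi0 pi1) rho_B.
rho_B = np.exp(-mu(0.5))
print("rho_B =", rho_B, "  Bhattacharyya bound:", np.sqrt(pi0 * pi1) * rho_B)

# Chernoff: the bound pi0^(1-s) pi1^s exp(-mu(s)) holds for every s in [0,1].
for s in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"s = {s}:  bound = {pi0**(1 - s) * pi1**s * np.exp(-mu(s)):.4f}")
```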
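
Key point 7: maximising the same $\mu(s)$ over $s \in [0,1]$ recovers the Chernoff information; for this pair $d = 1$, so the closed form predicts $C = 1/8$ at $s^\star = 1/2$.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, d = 1

def mu(s):
    val, _ = quad(lambda y: f0.pdf(y) ** (1 - s) * f1.pdf(y) ** s,
                  -np.inf, np.inf)
    return -np.log(val)

# Chernoff information C = max over s in [0,1] of mu(s).
res = minimize_scalar(lambda s: -mu(s), bounds=(0.0, 1.0), method="bounded")
s_star, C = res.x, -res.fun
print(f"s* = {s_star:.4f} (expect 0.5)   C = {C:.4f} (expect d^2/8 = 0.125)")

# C is the error exponent: with n i.i.d. observations, P_e* decays like exp(-nC).
for n in (1, 10, 100):
    print(f"n = {n:3d}:  exp(-nC) = {np.exp(-n * C):.3e}")
```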

Looking Ahead

Chapter 2 applies these ideas to the detection of known signals in additive Gaussian noise, deriving the matched filter as the LRT test statistic and obtaining the celebrated BPSK error probability $P_e = Q(\sqrt{2E_s/N_0})$. Chapter 3 introduces composite hypothesis testing via the GLRT, relaxing the assumption of fully known densities. Chapter 4 extends to coloured Gaussian noise via prewhitening, and to continuous-time detection via the $L^2$ signal-space viewpoint. The likelihood ratio we have introduced here will reappear in every subsequent chapter: as a message in a factor graph, as a posterior in Bayesian estimation, and as a pairwise comparison in coded-system error analysis.
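
As a small preview of the Chapter 2 result, the BPSK formula can be evaluated directly, using $Q(x) = 1 - \Phi(x)$ (norm.sf in SciPy); the $E_s/N_0$ grid below is illustrative.

```python
import numpy as np
from scipy.stats import norm

# Q(x) is the standard Gaussian tail; SciPy exposes it as norm.sf.
for esn0_db in (0, 4, 8):                  # illustrative E_s/N_0 values in dB
    esn0 = 10 ** (esn0_db / 10)
    pe = norm.sf(np.sqrt(2 * esn0))        # BPSK: P_e = Q(sqrt(2 E_s / N_0))
    print(f"Es/N0 = {esn0_db} dB  ->  P_e = {pe:.3e}")
```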