Chapter Summary

Key Points

  1. The decision problem. A binary hypothesis test is specified by two densities $f_0, f_1$ on an observation space $\mathcal{Y}$ and a decision rule $g\colon \mathcal{Y} \to \{0,1\}$. Performance is captured by two scalars, the false-alarm probability $P_f$ and the detection probability $P_d$ (equivalently, the miss probability $P_M = 1 - P_d$), which trade off against each other through the choice of decision region.

  2. Bayes-optimal rule. With priors $\pi_0, \pi_1$ and costs $C_{ij}$, the rule minimising Bayes risk is the LRT with threshold $\tau^\star = \pi_0(C_{10}-C_{00})/[\pi_1(C_{01}-C_{11})]$. For 0-1 costs this reduces to the MAP rule: decide the hypothesis with the larger posterior. A numerical sketch follows this list.

  3. The likelihood ratio is sufficient. $L(y) = f_1(y)/f_0(y)$ captures all information in $y$ relevant to the decision. Every Bayes-optimal rule, every Neyman-Pearson rule, and every ML rule is a threshold test on $L(Y)$ (or on its monotone transform, the LLR). Compute in the log domain for numerical stability, as illustrated after this list.

  4. Neyman-Pearson lemma. Without priors, fix $P_f \leq \alpha$ and maximise $P_d$: the solution is the (possibly randomised) LRT with the threshold chosen to make $P_f = \alpha$. The proof is a variational argument showing that any other rule wastes false-alarm budget. A threshold-setting sketch appears below.

  5. ROC geometry. Sweeping the LRT threshold traces the ROC curve in $(P_f, P_d)$ space. The ROC is monotone, lies above the diagonal, is concave, and its slope at each operating point equals the LRT threshold there. The AUC summarises separability in a single scalar; see the ROC sketch below.

  6. Bhattacharyya and Chernoff bounds. The MAP error satisfies $P_e^\star \leq \sqrt{\pi_0 \pi_1}\,\rho_B$ (Bhattacharyya) and, more generally, $P_e^\star \leq \pi_0^{1-s} \pi_1^{s} e^{-\mu(s)}$ for every $s \in [0,1]$ (Chernoff), where $\mu(s)$ is concave and vanishes at the endpoints. Both bounds are evaluated numerically below.

  7. Chernoff information is the error exponent. $C(f_0, f_1) = \max_s \mu(s)$ is the exponential rate at which $P_e^\star$ decays with the number of i.i.d. observations. For equal-variance Gaussians, $C = d^2/8$ with $d$ the normalised mean separation, and the Chernoff-optimal tilt is $s^\star = 1/2$; both facts are verified numerically below.

  8. Connection to coding and information theory. The same exponent structure drives random-coding error bounds (ITA Ch. 4) and CFAR radar detection. The LLR representation is the universal interchange format used in LDPC/turbo decoders (CC book) and in message passing (FSI Part V).
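
Numerical Sketches

The following sketches make the key points concrete. All of them assume, purely for illustration, the unit-variance Gaussian pair $H_0\colon Y \sim \mathcal{N}(0,1)$ versus $H_1\colon Y \sim \mathcal{N}(1,1)$; the priors, costs, and sample sizes are also assumptions of the sketches, not values from the chapter. First, key point 2: the Bayes threshold computed from assumed priors and costs, with the resulting rule's $P_f$ and $P_d$ estimated by Monte Carlo.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative model (an assumption of this sketch): H0: Y~N(0,1), H1: Y~N(1,1).
f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)
pi0, pi1 = 0.7, 0.3                  # assumed priors
C = np.array([[0.0, 1.0],            # C[i, j] = cost of deciding H_i
              [1.0, 0.0]])           # when H_j is true (0-1 costs -> MAP)

# Bayes-optimal LRT threshold tau* = pi0 (C10 - C00) / [pi1 (C01 - C11)].
tau = pi0 * (C[1, 0] - C[0, 0]) / (pi1 * (C[0, 1] - C[1, 1]))

def decide(y):
    """Decide H1 (return 1) iff the likelihood ratio L(y) meets the threshold."""
    return (f1.pdf(y) / f0.pdf(y) >= tau).astype(int)

# Monte Carlo estimates of the rule's false-alarm and detection probabilities.
y0 = f0.rvs(size=100_000, random_state=rng)
y1 = f1.rvs(size=100_000, random_state=rng)
print("tau* =", tau)
print("P_f ≈", decide(y0).mean(), "  P_d ≈", decide(y1).mean())
```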
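
Key point 3's advice to work in the log domain, shown on the same assumed pair: logpdf differences stay finite where the direct density ratio underflows.

```python
import numpy as np
from scipy.stats import norm

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above

def llr(y):
    """LLR log L(y) = log f1(y) - log f0(y), computed in the log domain
    so that extreme observations do not underflow the densities."""
    return f1.logpdf(y) - f0.logpdf(y)

y = np.array([-40.0, 0.0, 0.5, 40.0])
print(llr(y))                                # finite for every y
with np.errstate(divide="ignore", invalid="ignore"):
    print(f1.pdf(-40.0) / f0.pdf(-40.0))     # direct ratio: 0/0 -> nan
```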
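
Key point 4: setting the Neyman-Pearson threshold for the same assumed pair. Because $L(y)$ is increasing in $y$ here, the LRT reduces to thresholding $y$ itself, and no randomisation is needed (the densities are continuous, so $L(Y)$ has no point masses).

```python
from scipy.stats import norm

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above
alpha = 0.05                               # false-alarm budget

# Decide H1 iff y >= eta, with eta set to spend the budget exactly:
# P_f = P0(Y >= eta) = alpha.
eta = f0.isf(alpha)           # inverse survival function under H0
p_d = f1.sf(eta)              # detection probability this buys
print(f"eta = {eta:.4f}  P_f = {alpha}  P_d = {p_d:.4f}")
```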
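
Key point 5: sweeping the threshold traces the ROC; the sketch also checks numerically that the ROC slope at an operating point equals the LRT threshold there, and computes the AUC by trapezoidal quadrature.

```python
import numpy as np
from scipy.stats import norm

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above

# Sweep the threshold on y (equivalently the LRT threshold, since L(y)
# is increasing in y here) to trace the ROC.
eta = np.linspace(-6.0, 7.0, 2001)
pfa = f0.sf(eta)             # P_f at each threshold
pdet = f1.sf(eta)            # P_d at each threshold

# AUC by the trapezoidal rule, with P_f reordered to run left to right.
order = np.argsort(pfa)
auc = np.sum(0.5 * (pdet[order][1:] + pdet[order][:-1]) * np.diff(pfa[order]))
print("AUC ≈", auc)          # 0.5 = useless, 1.0 = perfectly separable

# Slope check at an interior operating point.
i = 1000
slope = (pdet[i + 1] - pdet[i - 1]) / (pfa[i + 1] - pfa[i - 1])
print("ROC slope ≈", slope, "  L(eta) =", f1.pdf(eta[i]) / f0.pdf(eta[i]))
```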
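
Key point 6: $\mu(s) = -\log \int f_0^{1-s} f_1^{s}\,dy$ evaluated by quadrature, giving the Bhattacharyya coefficient $\rho_B = e^{-\mu(1/2)}$ and the Chernoff bound for several values of $s$; equal priors are assumed.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, as above
pi0, pi1 = 0.5, 0.5                        # assumed equal priors

def mu(s):
    """mu(s) = -log of the integral of f0(y)^(1-s) f1(y)^s over y."""
    val, _ = quad(lambda y: f0.pdf(y) ** (1 - s) * f1.pdf(y) ** s,
                  -np.inf, np.inf)
    return -np.log(val)

# Bhattacharyya: rho_B = exp(-mu(1/2)); bound P_e* <= sqrt(pi0 pi1) rho_B.
rho_B = np.exp(-mu(0.5))
print("rho_B =", rho_B, "  Bhattacharyya bound:", np.sqrt(pi0 * pi1) * rho_B)

# Chernoff: the bound pi0^(1-s) pi1^s exp(-mu(s)) holds for every s in [0,1].
for s in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"s = {s}:  bound = {pi0**(1 - s) * pi1**s * np.exp(-mu(s)):.4f}")
```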
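
Key point 7: maximising the same $\mu(s)$ over $s \in [0,1]$ recovers the Chernoff information; for this pair $d = 1$, so the closed form predicts $C = 1/8$ at $s^\star = 1/2$.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

f0, f1 = norm(0.0, 1.0), norm(1.0, 1.0)   # illustrative pair, d = 1

def mu(s):
    val, _ = quad(lambda y: f0.pdf(y) ** (1 - s) * f1.pdf(y) ** s,
                  -np.inf, np.inf)
    return -np.log(val)

# Chernoff information C = max over s in [0,1] of mu(s).
res = minimize_scalar(lambda s: -mu(s), bounds=(0.0, 1.0), method="bounded")
s_star, C = res.x, -res.fun
print(f"s* = {s_star:.4f} (expect 0.5)   C = {C:.4f} (expect d^2/8 = 0.125)")

# C is the error exponent: with n i.i.d. observations, P_e* decays like exp(-nC).
for n in (1, 10, 100):
    print(f"n = {n:3d}:  exp(-nC) = {np.exp(-n * C):.3e}")
```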

Looking Ahead

Chapter 2 applies these ideas to the detection of known signals in additive Gaussian noise, deriving the matched filter as the LRT test statistic and obtaining the celebrated BPSK error probability $P_e = Q(\sqrt{2E_s/N_0})$. Chapter 3 introduces composite hypothesis testing via the GLRT, relaxing the assumption of fully known densities. Chapter 4 extends to coloured Gaussian noise via prewhitening, and to continuous-time detection via the $L^2$ signal-space viewpoint. The likelihood ratio we have introduced here will reappear in every subsequent chapter: as a message in a factor graph, as a posterior in Bayesian estimation, and as a pairwise comparison in coded-system error analysis.
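
As a small preview of the Chapter 2 result, the BPSK formula can be evaluated directly, using $Q(x) = 1 - \Phi(x)$ (norm.sf in SciPy); the $E_s/N_0$ grid below is illustrative.

```python
import numpy as np
from scipy.stats import norm

# Q(x) is the standard Gaussian tail; SciPy exposes it as norm.sf.
for esn0_db in (0, 4, 8):                  # illustrative E_s/N_0 values in dB
    esn0 = 10 ** (esn0_db / 10)
    pe = norm.sf(np.sqrt(2 * esn0))        # BPSK: P_e = Q(sqrt(2 E_s / N_0))
    print(f"Es/N0 = {esn0_db} dB  ->  P_e = {pe:.3e}")
```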