The Likelihood Ratio Test
One Statistic to Rule Them All
The striking fact that emerged from Section 1.2 is this: for every reasonable cost structure and every pair of priors, the optimal rule compares the same function of the data --- the likelihood ratio --- to a threshold that absorbs all the problem-specific constants. Change the priors, change the costs, or move to the Neyman-Pearson criterion of Section 1.4: only the threshold changes. The likelihood ratio is the universal sufficient statistic for binary testing, and understanding its distribution is tantamount to understanding every binary detector.
Definition: Likelihood Ratio Test (LRT)
A likelihood ratio test with threshold $\eta > 0$ is the decision rule
$$\delta(y) = \begin{cases} 1, & L(y) > \eta, \\ 0, & L(y) < \eta, \end{cases} \qquad L(y) = \frac{p_1(y)}{p_0(y)},$$
where $L(y)$ is the likelihood ratio. When $L(y) = \eta$ the decision may be 0, 1, or (for randomised tests) random. The log-likelihood ratio is $\ell(y) = \ln L(y)$, and the LRT is equivalently $\ell(y) \gtrless \ln \eta$.
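A minimal sketch of the rule in code (plain Python; `lrt_decide`, its arguments, and the randomised tie-break are illustrative choices, not from the text):

```python
import numpy as np

def lrt_decide(p1, p0, y, eta=1.0):
    """Likelihood ratio test: decide 1 if L(y) > eta, 0 if L(y) < eta.

    p1, p0: density functions under H1 and H0; eta: threshold.
    On the boundary L(y) == eta any decision is admissible; here we randomise.
    """
    L = p1(y) / p0(y)                    # likelihood ratio L(y)
    if L > eta:
        return 1
    if L < eta:
        return 0
    return int(np.random.default_rng().random() < 0.5)   # randomised tie-break

# Example: N(1, 1) vs N(0, 1) with threshold eta = 1 (the ML rule)
from scipy.stats import norm
print(lrt_decide(norm(1, 1).pdf, norm(0, 1).pdf, y=0.7))
```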
Working with $\ell(y)$ is almost always more convenient: for product densities the LLR adds, giving $\ell(y) = \sum_{k=1}^{n} \ell_k(y_k)$. Log also converts exponential-family densities to affine statistics.
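To see the exponential-family point concretely: for a natural-parameter family $p_\theta(y) = h(y) \exp(\theta\, T(y) - A(\theta))$, the base factor $h(y)$ cancels in the ratio and
$$\ell(y) = \ln \frac{p_{\theta_1}(y)}{p_{\theta_0}(y)} = (\theta_1 - \theta_0)\, T(y) - \bigl( A(\theta_1) - A(\theta_0) \bigr),$$
an affine function of the natural statistic $T(y)$.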
Theorem: Sufficiency of the Likelihood Ratio
For binary hypothesis testing, the likelihood ratio $L(Y)$ is a sufficient statistic. That is, the conditional distribution of $Y$ given $L(Y)$ does not depend on which hypothesis is true, and any Bayes-optimal decision rule is a function of $L(y)$ alone.
Sufficiency means the LR captures every bit of information about $H$ that $Y$ contains. Knowing $y$ beyond its likelihood ratio tells you nothing more about which hypothesis is true --- the "extra" randomness is independent of $H$. This is why every optimal detector (Bayes, MAP, ML, Neyman-Pearson) compresses $y$ down to $L(y)$ before making its call.
Recall the Fisher-Neyman factorization criterion: $T(Y)$ is sufficient iff $p_j(y) = g_j(T(y))\, h(y)$ for each $j \in \{0, 1\}$, with $h$ free of $j$.
Set $T(y) = L(y)$ and try to express $p_0$ and $p_1$ through $L(y)$ and a common factor.
Factorization criterion
Define $T(y) = L(y) = p_1(y)/p_0(y)$. Then trivially
$$p_0(y) = 1 \cdot p_0(y), \qquad p_1(y) = L(y)\, p_0(y).$$
Setting $g_0(t) = 1$, $g_1(t) = t$, $h(y) = p_0(y)$, we obtain $p_j(y) = g_j(T(y))\, h(y)$ for $j \in \{0, 1\}$, so $L(Y)$ is sufficient by the Fisher-Neyman factorization theorem.
Bayes rule factors through the LR
The Bayes rule of the Bayes-Optimal Decision Rule theorem is $\delta_B(y) = \mathbf{1}\{L(y) > \eta\}$, i.e., a measurable function of $L(y)$ alone. By the Rao-Blackwell principle, no rule using more of $y$ can improve upon it.
Theorem: Invariance under Monotone Transformations
Let $g : (0, \infty) \to \mathbb{R}$ be strictly increasing. Then the decision rule $g(L(y)) \gtrless g(\eta)$ is identical to the LRT $L(y) \gtrless \eta$ for every observation $y$. In particular, the LRT is equivalent to a threshold test on any monotone transformation of the LR ($\ln L$, positive linear rescalings, ...).
Monotonicity preserves ordering
Because $g$ is strictly increasing, $L(y) > \eta$ iff $g(L(y)) > g(\eta)$. The decision regions $\{y : L(y) > \eta\}$ and $\{y : g(L(y)) > g(\eta)\}$ coincide. Consequently the error probabilities are unchanged.
Example: LRT for i.i.d. Gaussian Samples
Observations $Y_1, \dots, Y_n$ are i.i.d. with
$$H_0 : Y_k \sim \mathcal{N}(0, \sigma^2), \qquad H_1 : Y_k \sim \mathcal{N}(\mu, \sigma^2), \qquad \mu > 0.$$
Derive the LRT as a rule on the sample mean $\bar y = \frac{1}{n} \sum_{k=1}^{n} y_k$.
Write the log-likelihood ratio
For each sample,
$$\ell_k(y_k) = \ln \frac{p_1(y_k)}{p_0(y_k)} = \frac{\mu y_k}{\sigma^2} - \frac{\mu^2}{2\sigma^2}.$$
Summing over $k$,
$$\ell(y) = \frac{\mu}{\sigma^2} \sum_{k=1}^{n} y_k - \frac{n\mu^2}{2\sigma^2} = \frac{n\mu}{\sigma^2} \left( \bar y - \frac{\mu}{2} \right).$$
Reduce to a test on $\bar y$
Because $\mu > 0$, the map $\bar y \mapsto \ell(y)$ is strictly increasing, and the LRT is equivalent to
$$\bar y \underset{H_0}{\overset{H_1}{\gtrless}} \tau, \qquad \tau = \frac{\mu}{2} + \frac{\sigma^2 \ln \eta}{n \mu}.$$
The sufficient statistic is the sample mean --- all other aspects of $y$ are irrelevant.
Distribution of $\bar Y$ under each hypothesis
Under $H_0$, $\bar Y \sim \mathcal{N}(0, \sigma^2/n)$; under $H_1$, $\bar Y \sim \mathcal{N}(\mu, \sigma^2/n)$. Hence
$$P_F = Q\!\left( \frac{\tau}{\sigma/\sqrt{n}} \right), \qquad P_D = Q\!\left( \frac{\tau - \mu}{\sigma/\sqrt{n}} \right).$$
The effective SNR of the test grows linearly with $n$: $d^2 = n\mu^2/\sigma^2$.
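A minimal Monte Carlo sketch of this example (Python with NumPy/SciPy; parameter values and variable names are illustrative) that implements the sample-mean test and checks the closed-form $P_F$ and $P_D$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

mu, sigma, n = 1.0, 2.0, 25             # signal mean, noise std, block length
log_eta = 0.0                           # LRT log-threshold (log_eta = 0 is the ML rule)
tau = mu / 2 + sigma**2 * log_eta / (n * mu)   # equivalent sample-mean threshold

trials = 100_000
ybar0 = rng.normal(0.0, sigma, (trials, n)).mean(axis=1)   # sample means under H0
ybar1 = rng.normal(mu,  sigma, (trials, n)).mean(axis=1)   # sample means under H1

P_F = np.mean(ybar0 > tau)              # decide H1 iff ybar > tau
P_D = np.mean(ybar1 > tau)

se = sigma / np.sqrt(n)                 # std of the sample mean; norm.sf is Q(.)
print(f"P_F: sim {P_F:.4f}  theory {norm.sf(tau / se):.4f}")
print(f"P_D: sim {P_D:.4f}  theory {norm.sf((tau - mu) / se):.4f}")
```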
Compute the LLR, Not the LR
In practice, always compute the log-likelihood ratio, never the likelihood ratio directly. Three reasons:
- Underflow/overflow. For a block of $n$ i.i.d. samples with typical SNR, $L(y)$ can easily exceed $10^{308}$ or fall below $10^{-308}$, overflowing double precision. The LLR remains an $O(n)$-magnitude real.
- Addition beats multiplication. $n$ i.i.d. samples give $\ell(y) = \sum_{k=1}^{n} \ell_k(y_k)$, a single vectorised summation. Each summand is typically $O(1)$, keeping precision throughout.
- LDPC / turbo decoders. Every modern iterative decoder passes LLRs as messages. The LLR representation is the universal interchange format for soft information.
Two further log-domain habits (see the sketch after this list):
- Represent posteriors and priors as log-domain quantities.
- Use log-sum-exp when marginalising (converts $\ln \sum_i e^{a_i}$ into a numerically stable form).
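A small numerical sketch of these points (Python; the block length and scores are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 1.0, 1.0, 2000
y = rng.normal(mu, sigma, n)                  # one block drawn under H1

# Per-sample LLR for N(mu, sigma^2) vs N(0, sigma^2), as in the Gaussian example
llr_terms = mu * y / sigma**2 - mu**2 / (2 * sigma**2)

print(np.exp(llr_terms).prod())               # direct LR: overflows to inf
print(llr_terms.sum())                        # LLR: O(n) real, ~ n*mu^2/(2*sigma^2)

def logsumexp(a):
    """Numerically stable ln(sum_i exp(a_i)): shift by the max before exponentiating."""
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

scores = np.array([-1050.0, -1049.0, -1100.0])   # log-domain quantities
print(logsumexp(scores))                      # naive log(sum(exp(.))) would give -inf
```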
LRT Decision Regions in $\mathbb{R}^2$
Two-dimensional observation $y = (y_1, y_2)$, with $H_0 : Y \sim \mathcal{N}(m_0, \Sigma)$ vs. $H_1 : Y \sim \mathcal{N}(m_1, \Sigma)$ (common covariance). The LRT decision boundary is a hyperplane; changing the threshold shifts it parallel to itself.
[Interactive figure: the adjustable parameter is the log-threshold $\ln \eta$ of the LRT.]
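A sketch of the boundary computation under the common-covariance Gaussian model of the figure (means, covariance, and names are illustrative):

```python
import numpy as np

# Common-covariance Gaussian pair in R^2: H0 mean m0, H1 mean m1
m0 = np.array([0.0, 0.0])
m1 = np.array([2.0, 1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

w = Sigma_inv @ (m1 - m0)                                    # boundary normal
b = 0.5 * (m0 @ Sigma_inv @ m0 - m1 @ Sigma_inv @ m1)

def llr(y):
    """LLR of N(m1, Sigma) vs N(m0, Sigma): affine in y (quadratics cancel)."""
    return y @ w + b

# Decide H1 iff llr(y) > log_eta. The boundary {y : w @ y + b = log_eta} is a
# line in R^2; increasing log_eta translates it along w without rotating it.
print(llr(np.array([1.0, 0.5])))
```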
The Monotone Likelihood Ratio Property
A family of densities $\{p_\theta\}$ has the monotone likelihood ratio (MLR) property in a statistic $T(y)$ if $p_{\theta_1}(y)/p_{\theta_0}(y)$ is a non-decreasing function of $T(y)$ whenever $\theta_1 > \theta_0$. The Gaussian shift family, binomial in $p$, Poisson in $\lambda$, and all one-parameter exponential families in their natural parameter possess MLR. When MLR holds, LRTs reduce to simple threshold tests on $T(y)$, and the Neyman-Pearson lemma extends to one-sided composite alternatives (Chapter 2).
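For instance, in the Gaussian shift family $\{\mathcal{N}(\theta, \sigma^2)\}$ the MLR property can be read off directly:
$$\frac{p_{\theta_1}(y)}{p_{\theta_0}(y)} = \exp\!\left( \frac{(\theta_1 - \theta_0)\, y}{\sigma^2} - \frac{\theta_1^2 - \theta_0^2}{2\sigma^2} \right),$$
which is increasing in $y$ whenever $\theta_1 > \theta_0$, so the LRT collapses to the threshold test $y \gtrless \tau$.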
Common Mistake: Two-Sided Alternatives Are Not LRTs on a Single Statistic
Mistake:
Assuming every "reasonable" rule can be written as a one-sided threshold test on the raw observation $y$.
Correction:
If the alternative is two-sided (e.g., $H_0 : \mu = 0$ vs. $H_1 : \mu \neq 0$), the LR is not monotone in $y$, and the LRT decision region is typically $\{y : |y| > \tau\}$ --- the absolute value $|y|$ is the sufficient statistic, not $y$ itself. Always derive the decision region from the LR directly; do not assume it is a half-line.
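A standard concrete case (not worked out in the text above): if $H_1$ places equal weight on the means $\pm \mu$ against $H_0 : \mathcal{N}(0, \sigma^2)$, then
$$L(y) = \frac{\tfrac{1}{2} p_{+\mu}(y) + \tfrac{1}{2} p_{-\mu}(y)}{p_0(y)} = e^{-\mu^2/(2\sigma^2)} \cosh\!\left( \frac{\mu y}{\sigma^2} \right),$$
which is increasing in $|y|$ but not in $y$; the region $\{L(y) > \eta\}$ is exactly $\{|y| > \tau\}$, two half-lines rather than one.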
Quick Check
Which transformations of the likelihood ratio preserve the LRT decision rule?
Any strictly increasing transformation, for example the logarithm: log is strictly increasing on $(0, \infty)$, so monotone invariance (the Invariance under Monotone Transformations theorem) applies.
Sufficient statistic
A statistic $T(Y)$ is sufficient for a hypothesis $H$ if the conditional distribution of $Y$ given $T(Y)$ does not depend on $H$. Equivalently, by the Fisher-Neyman factorization, $p_j(y) = g_j(T(y))\, h(y)$. For binary testing, the likelihood ratio $L(Y)$ is always sufficient.
Related: likelihood ratio, Fisher-Neyman theorem, Rao-Blackwell theorem
Historical Note: Fisher and the Likelihood Principle
1920s. Ronald A. Fisher (1890-1962) introduced the concept of likelihood as distinct from probability in his 1922 paper "On the mathematical foundations of theoretical statistics". Fisher argued that once data are observed, the likelihood function contains all the information the data carry about the unknown parameter. Neyman and Pearson then showed that binary decisions based on ratios of likelihoods are optimal --- closing the loop between Fisher's likelihood principle and the operational decision-theoretic framework.