The Bayes Decision Rule
When Priors Are Available
Suppose we know --- from long experience or from the system design --- that target-present events occur with prior probability $\pi_1 = P(\mathcal{H}_1)$. Suppose further that false alarms and misses have quantifiable costs: a missed cancer diagnosis is catastrophic, a false burglar alarm is merely annoying. The Bayes framework bundles all of this side information into a single scalar criterion, the Bayes risk, and produces the decision rule that minimises it. The Bayes rule is the gold standard against which every other detector is measured.
Definition: Bayes Risk
Let $\pi_0 = P(\mathcal{H}_0)$, $\pi_1 = P(\mathcal{H}_1)$ be the prior probabilities, and let $C_{ij}$ be the cost of deciding $\mathcal{H}_i$ when $\mathcal{H}_j$ is true. The Bayes risk of a decision rule $\delta$ is
$$r(\delta) = \sum_{i=0}^{1} \sum_{j=0}^{1} C_{ij}\, \pi_j\, P(\text{decide } \mathcal{H}_i \mid \mathcal{H}_j).$$
Expanding, with $P_F = P(\text{decide } \mathcal{H}_1 \mid \mathcal{H}_0)$ and $P_D = P(\text{decide } \mathcal{H}_1 \mid \mathcal{H}_1)$,
$$r(\delta) = \pi_0 \bigl[ C_{00}(1 - P_F) + C_{10} P_F \bigr] + \pi_1 \bigl[ C_{01}(1 - P_D) + C_{11} P_D \bigr].$$
A Bayes rule is any decision rule that minimises $r(\delta)$.
We always assume the reasonable cost condition $C_{10} > C_{00}$ and $C_{01} > C_{11}$ --- erring is more expensive than deciding correctly under each hypothesis. Without this condition the "optimal" rule may ignore the data entirely.
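As a numerical companion to the definition, here is a minimal sketch of the expanded risk formula; the function name `bayes_risk` and the example numbers are illustrative, not from the text.

```python
def bayes_risk(pi1, P_F, P_D, C00=0.0, C10=1.0, C01=1.0, C11=0.0):
    """Expanded Bayes risk r(delta) for a binary test.

    pi1 : prior probability of H1 (pi0 = 1 - pi1)
    P_F : false-alarm probability P(decide H1 | H0)
    P_D : detection probability   P(decide H1 | H1)
    Cij : cost of deciding Hi when Hj is true (defaults: 0-1 costs)
    """
    pi0 = 1.0 - pi1
    return (pi0 * (C00 * (1 - P_F) + C10 * P_F)
            + pi1 * (C01 * (1 - P_D) + C11 * P_D))

# Under 0-1 costs the risk is the average error probability
# pi0 * P_F + pi1 * (1 - P_D):
print(bayes_risk(pi1=0.5, P_F=0.1, P_D=0.8))  # 0.5*0.1 + 0.5*0.2 = 0.15
```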
Theorem: Bayes-Optimal Decision Rule
Under the reasonable cost condition, the Bayes-optimal decision rule decides $\mathcal{H}_1$ on the set
$$\Gamma_1 = \left\{ y : \frac{p_1(y)}{p_0(y)} > \eta \right\}, \qquad \eta = \frac{\pi_0 (C_{10} - C_{00})}{\pi_1 (C_{01} - C_{11})},$$
and $\mathcal{H}_0$ on the complement. Ties (the set $\{ y : p_1(y)/p_0(y) = \eta \}$) may be assigned arbitrarily without affecting the risk.
Pointwise, the Bayes rule compares the posterior-weighted risk of each decision at $y$ and picks the cheaper one. Because posteriors are proportional to $\pi_j\, p_j(y)$ (prior $\times$ likelihood), the comparison reduces to a likelihood-ratio test against a threshold that encodes both the priors and the cost structure.
Write the Bayes risk as an integral over $y$ and separate the contributions from $\Gamma_0$ and $\Gamma_1$.
Collect all terms that depend on $\Gamma_1$ --- these are the only free degrees of freedom.
Assign $y$ to $\Gamma_1$ iff the integrand at $y$ is negative (it lowers the risk).
Express the risk as an integral
Using $P_F = \int_{\Gamma_1} p_0(y)\,dy$ and $P_D = \int_{\Gamma_1} p_1(y)\,dy$, and $1 - P_F = \int_{\Gamma_0} p_0(y)\,dy$, $1 - P_D = \int_{\Gamma_0} p_1(y)\,dy$,
$$r(\delta) = \int_{\Gamma_0} \bigl[ \pi_0 C_{00}\, p_0(y) + \pi_1 C_{01}\, p_1(y) \bigr]\, dy + \int_{\Gamma_1} \bigl[ \pi_0 C_{10}\, p_0(y) + \pi_1 C_{11}\, p_1(y) \bigr]\, dy.$$
Move to a single integral
Using $\int_{\Gamma_0} = \int - \int_{\Gamma_1}$,
$$r(\delta) = \int \bigl[ \pi_0 C_{00}\, p_0(y) + \pi_1 C_{01}\, p_1(y) \bigr]\, dy + \int_{\Gamma_1} \bigl[ \pi_0 (C_{10} - C_{00})\, p_0(y) - \pi_1 (C_{01} - C_{11})\, p_1(y) \bigr]\, dy.$$
Pointwise optimisation
The first integral does not depend on the decision rule. To minimise the second, include $y$ in $\Gamma_1$ exactly when its integrand is strictly negative:
$$\pi_0 (C_{10} - C_{00})\, p_0(y) < \pi_1 (C_{01} - C_{11})\, p_1(y).$$
Rearrange into LRT form
Using the reasonable-cost condition (both differences positive),
$$\frac{p_1(y)}{p_0(y)} > \frac{\pi_0 (C_{10} - C_{00})}{\pi_1 (C_{01} - C_{11})} = \eta.$$
Any rule deciding $\mathcal{H}_1$ on this set and $\mathcal{H}_0$ on its complement achieves the minimum risk. Tie assignments affect a set of zero measure under Lebesgue-absolutely-continuous densities, so the risk is unchanged.
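The finished rule is one comparison per observation. Below is a minimal sketch; the names `bayes_threshold` and `bayes_lrt` are illustrative, and the tie convention (strict `>`, sending ties to $\mathcal{H}_0$) is immaterial by the theorem.

```python
def bayes_threshold(pi1, C00=0.0, C10=1.0, C01=1.0, C11=0.0):
    """eta = pi0 (C10 - C00) / (pi1 (C01 - C11)); reasonable costs assumed."""
    pi0 = 1.0 - pi1
    return pi0 * (C10 - C00) / (pi1 * (C01 - C11))

def bayes_lrt(y, p0, p1, pi1, **costs):
    """Decide H1 (True) iff p1(y)/p0(y) > eta.

    p0, p1 : density functions evaluable at y.
    Written as p1(y) > eta * p0(y) to avoid dividing by a tiny p0(y).
    """
    eta = bayes_threshold(pi1, **costs)
    return p1(y) > eta * p0(y)
```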
Definition: Maximum a Posteriori (MAP) Rule
The 0-1 cost (or uniform cost) structure is $C_{00} = C_{11} = 0$, $C_{10} = C_{01} = 1$, for which the Bayes risk reduces to the average error probability $P_e = \pi_0 P_F + \pi_1 (1 - P_D)$. The corresponding Bayes rule is the MAP rule:
$$\frac{p_1(y)}{p_0(y)} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \frac{\pi_0}{\pi_1}.$$
Equivalently, the MAP rule picks the hypothesis with the larger posterior probability $P(\mathcal{H}_i \mid y)$.
Under equal priors $\pi_0 = \pi_1 = 1/2$, the MAP rule further reduces to the maximum likelihood (ML) rule: decide $\mathcal{H}_1$ iff $p_1(y) > p_0(y)$.
Example: Bayes Rule for Two Gaussians
Under $\mathcal{H}_0$, $Y \sim \mathcal{N}(-\mu, \sigma^2)$; under $\mathcal{H}_1$, $Y \sim \mathcal{N}(+\mu, \sigma^2)$. Priors are $\pi_0, \pi_1$ and costs are 0-1 (MAP rule). Derive the explicit decision threshold on $y$.
Write the likelihood ratio
With $p_0(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{(y+\mu)^2}{2\sigma^2} \right)$ and $p_1(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{(y-\mu)^2}{2\sigma^2} \right)$, the ratio is
$$L(y) = \frac{p_1(y)}{p_0(y)} = \exp\!\left( \frac{2\mu y}{\sigma^2} \right).$$
Apply the MAP threshold
Decide $\mathcal{H}_1$ iff $L(y) > \pi_0/\pi_1$, i.e.,
$$\exp\!\left( \frac{2\mu y}{\sigma^2} \right) > \frac{\pi_0}{\pi_1} \iff y > \tau^* = \frac{\sigma^2}{2\mu} \ln \frac{\pi_0}{\pi_1}.$$
Interpret the threshold
If $\pi_0 = \pi_1$, then $\tau^* = 0$: the decision boundary is the midpoint of the two means, by symmetry. If $\pi_1 > \pi_0$ (the alternative is more likely a priori), $\ln(\pi_0/\pi_1) < 0$, so $\tau^* < 0$: the rule is biased toward $\mathcal{H}_1$, enlarging $\Gamma_1$.
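A quick numeric check of the closed form, under the same $\pm\mu$ Gaussian setup; the parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, pi1 = 1.0, 1.0, 0.7                 # illustrative values
pi0 = 1.0 - pi1
tau = sigma**2 / (2 * mu) * np.log(pi0 / pi1)  # MAP threshold; here < 0

# Monte Carlo estimate of the average error probability at tau
n = 200_000
h1 = rng.random(n) < pi1                        # true hypothesis labels
y = rng.normal(np.where(h1, mu, -mu), sigma)    # observations
print(tau, np.mean((y > tau) != h1))            # threshold, error rate
```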
Bayes Risk as a Function of Threshold and Prior
For the Gaussian problem of the example above, plot the Bayes risk as a function of the threshold $\tau$ for different prior probabilities $\pi_1$. The minimiser is the Bayes threshold $\tau^*$.
Parameters
- $\mu$: half-separation of the Gaussian means
- $\sigma$: common standard deviation
- $\pi_1$: prior probability of $\mathcal{H}_1$
- $C_{10}$: cost of a false alarm
- $C_{01}$: cost of a miss
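A minimal matplotlib sketch of the described plot, assuming the $\pm\mu$ Gaussian model. For the threshold test $y > \tau$, $P_F = Q((\tau+\mu)/\sigma)$ and $1 - P_D = \Phi((\tau-\mu)/\sigma)$, with $Q$ the standard normal tail and $\Phi$ its CDF.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

mu, sigma = 1.0, 1.0                  # half-separation, common std dev
C10, C01 = 1.0, 1.0                   # 0-1 costs: false alarm, miss
taus = np.linspace(-4, 4, 400)

for pi1 in (0.2, 0.5, 0.8):
    pi0 = 1 - pi1
    P_F = norm.sf((taus + mu) / sigma)    # P(decide H1 | H0)
    P_M = norm.cdf((taus - mu) / sigma)   # P(decide H0 | H1)
    risk = pi0 * C10 * P_F + pi1 * C01 * P_M
    tau_star = sigma**2 / (2 * mu) * np.log(pi0 * C10 / (pi1 * C01))
    (line,) = plt.plot(taus, risk, label=rf"$\pi_1 = {pi1}$")
    plt.axvline(tau_star, ls="--", color=line.get_color())

plt.xlabel(r"threshold $\tau$")
plt.ylabel("Bayes risk")
plt.legend()
plt.show()
```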
Posterior Reformulation
The MAP rule admits the transparent posterior form
$$\text{decide } \mathcal{H}_1 \iff P(\mathcal{H}_1 \mid y) > P(\mathcal{H}_0 \mid y).$$
Under 0-1 costs, the Bayes-optimal strategy is to choose whichever hypothesis has the higher posterior probability given the observation. This is the most natural Bayesian statement: update beliefs via Bayes' rule, then pick the most probable hypothesis.
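In code the posterior form is a one-liner; the helper names below are illustrative. Deciding by the larger posterior is algebraically identical to the LRT at $\eta = \pi_0/\pi_1$.

```python
def posterior_h1(y, p0, p1, pi1):
    """P(H1 | y) via Bayes' rule; p0, p1 are density functions."""
    pi0 = 1.0 - pi1
    num = pi1 * p1(y)
    return num / (pi0 * p0(y) + num)

def map_decide(y, p0, p1, pi1):
    """Decide H1 iff the posterior exceeds 1/2 (equivalently,
    iff p1(y)/p0(y) > pi0/pi1)."""
    return posterior_h1(y, p0, p1, pi1) > 0.5
```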
Common Mistake: Tie-Breaking and Equality in the LRT
Mistake:
Worrying extensively about the convention at $p_1(y)/p_0(y) = \eta$.
Correction:
For Lebesgue-absolutely-continuous $p_0, p_1$, the set $\{ y : p_1(y) = \eta\, p_0(y) \}$ has Lebesgue measure zero, so the choice "$>$" vs. "$\geq$" in the LRT does not affect $P_F$, $P_D$, or the Bayes risk. The tie set matters only for discrete observations (Section 1.4, randomised tests).
Historical Note: Reverend Bayes and the Long Road to Decision Theory (1763-1950s)
Thomas Bayes (1701-1761), an English Presbyterian minister and amateur mathematician, wrote the manuscript that would become "An Essay towards Solving a Problem in the Doctrine of Chances" --- posthumously published in 1763 by his friend Richard Price. The essay contained what we now call Bayes' theorem, though only in a special case.
For nearly two centuries, Bayesian reasoning was marginalised by frequentists who viewed priors as unscientific. The rehabilitation began in the 1940s-1950s with the decision-theoretic foundations laid by Abraham Wald (1902-1950), who introduced loss functions and risk minimisation as the unifying language of statistics. It was Wald who showed that under mild regularity, every admissible decision rule is a Bayes rule for some prior --- a striking vindication of the Bayesian viewpoint from a purely frequentist optimality criterion.
Quick Check
Under 0-1 costs with priors $\pi_0, \pi_1$, the MAP rule decides $\mathcal{H}_1$ iff the likelihood ratio $p_1(y)/p_0(y)$ exceeds which threshold?
For 0-1 costs, $\eta = \pi_0/\pi_1$. A priori rare alternatives require stronger evidence to be accepted.
Posterior probability
Given observation $y$, the posterior probability of $\mathcal{H}_1$ is
$$P(\mathcal{H}_1 \mid y) = \frac{\pi_1\, p_1(y)}{\pi_0\, p_0(y) + \pi_1\, p_1(y)}.$$
It represents the updated belief about the state of nature after seeing $y$, and is the quantity the MAP rule maximises.
Related: prior probability, Bayes-Optimal Decision Rule, Maximum a Posteriori (MAP) Rule
Bayes vs. Neyman-Pearson Formulations
| Aspect | Bayesian | Neyman-Pearson |
|---|---|---|
| Requires priors | Yes ($\pi_0, \pi_1$) | No |
| Requires cost matrix | Yes ($C_{ij}$) | No |
| Criterion | Minimise Bayes risk $r(\delta)$ | Maximise $P_D$ s.t. $P_F \leq \alpha$ |
| Optimal rule | LRT with $\eta = \frac{\pi_0 (C_{10} - C_{00})}{\pi_1 (C_{01} - C_{11})}$ | LRT with $\eta$ s.t. $P_F = \alpha$ |
| Interpretation | Expected cost | Worst-case false alarm |
| Typical domain | Communications, ML, medical | Radar, hypothesis screening |