Exercises
ex-fsi-ch01-01
Easy. Under $\mathcal H_0$, $Y \sim \mathcal N(0,\sigma^2)$; under $\mathcal H_1$, $Y \sim \mathcal N(\mu,\sigma^2)$ with $\mu > 0$. For the detector that decides $\mathcal H_1$ iff $y > \tau$, compute $P_{\mathrm{FA}}$, $P_{\mathrm{D}}$, and $P_{\mathrm{M}}$.
Write $P_{\mathrm{FA}} = \Pr(Y > \tau \mid \mathcal H_0)$ with $Y/\sigma \sim \mathcal N(0,1)$ under $\mathcal H_0$.
Use the Q-function: $Q(x) = \Pr(Z > x)$ for $Z \sim \mathcal N(0,1)$.
False alarm
Under $\mathcal H_0$, $Y \sim \mathcal N(0,\sigma^2)$, so $Y/\sigma \sim \mathcal N(0,1)$:
$$P_{\mathrm{FA}} = \Pr(Y > \tau \mid \mathcal H_0) = Q\!\left(\frac{\tau}{\sigma}\right).$$
Detection
Under $\mathcal H_1$, $Y \sim \mathcal N(\mu,\sigma^2)$, so $(Y-\mu)/\sigma \sim \mathcal N(0,1)$:
$$P_{\mathrm{D}} = \Pr(Y > \tau \mid \mathcal H_1) = Q\!\left(\frac{\tau - \mu}{\sigma}\right).$$
Miss
$P_{\mathrm{M}} = 1 - P_{\mathrm{D}} = 1 - Q\!\left(\frac{\tau-\mu}{\sigma}\right) = \Phi\!\left(\frac{\tau-\mu}{\sigma}\right)$.
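A quick numerical sketch of the three formulas, using `scipy.stats.norm.sf` for the Q-function. The parameter values $\mu = 2$, $\sigma = 1$, $\tau = 1$ are illustrative placeholders, not part of the exercise statement.

```python
# Sketch: P_FA, P_D, P_M for the one-sided detector "decide H1 iff y > tau".
from scipy.stats import norm

mu, sigma, tau = 2.0, 1.0, 1.0   # assumed values, for illustration only

Q = norm.sf                       # Q(x) = P(Z > x), Z ~ N(0,1)
p_fa = Q(tau / sigma)             # Q(tau/sigma)
p_d = Q((tau - mu) / sigma)       # Q((tau - mu)/sigma)
p_m = 1.0 - p_d                   # Phi((tau - mu)/sigma)

print(f"P_FA = {p_fa:.4f}, P_D = {p_d:.4f}, P_M = {p_m:.4f}")
```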
ex-fsi-ch01-02
Easy. Compute the likelihood ratio for the exponential problem $\mathcal H_0: Y \sim \mathrm{Exp}(\lambda_0)$ vs. $\mathcal H_1: Y \sim \mathrm{Exp}(\lambda_1)$ with $\lambda_0 > \lambda_1$. Show that the LRT is equivalent to a threshold test on $y$.
Use the exponential density $p_i(y) = \lambda_i e^{-\lambda_i y}$ for $y \ge 0$.
The LRT inequality is $L(y) \gtrless \eta$. Isolate $y$.
LR
$$L(y) = \frac{p_1(y)}{p_0(y)} = \frac{\lambda_1}{\lambda_0}\, e^{(\lambda_0 - \lambda_1)\, y}.$$
Reduce to threshold on $y$
$L(y) \ge \eta$ iff $e^{(\lambda_0-\lambda_1)y} \ge \eta\,\lambda_0/\lambda_1$. Since $\lambda_0 - \lambda_1 > 0$, take logarithms and divide by $\lambda_0 - \lambda_1$:
$$y \;\gtrless\; \tau := \frac{\ln(\eta\,\lambda_0/\lambda_1)}{\lambda_0 - \lambda_1},$$
deciding $\mathcal H_1$ for $y \ge \tau$. The LRT is a one-sided threshold test on $y$ --- larger $y$ favours the smaller-rate hypothesis $\mathcal H_1$.
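A minimal check of this equivalence, assuming illustrative rates $\lambda_0 = 2 > \lambda_1 = 1$ and an arbitrary LRT level $\eta$: the LRT and the derived threshold test on $y$ make identical decisions sample by sample.

```python
# Sketch: verify L(y) >= eta  <=>  y >= tau for the exponential pair.
import numpy as np

lam0, lam1, eta = 2.0, 1.0, 1.5          # illustrative values (lam0 > lam1)
tau = np.log(eta * lam0 / lam1) / (lam0 - lam1)

rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0 / lam0, size=100_000)  # any samples would do

L = (lam1 / lam0) * np.exp((lam0 - lam1) * y)        # likelihood ratio
assert np.array_equal(L >= eta, y >= tau)            # identical decisions
print("LRT and threshold test agree on all samples; tau =", tau)
```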
ex-fsi-ch01-03
Easy. For equal priors and 0-1 costs with $p_0 = \mathcal N(0,\sigma^2)$ and $p_1 = \mathcal N(\mu,\sigma^2)$, compute the MAP error probability $P_e$.
By symmetry the MAP rule is a threshold test at $\mu/2$.
$P_e = \frac12 P_{\mathrm{FA}} + \frac12 P_{\mathrm{M}}$, and by symmetry $P_{\mathrm{FA}} = P_{\mathrm{M}}$.
MAP rule
Symmetry: $p_1(y) > p_0(y)$ iff $y > \mu/2$, so the MAP rule is: decide $\mathcal H_1$ iff $y > \mu/2$.
Error probability
$P_{\mathrm{FA}} = \Pr(Y > \mu/2 \mid \mathcal H_0) = Q\!\left(\frac{\mu}{2\sigma}\right)$. By symmetry $P_{\mathrm{M}} = Q\!\left(\frac{\mu}{2\sigma}\right)$, so $P_e = Q\!\left(\frac{\mu}{2\sigma}\right)$.
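A Monte Carlo cross-check of $P_e = Q(\mu/2\sigma)$ under equal priors; the values $\mu = 2$, $\sigma = 1$ are illustrative.

```python
# Sketch: simulate the MAP rule (threshold mu/2) and compare to Q(mu/(2*sigma)).
import numpy as np
from scipy.stats import norm

mu, sigma, n = 2.0, 1.0, 1_000_000        # illustrative values
rng = np.random.default_rng(1)

h = rng.integers(0, 2, size=n)            # true hypothesis, equal priors
y = rng.normal(loc=h * mu, scale=sigma)   # N(0,s^2) under H0, N(mu,s^2) under H1
h_hat = (y > mu / 2).astype(int)          # MAP rule

print("MC estimate :", np.mean(h_hat != h))
print("Q(mu/2sigma):", norm.sf(mu / (2 * sigma)))
```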
ex-fsi-ch01-04
Easy. Show that the Bhattacharyya coefficient $\rho = \int \sqrt{p_0(y)\,p_1(y)}\,dy$ satisfies $0 \le \rho \le 1$, with $\rho = 1$ iff $p_0 = p_1$ a.e.
Apply Cauchy-Schwarz to $\int \sqrt{p_0}\,\sqrt{p_1}\,dy$.
Equality in Cauchy-Schwarz happens when the integrands are proportional.
Upper bound via Cauchy-Schwarz
$$\rho = \int \sqrt{p_0}\,\sqrt{p_1}\,dy \;\le\; \left(\int p_0\,dy\right)^{1/2}\left(\int p_1\,dy\right)^{1/2} = 1.$$
Equality condition
Equality in Cauchy-Schwarz requires $\sqrt{p_1} = c\,\sqrt{p_0}$ a.e., i.e., $p_1 = c^2 p_0$ a.e. Integrating, $1 = c^2$, so $c = 1$ (densities are non-negative) and $p_0 = p_1$ a.e.
Lower bound
$\sqrt{p_0\,p_1} \ge 0$, so $\rho \ge 0$, with equality iff $p_0\,p_1 = 0$ a.e. (disjoint supports).
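A quadrature check of the bounds, with illustrative Gaussian pairs: $\rho < 1$ for distinct densities and $\rho = 1$ when the two densities coincide.

```python
# Sketch: Bhattacharyya coefficient by quadrature; rho <= 1, with equality iff p0 == p1.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def bhatt(p, q):
    return quad(lambda y: np.sqrt(p.pdf(y) * q.pdf(y)), -30, 30)[0]

print(bhatt(norm(0, 1), norm(2, 1)))   # < 1 for distinct densities
print(bhatt(norm(0, 1), norm(0, 1)))   # = 1 when p0 = p1
```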
ex-fsi-ch01-05
Easy. For BPSK over AWGN, $Y = s + N$ with equiprobable $s \in \{+A, -A\}$ and $N \sim \mathcal N(0,\sigma^2)$. Derive the MAP decision rule and compute $P_e$.
MAP rule
Equal priors, 0-1 costs, so MAP coincides with ML; the conditional densities are $\mathcal N(\pm A, \sigma^2)$, means $\pm A$. Threshold at $0$ by symmetry; decide $s = +A$ iff $y > 0$.
Error probability
$P_e = \Pr(Y < 0 \mid s = +A) = Q\!\left(\frac{A}{\sigma}\right)$. This is the classical BPSK error formula.
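The same kind of Monte Carlo check for BPSK, sweeping a few illustrative values of $A/\sigma$: the empirical bit-error rate should track $Q(A/\sigma)$.

```python
# Sketch: BPSK over AWGN -- empirical BER vs the formula Q(A/sigma).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
sigma, n = 1.0, 1_000_000
for A in (0.5, 1.0, 2.0):                     # illustrative amplitudes
    s = rng.choice((-A, A), size=n)           # equiprobable symbols
    y = s + rng.normal(scale=sigma, size=n)
    ber = np.mean(np.sign(y) != np.sign(s))   # decide by the sign of y
    print(f"A={A}: MC BER={ber:.5f}, Q(A/sigma)={norm.sf(A / sigma):.5f}")
```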
ex-fsi-ch01-06
Medium. For $p_0 = \mathrm{Unif}[0,1]$ vs. $p_1 = \mathrm{Unif}[0,\tfrac12]$, derive the Neyman-Pearson test at level $\alpha$ and compute its power $\beta$.
Compute the LR on each region, $[0,\tfrac12]$ and $(\tfrac12, 1]$.
The LR is a step function; the NP test is thus a simple region test.
Likelihood ratio
$p_0(y) = 1$ on $[0,1]$, $0$ elsewhere; $p_1(y) = 2$ on $[0,\tfrac12]$, $0$ elsewhere. Therefore
$$L(y) = \begin{cases} 2, & y \in [0,\tfrac12], \\ 0, & y \in (\tfrac12, 1]. \end{cases}$$
Two-level LRT
For $\eta = 0$: decide $\mathcal H_1$ everywhere in $[0,1]$, so $P_{\mathrm{FA}} = 1$. For $\eta \in (0,2]$: decide $\mathcal H_1$ only on $[0,\tfrac12]$, giving $P_{\mathrm{FA}} = \tfrac12$, $P_{\mathrm{D}} = 1$. To achieve $\alpha < \tfrac12$, randomise at $\eta = 2$: decide $\mathcal H_1$ on $\{L > 2\} = \emptyset$ with probability 1 and on $\{L = 2\} = [0,\tfrac12]$ with probability $\gamma = 2\alpha$, so $P_{\mathrm{FA}} = \gamma \cdot \tfrac12 = \alpha$.
Power
$\beta = \gamma \cdot \Pr\bigl(Y \in [0,\tfrac12] \mid \mathcal H_1\bigr) = 2\alpha$ for $\alpha \le \tfrac12$ (and $\beta = 1$ for $\alpha \ge \tfrac12$).
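A simulation of the randomised NP test for this uniform pair, confirming the size $\alpha$ and the power $\beta = 2\alpha$; the level $\alpha = 0.2$ is illustrative.

```python
# Sketch: randomised NP test for Unif[0,1] vs Unif[0,1/2].
# Decide H1 on {L = 2} = [0, 1/2] with probability gamma = 2*alpha.
import numpy as np

alpha = 0.2                                 # illustrative level (< 1/2)
gamma = 2 * alpha
rng = np.random.default_rng(3)
n = 1_000_000

def decide_h1(y):
    return (y <= 0.5) & (rng.random(y.size) < gamma)

y0 = rng.uniform(0.0, 1.0, size=n)          # samples under H0
y1 = rng.uniform(0.0, 0.5, size=n)          # samples under H1
print("size  P_FA ~", np.mean(decide_h1(y0)), " target", alpha)
print("power P_D  ~", np.mean(decide_h1(y1)), " target", 2 * alpha)
```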
ex-fsi-ch01-07
Medium. Let $Y_1,\dots,Y_n$ be i.i.d. Bernoulli($p$) under $\mathcal H_0$ with $p = p_0$, and under $\mathcal H_1$ with $p = p_1 > p_0$. Derive the LRT and show the sufficient statistic is $k = \sum_{i=1}^n y_i$.
The joint pmf is $p^k (1-p)^{n-k}$ where $k = \sum_i y_i$.
Take logs and keep terms depending on $k$.
LLR
$$\ln L(y) = k \ln\frac{p_1}{p_0} + (n-k)\ln\frac{1-p_1}{1-p_0} = k \ln\frac{p_1(1-p_0)}{p_0(1-p_1)} + n \ln\frac{1-p_1}{1-p_0}.$$
Sufficient statistic
The coefficient of $k$ is $\ln\frac{p_1(1-p_0)}{p_0(1-p_1)} > 0$ (since $p_1 > p_0$). Thus the LLR is strictly increasing in $k$, so the LRT reduces to $k \gtrless \tau$; $k$ is the sufficient statistic: the decision depends on the data only through the count of ones.
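A short sketch that tabulates the LLR as a function of the count $k$ for illustrative $p_0 = 0.3$, $p_1 = 0.6$, $n = 10$, confirming it is affine and increasing in $k$, so the LRT is a count threshold.

```python
# Sketch: Bernoulli LLR is affine and increasing in k = number of ones.
import numpy as np

p0, p1, n = 0.3, 0.6, 10                    # illustrative values, p1 > p0
k = np.arange(n + 1)
llr = k * np.log(p1 / p0) + (n - k) * np.log((1 - p1) / (1 - p0))

slope = np.log(p1 * (1 - p0) / (p0 * (1 - p1)))   # coefficient of k
assert np.allclose(np.diff(llr), slope) and slope > 0
print("LLR(k):", np.round(llr, 3))
```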
ex-fsi-ch01-08
Medium. Prove that the derivatives of the Chernoff exponent function equal Kullback-Leibler divergences at the endpoints: $\mu'(0) = -D(p_0\|p_1)$ and $\mu'(1) = D(p_1\|p_0)$. (Signs depend on the sign convention; establish yours carefully.)
Use $\mu(s) = \ln \int p_0^{1-s} p_1^s \, dy$ and differentiate under the integral sign.
At $s = 0$, $\int p_0^{1-s} p_1^s \, dy = \int p_0 \, dy = 1$. Compute $\mu'(0)$.
Derivative at $s=0$
Differentiating under the integral sign, $\mu'(s) = \dfrac{\int p_0^{1-s} p_1^s \ln(p_1/p_0)\,dy}{\int p_0^{1-s} p_1^s\,dy}$. At $s = 0$, the denominator is $1$ and the numerator is $\int p_0 \ln(p_1/p_0)\,dy = -D(p_0\|p_1)$, so $\mu'(0) = -D(p_0\|p_1)$.
Derivative at $s=1$
Using the same formula at $s = 1$, the denominator is again $1$ and the numerator is $\int p_1 \ln(p_1/p_0)\,dy = D(p_1\|p_0)$, so $\mu'(1) = D(p_1\|p_0)$.
Interpretation
The exponent $E(s) = -\mu(s)$ starts at zero with positive slope $D(p_0\|p_1)$, rises to a peak $C(p_0,p_1)$, and returns to zero at $s = 1$ with negative slope $-D(p_1\|p_0)$. The two KL divergences are the Stein exponents of the one-sided errors in the NP framework.
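A numerical check of the endpoint slopes for an illustrative Gaussian pair, computing $\mu(s)$ by quadrature and comparing one-sided finite differences at $s = 0, 1$ against $\mp D(\cdot\|\cdot)$; the pair $\mathcal N(0,1)$ vs. $\mathcal N(1,2)$ is chosen so the two divergences differ.

```python
# Sketch: mu'(0) = -D(p0||p1) and mu'(1) = +D(p1||p0), checked by quadrature
# for the illustrative pair p0 = N(0,1), p1 = N(1,2) (variance 2).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

p0 = norm(loc=0.0, scale=1.0)
p1 = norm(loc=1.0, scale=np.sqrt(2.0))

def mu(s):
    f = lambda y: p0.pdf(y) ** (1 - s) * p1.pdf(y) ** s
    return np.log(quad(f, -30, 30)[0])

def kl(p, q):  # D(p||q) by quadrature
    f = lambda y: p.pdf(y) * (p.logpdf(y) - q.logpdf(y))
    return quad(f, -30, 30)[0]

h = 1e-5                                   # one-sided finite-difference step
print("mu'(0) ~", mu(h) / h, "   -D(p0||p1) =", -kl(p0, p1))
print("mu'(1) ~", -mu(1 - h) / h, "   +D(p1||p0) =", kl(p1, p0))
```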
ex-fsi-ch01-09
Medium. Consider the Gaussian pair $p_0 = \mathcal N(0,\sigma_0^2)$, $p_1 = \mathcal N(0,\sigma_1^2)$ with $\sigma_1^2 > \sigma_0^2$ (variance shift). Derive the LRT sufficient statistic and show it is not a threshold test on $y$.
Compute $\ln L(y)$ and look at its dependence on $y$.
You should find a dependence on $y^2$.
LLR
$$\ln L(y) = \ln\frac{\sigma_0}{\sigma_1} + \frac12\left(\frac{1}{\sigma_0^2} - \frac{1}{\sigma_1^2}\right) y^2.$$
Sufficient statistic
Because $\sigma_1^2 > \sigma_0^2$, the coefficient of $y^2$ is positive, so the LLR is monotone increasing in $y^2$. The sufficient statistic is therefore $y^2$, and the LRT is a two-sided test $|y| \gtrless \tau$. A one-sided threshold on $y$ would not be optimal: both very large positive and very large negative $y$ are evidence for $\mathcal H_1$.
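A two-line check that the LLR is symmetric in $y$, so no one-sided threshold on $y$ can reproduce it; $\sigma_0 = 1$, $\sigma_1 = 2$ are illustrative.

```python
# Sketch: variance-shift LLR depends on y only through y^2.
import numpy as np

s0, s1 = 1.0, 2.0                      # sigma0 < sigma1, illustrative
llr = lambda y: np.log(s0 / s1) + 0.5 * (1 / s0**2 - 1 / s1**2) * y**2

y = np.linspace(-4, 4, 9)
assert np.allclose(llr(y), llr(-y))    # symmetric: llr(y) == llr(-y)
print(np.round(llr(y), 3))
```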
ex-fsi-ch01-10
Medium. A radar system must achieve $P_{\mathrm{FA}} \le \alpha$ while maximising $P_{\mathrm{D}}$. The observation is $n$ i.i.d. samples, $Y_i \sim \mathcal N(0,\sigma^2)$ under $\mathcal H_0$ vs. $Y_i \sim \mathcal N(\mu,\sigma^2)$ under $\mathcal H_1$. What is the minimum $\mu$ required to achieve $P_{\mathrm{D}} \ge \beta$?
NP test is a threshold on $\bar y = \frac1n\sum_i y_i$. Under $\mathcal H_0$, $\bar Y \sim \mathcal N(0, \sigma^2/n)$.
Set $P_{\mathrm{FA}} = \alpha$ and $P_{\mathrm{D}} = \beta$.
Threshold from size constraint
$P_{\mathrm{FA}} = Q\!\left(\frac{\tau\sqrt n}{\sigma}\right) = \alpha$, so $\tau = \frac{\sigma}{\sqrt n}\, Q^{-1}(\alpha)$. With this $\tau$, $P_{\mathrm{D}} = Q\!\left(\frac{(\tau - \mu)\sqrt n}{\sigma}\right)$.
Power constraint
$P_{\mathrm{D}} \ge \beta$ requires $\frac{(\tau - \mu)\sqrt n}{\sigma} \le Q^{-1}(\beta)$. Thus $\mu \ge \tau - \frac{\sigma}{\sqrt n}\, Q^{-1}(\beta)$ and
$$\mu_{\min} = \frac{\sigma}{\sqrt n}\left(Q^{-1}(\alpha) - Q^{-1}(\beta)\right).$$
Interpretation
We need a mean shift of $\mu_{\min}$ per sample (i.e., $\bigl(Q^{-1}(\alpha) - Q^{-1}(\beta)\bigr)/\sqrt n$ standard deviations) to achieve the target operating point with $n$ samples. The detection SNR $\mu\sqrt n/\sigma$ grows as $\sqrt n$, so the required per-sample shift shrinks as $1/\sqrt n$.
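Evaluating $\mu_{\min}$ with `scipy.stats.norm.isf` (the inverse Q-function); the operating point $\alpha = 10^{-6}$, $\beta = 0.99$ and the values $n = 25$, $\sigma = 1$ are illustrative.

```python
# Sketch: minimum per-sample mean shift for a target (alpha, beta) with n samples.
import numpy as np
from scipy.stats import norm

alpha, beta = 1e-6, 0.99          # illustrative operating point
sigma, n = 1.0, 25                # illustrative noise level and sample count

Qinv = norm.isf                   # inverse of Q(x) = P(Z > x)
mu_min = sigma / np.sqrt(n) * (Qinv(alpha) - Qinv(beta))
print(f"Q^-1(alpha) = {Qinv(alpha):.3f}, Q^-1(beta) = {Qinv(beta):.3f}")
print(f"mu_min = {mu_min:.4f}  (per-sample shift, in units of sigma)")
```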
ex-fsi-ch01-11
Medium. Compute the Bhattacharyya coefficient $\rho$ between $\mathrm{Exp}(\lambda_0)$ and $\mathrm{Exp}(\lambda_1)$, and show that $\rho \le 1$.
Compute $\sqrt{p_0(y)\,p_1(y)}$ and integrate from $0$ to $\infty$.
The integrand is proportional to an exponential density.
Geometric mean
$\sqrt{p_0(y)\,p_1(y)} = \sqrt{\lambda_0 \lambda_1}\; e^{-(\lambda_0+\lambda_1)y/2}$ for $y \ge 0$.
Integrate
$$\rho = \sqrt{\lambda_0\lambda_1} \int_0^\infty e^{-(\lambda_0+\lambda_1)y/2}\,dy = \frac{2\sqrt{\lambda_0\lambda_1}}{\lambda_0+\lambda_1}.$$
Check
When $\lambda_0 = \lambda_1$, $\rho = 1$ (correct, since $p_0 = p_1$). The ratio $\frac{\sqrt{\lambda_0\lambda_1}}{(\lambda_0+\lambda_1)/2}$ is the geometric-to-arithmetic mean ratio of the rates, always $\le 1$ by AM-GM, with equality iff $\lambda_0 = \lambda_1$.
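A quadrature check of the closed form, with illustrative rates chosen so that $\rho = 0.8$ exactly.

```python
# Sketch: Bhattacharyya coefficient of Exp(lam0) vs Exp(lam1), quadrature vs closed form.
import numpy as np
from scipy.integrate import quad

lam0, lam1 = 2.0, 0.5                            # illustrative rates
f = lambda y: np.sqrt(lam0 * np.exp(-lam0 * y) * lam1 * np.exp(-lam1 * y))
rho_quad = quad(f, 0, np.inf)[0]
rho_closed = 2 * np.sqrt(lam0 * lam1) / (lam0 + lam1)
print(rho_quad, rho_closed)                      # both 0.8 here
```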
ex-fsi-ch01-12
Medium. Show that for $n$ i.i.d. observations the Chernoff bound becomes $P_e \le \frac12 e^{-n\,C(p_0,p_1)}$, where the Chernoff information $C(p_0,p_1)$ is computed for a single sample. Interpret the exponent.
Use independence: $p_j(y_1,\dots,y_n) = \prod_{i=1}^n p_j(y_i)$.
The Chernoff-bound integral factors over samples.
Factor the Chernoff integrand
$p_0(y^n)^{1-s}\, p_1(y^n)^s = \prod_{i=1}^n p_0(y_i)^{1-s}\, p_1(y_i)^s$.
Integrate
By Fubini,
$$\int p_0(y^n)^{1-s}\, p_1(y^n)^s \, dy^n = \left(\int p_0(y)^{1-s}\, p_1(y)^s \, dy\right)^{\!n} = e^{n\,\mu(s)}.$$
Thus $P_e \le \frac12 \min_{s\in[0,1]} e^{n\,\mu(s)} = \frac12\, e^{-n\,C(p_0,p_1)}$.
Interpretation
Every sample contributes $C(p_0,p_1)$ nats to the error exponent. The optimal tilt $s^\star$ is independent of $n$, but the maximised exponent multiplies $n$. This is the key large-deviations phenomenon: exponential error decay with rate equal to the Chernoff information.
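A small numeric illustration of the $e^{-nC}$ scaling for the equal-variance Gaussian pair, where the MAP error is exactly $Q(\sqrt n\,\mu/2\sigma)$ and $C = \mu^2/8\sigma^2$; the values are illustrative.

```python
# Sketch: exact MAP error P_e(n) = Q(sqrt(n)*mu/(2*sigma)) vs the Chernoff bound
# 0.5*exp(-n*C) with C = mu^2/(8*sigma^2), for N(0,s^2) vs N(mu,s^2).
import numpy as np
from scipy.stats import norm

mu, sigma = 1.0, 1.0                                  # illustrative values
C = mu**2 / (8 * sigma**2)
for n in (1, 10, 50, 100):
    pe = norm.sf(np.sqrt(n) * mu / (2 * sigma))       # exact MAP error
    bound = 0.5 * np.exp(-n * C)
    print(f"n={n:4d}: P_e={pe:.3e} <= bound={bound:.3e}, "
          f"-ln(P_e)/n={-np.log(pe)/n:.4f} -> C={C}")
```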
ex-fsi-ch01-13
Hard. Prove that the ROC curve $\alpha \mapsto \beta(\alpha)$ traced out by the LRT family is concave without using the slope identity. (Use only time-sharing and the Neyman-Pearson lemma.)
Take two operating points $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ on the ROC.
Construct a randomised detector that time-shares between the two LRTs.
Apply NP at the intermediate level $\bar\alpha = \lambda\alpha_1 + (1-\lambda)\alpha_2$.
Time-sharing construction
Let $\phi_i$ be the LRT at level $\alpha_i$, with power $\beta_i = \beta(\alpha_i)$, $i = 1, 2$. Define the randomised detector $\phi_\lambda$ that runs $\phi_1$ with probability $\lambda$ and $\phi_2$ with probability $1-\lambda$, independently of $Y$.
Compute $P_{\mathrm{FA}}, P_{\mathrm{D}}$ of the mixture
By the law of total probability conditioned on the randomisation, $P_{\mathrm{FA}}(\phi_\lambda) = \lambda\alpha_1 + (1-\lambda)\alpha_2$ and $P_{\mathrm{D}}(\phi_\lambda) = \lambda\beta_1 + (1-\lambda)\beta_2$.
Apply the NP lemma
Let $\bar\alpha = \lambda\alpha_1 + (1-\lambda)\alpha_2$. The LRT at level $\bar\alpha$ has power $\beta(\bar\alpha)$, and by NP this is at least the power of any rule with size $\le \bar\alpha$, including $\phi_\lambda$. Therefore
$$\beta\bigl(\lambda\alpha_1 + (1-\lambda)\alpha_2\bigr) \;\ge\; \lambda\,\beta(\alpha_1) + (1-\lambda)\,\beta(\alpha_2).$$
This is the concavity inequality.
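A quick numeric instance of the inequality on the Gaussian mean-shift ROC $\beta(\alpha) = Q\bigl(Q^{-1}(\alpha) - d\bigr)$; the deflection $d = 1$ and the operating points are illustrative.

```python
# Sketch: concavity check on the Gaussian ROC beta(a) = Q(Q^{-1}(a) - d).
import numpy as np
from scipy.stats import norm

d = 1.0                                       # illustrative deflection mu/sigma
beta = lambda a: norm.sf(norm.isf(a) - d)

a1, a2, lam = 0.01, 0.3, 0.4                  # illustrative operating points
mix_a = lam * a1 + (1 - lam) * a2
print(beta(mix_a), ">=", lam * beta(a1) + (1 - lam) * beta(a2))
```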
ex-fsi-ch01-14
Hard. Let $p_0, p_1$ be densities on $\mathcal Y$ and let $T = t(Y)$ be any real-valued statistic. Prove the data-processing inequality for binary testing: the error exponent based on $T$ alone cannot exceed the exponent based on the full observation $Y$:
$$C(q_0, q_1) \;\le\; C(p_0, p_1),$$
where $q_j$ is the law of $T$ under $\mathcal H_j$.
Show that for any $s \in (0,1)$, $\int q_0^{1-s} q_1^s \, dt \ge \int p_0^{1-s} p_1^s \, dy$.
Condition on $T$ and use Jensen's inequality on the jointly concave function $(a,b) \mapsto a^{1-s} b^s$.
Conditional densities
Let $r_j(y \mid t)$ be the conditional densities of $Y$ given $T = t$ under $\mathcal H_j$. Then $p_j(y) = q_j\bigl(t(y)\bigr)\, r_j\bigl(y \mid t(y)\bigr)$.
Apply Hölder
For $s \in (0,1)$:
$$\int p_0^{1-s} p_1^s \, dy = \int q_0(t)^{1-s} q_1(t)^s \left(\int r_0(y \mid t)^{1-s}\, r_1(y \mid t)^s \, dy\right) dt.$$
By Hölder (or Jensen, since $(a,b) \mapsto a^{1-s} b^s$ is concave), $\int r_0^{1-s} r_1^s \, dy \le 1$ (with equality iff $r_0 = r_1$ a.e.). Hence $\int p_0^{1-s} p_1^s \, dy \le \int q_0^{1-s} q_1^s \, dt$.
Take logarithms
Therefore $-\ln \int q_0^{1-s} q_1^s \, dt \le -\ln \int p_0^{1-s} p_1^s \, dy$ for every $s \in (0,1)$. Maximising over $s$: $C(q_0, q_1) \le C(p_0, p_1)$.
When is equality achieved?
Equality holds iff $r_0(\cdot \mid t) = r_1(\cdot \mid t)$ a.e. for a.e. $t$, i.e., the statistic is sufficient. This is another manifestation of sufficiency: a sufficient statistic preserves the full Chernoff information; any non-sufficient compression strictly loses exponent.
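A numerical instance of the inequality: compress $Y$ to the one-bit statistic $T = \mathbf 1\{Y > 1/2\}$ for the pair $\mathcal N(0,1)$ vs. $\mathcal N(1,1)$, where the continuous Chernoff information is $1/8$ in closed form; the binary one must come out smaller. The cutoff $1/2$ and the pair are illustrative.

```python
# Sketch: data-processing for Chernoff information, Y ~ N(0,1) vs N(1,1),
# compressed to the one-bit statistic T = 1{Y > 0.5}.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# Laws of T under H0 and H1 (probabilities of T = 1).
q0, q1 = norm.sf(0.5), norm.sf(0.5 - 1.0)

def mu_binary(s):  # mu(s) for the binary pair (minimise over s in (0,1))
    return np.log(q0**(1 - s) * q1**s + (1 - q0)**(1 - s) * (1 - q1)**s)

res = minimize_scalar(mu_binary, bounds=(1e-6, 1 - 1e-6), method="bounded")
C_T = -res.fun
C_Y = 1.0**2 / 8.0     # closed form for N(0,1) vs N(1,1): (mu1-mu0)^2/(8 sigma^2)
print(f"C_T = {C_T:.4f} <= C_Y = {C_Y:.4f}")
assert C_T <= C_Y
```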
ex-fsi-ch01-15
Hard. For two Gaussians with different means and variances, $p_0 = \mathcal N(\mu_0, \sigma_0^2)$, $p_1 = \mathcal N(\mu_1, \sigma_1^2)$, derive $\mu(s)$ and express the Chernoff information as a transcendental equation.
$p_0^{1-s} p_1^s$ is proportional to a Gaussian density; complete the square.
The integrating constant gives $e^{\mu(s)}$.
Exponent algebra
$p_0^{1-s} p_1^s$ contains the exponent $-\frac{(1-s)(y-\mu_0)^2}{2\sigma_0^2} - \frac{s(y-\mu_1)^2}{2\sigma_1^2}$ and the log-prefactor $-\frac{1-s}{2}\ln(2\pi\sigma_0^2) - \frac{s}{2}\ln(2\pi\sigma_1^2)$.
Completing the square
Let $\frac{1}{\sigma_s^2} = \frac{1-s}{\sigma_0^2} + \frac{s}{\sigma_1^2}$ (effective precision). Define $m_s = \sigma_s^2\left(\frac{(1-s)\mu_0}{\sigma_0^2} + \frac{s\,\mu_1}{\sigma_1^2}\right)$. Completing the square, the exponent is $-\frac{(y - m_s)^2}{2\sigma_s^2} + c(s)$ with $c(s)$ independent of $y$.
Compute $\mu(s)$
Integrating the resulting Gaussian yields the closed form. After simplification,
$$\mu(s) = -\,\frac{s(1-s)\,(\mu_1-\mu_0)^2}{2\bigl(s\sigma_0^2 + (1-s)\sigma_1^2\bigr)} \;-\; \frac12 \ln \frac{s\sigma_0^2 + (1-s)\sigma_1^2}{\sigma_0^{2s}\,\sigma_1^{2(1-s)}}.$$
Chernoff information
$C(p_0,p_1) = -\mu(s^\star)$, where $s^\star$ solves $\mu'(s) = 0$, a transcendental equation in $s$. For $\sigma_0 = \sigma_1 = \sigma$ it reduces to $s^\star = \tfrac12$ and $C = \frac{(\mu_1-\mu_0)^2}{8\sigma^2}$, recovering the example Chernoff Information for Two Gaussians. For $\mu_0 = \mu_1$ (pure variance shift), the first term vanishes and $C$ is a logarithmic function of the variance ratio.
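Solving the transcendental equation numerically for an illustrative unequal-variance pair, by maximising $-\mu(s)$ on $(0,1)$ with the closed form above.

```python
# Sketch: Chernoff information for N(mu0, v0) vs N(mu1, v1) via the closed form
# of mu(s), maximising -mu(s) over s in (0,1). Parameter values are illustrative.
import numpy as np
from scipy.optimize import minimize_scalar

mu0, v0 = 0.0, 1.0        # H0: mean, variance
mu1, v1 = 1.0, 4.0        # H1: mean, variance

def mu_s(s):
    vbar = s * v0 + (1 - s) * v1
    return (-(s * (1 - s) * (mu1 - mu0)**2) / (2 * vbar)
            - 0.5 * np.log(vbar / (v0**s * v1**(1 - s))))

res = minimize_scalar(mu_s, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(f"s* = {res.x:.4f}, C = {-res.fun:.4f}")
```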
ex-fsi-ch01-16
Challenge. Stein's lemma. Show that for $n$ i.i.d. observations, under the Neyman-Pearson criterion with $P_{\mathrm{FA}} \le \alpha$ fixed, the optimal miss probability decays as
$$\frac{1}{n}\ln P_M^\star \;\longrightarrow\; -\,D(p_0\|p_1).$$
Sketch the proof via the LLR and the weak law of large numbers.
Under $\mathcal H_0$, $\Lambda_n = \frac1n \sum_{i=1}^n \ln\frac{p_1(Y_i)}{p_0(Y_i)} \to -D(p_0\|p_1)$ a.s.
For the converse, use the information-theoretic inequality $d\bigl(\Pr_0(\text{decide }\mathcal H_0) \,\big\|\, \Pr_1(\text{decide }\mathcal H_0)\bigr) \le n\,D(p_0\|p_1)$.
Achievability (upper bound on $P_M^\star$)
The NP test at level $\alpha$ uses a threshold $\tau_n$ on $\Lambda_n$ chosen so that $P_{\mathrm{FA}} = \alpha$. Under $\mathcal H_0$, $\Lambda_n \to -D(p_0\|p_1)$ a.s., so $\tau_n \to -D(p_0\|p_1)$. Under $\mathcal H_1$ the LLR concentrates at $+D(p_1\|p_0)$, so $P_M^\star = \Pr_1(\Lambda_n \le \tau_n) \to 0$ with exponential rate $D(p_0\|p_1)$ (large deviations of $\Lambda_n$ under $\mathcal H_1$ toward $-D(p_0\|p_1)$).
Converse (lower bound on $P_M^\star$)
For any rule with size $P_{\mathrm{FA}} \le \alpha$ and miss probability $P_M$, the data-processing inequality for KL gives
$$d\bigl(1-\alpha \,\big\|\, P_M\bigr) \;\le\; D\!\left(p_0^{\otimes n} \,\middle\|\, p_1^{\otimes n}\right) = n\,D(p_0\|p_1),$$
where the left side is the binary KL divergence of the two Bernoulli laws induced on $\{\text{decide }\mathcal H_0,\ \text{decide }\mathcal H_1\}$. Solving for $P_M$ (using $d(p\|q) \ge p \ln\frac1q - \ln 2$): $P_M \ge \exp\!\left(-\frac{n\,D(p_0\|p_1) + \ln 2}{1-\alpha}\right)$. Combining with the achievability yields Stein's exponent.
Comment
Stein's lemma says: in the asymmetric (NP) framework, the one-sided error exponents are KL divergences. The symmetric (Bayes) counterpart is Chernoff information. Both theorems tie detection theory to information-theoretic divergences.
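For the Gaussian mean-shift pair the NP miss probability is available in closed form, $P_M(n) = \Phi\bigl(Q^{-1}(\alpha) - \sqrt n\,\mu/\sigma\bigr)$, so Stein's exponent can be checked directly; here $D(p_0\|p_1) = \mu^2/2\sigma^2$ and the values $\mu = 1$, $\sigma = 1$, $\alpha = 0.05$ are illustrative.

```python
# Sketch: Stein's exponent for N(0,1) vs N(mu,1); exact NP miss probability
# P_M(n) = Phi(Q^{-1}(alpha) - sqrt(n)*mu), compared to D(p0||p1) = mu^2/2.
import numpy as np
from scipy.stats import norm

mu, alpha = 1.0, 0.05                    # illustrative shift and level
D = mu**2 / 2
for n in (10, 100, 1000):
    log_pm = norm.logcdf(norm.isf(alpha) - np.sqrt(n) * mu)  # ln P_M(n)
    print(f"n={n:5d}: -ln(P_M)/n = {-log_pm / n:.4f}  (D = {D})")
```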
ex-fsi-ch01-17
Challenge. Minimax detection. Consider a binary test where the priors are unknown. The minimax rule chooses the decision rule $\delta$ to minimise $\max\{P_{\mathrm{FA}}(\delta), P_{\mathrm{M}}(\delta)\}$. Show that the minimax rule is the LRT whose threshold equalises the two error probabilities, and describe how to compute that threshold numerically.
The minimax criterion is equivalent to Bayes risk with the least-favourable prior.
Parametrise the prior $\pi_1 \in (0,1)$ and the corresponding LRT threshold $\eta(\pi_1) = (1-\pi_1)/\pi_1$; find $\pi_1^\star$ such that $P_{\mathrm{FA}} = P_{\mathrm{M}}$.
Minimax-Bayes duality
For any prior $\pi_1 = \Pr(\mathcal H_1)$ and 0-1 costs, the Bayes rule is the LRT with threshold $\eta = (1-\pi_1)/\pi_1$. Its risk is $r(\pi_1) = (1-\pi_1)\,P_{\mathrm{FA}}(\pi_1) + \pi_1\,P_{\mathrm{M}}(\pi_1)$. The Bayes risk $r$ is a concave function of $\pi_1$ (as the minimum of affine functions). Let $\pi_1^\star$ maximise $r$; this prior is least favourable.
Equalisation at $\pi_1^\star$
At the maximiser, $r'(\pi_1^\star) = 0$; by the envelope theorem $r'(\pi_1) = P_{\mathrm{M}}(\pi_1) - P_{\mathrm{FA}}(\pi_1)$, so $P_{\mathrm{FA}}(\pi_1^\star) = P_{\mathrm{M}}(\pi_1^\star)$.
Minimax optimality
By the minimax theorem, the LRT at the least-favourable prior minimises the worst-case risk over all priors. For any rule, $\max\{P_{\mathrm{FA}}, P_{\mathrm{M}}\} \ge (1-\pi_1)P_{\mathrm{FA}} + \pi_1 P_{\mathrm{M}} \ge r(\pi_1)$ at every prior (the maximum dominates any convex combination, and $r$ is the minimal Bayes risk); in particular $\max\{P_{\mathrm{FA}}, P_{\mathrm{M}}\} \ge r(\pi_1^\star)$. The equalising LRT achieves $P_{\mathrm{FA}} = P_{\mathrm{M}} = r(\pi_1^\star)$, hence it is minimax.
Numerical procedure
- Parametrise the LRT by its threshold $\eta \in (0,\infty)$.
- Compute $P_{\mathrm{FA}}(\eta)$ and $P_{\mathrm{M}}(\eta)$ (both monotone in $\eta$ but in opposite directions: $P_{\mathrm{FA}}$ decreases, $P_{\mathrm{M}}$ increases).
- Solve $P_{\mathrm{FA}}(\eta) = P_{\mathrm{M}}(\eta)$ by bisection or Newton's method on the increasing function $g(\eta) = P_{\mathrm{M}}(\eta) - P_{\mathrm{FA}}(\eta)$; a sketch follows the list.
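A minimal implementation of this procedure for the Gaussian mean-shift pair, where the equalising threshold is known to be $y = \mu/2$ (so the answer can be verified); `scipy.optimize.brentq` does the root-finding, and the values $\mu = 2$, $\sigma = 1$ are illustrative.

```python
# Sketch: equalise P_FA and P_M by root-finding on g(t) = P_M(t) - P_FA(t),
# for N(0, sigma^2) vs N(mu, sigma^2); the known answer is t = mu/2.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

mu, sigma = 2.0, 1.0                          # illustrative values

p_fa = lambda t: norm.sf(t / sigma)           # decreasing in t
p_m  = lambda t: norm.cdf((t - mu) / sigma)   # increasing in t
g    = lambda t: p_m(t) - p_fa(t)             # increasing; root = minimax threshold

t_star = brentq(g, -10 * sigma, mu + 10 * sigma)
print(f"t* = {t_star:.6f} (expect mu/2 = {mu/2}), "
      f"P_FA = P_M = {p_fa(t_star):.4f}")
```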