Sufficient Statistics for Detection

Why sufficient statistics matter

We have now seen the matched filter three times: as an LRT, as an SNR-maximising linear filter, and as a continuous-time $L^2$ projection. The point is that every one of these derivations collapsed the full observation $\mathbf{y}\in\mathbb{R}^n$ (or $y(t) \in L^2$) into a single scalar statistic $T = \langle \mathbf{y}, \mathbf{s}\rangle$. That compression is not accidental: $T$ is a sufficient statistic for the detection problem. Once we have it, the raw observation carries no additional information about which hypothesis is true. This section formalises sufficiency, states the Fisher--Neyman factorisation theorem, and uses it to explain why signal-space receivers for digital modulation reduce a waveform to its finite-dimensional projection.

Definition: Sufficient Statistic

Let $\mathbf{Y}$ be an observation with density $f(\mathbf{y};\theta)$ depending on a parameter $\theta \in \Theta$ (here $\theta$ indexes the hypothesis). A statistic $T(\mathbf{Y})$ is sufficient for $\theta$ if the conditional distribution of $\mathbf{Y}$ given $T(\mathbf{Y}) = t$ does not depend on $\theta$:

$$f(\mathbf{y} \mid T(\mathbf{y}) = t;\theta) = f(\mathbf{y} \mid T(\mathbf{y}) = t).$$

Theorem: Fisher--Neyman Factorisation

A statistic $T(\mathbf{Y})$ is sufficient for $\theta$ if and only if the density admits the factorisation

$$f(\mathbf{y};\theta) = g(T(\mathbf{y}),\theta)\, h(\mathbf{y}),$$

where $g$ depends on $\mathbf{y}$ only through $T(\mathbf{y})$ and $h$ does not depend on $\theta$.

Any dependence on $\theta$ enters only through $T$, so $T$ captures all the parameter-relevant information.

Example: The Matched-Filter Output is Sufficient for Detection in AWGN

Show that for the binary problem $\mathcal{H}_0: \mathbf{Y} = \mathbf{W}$ versus $\mathcal{H}_1: \mathbf{Y} = \mathbf{s} + \mathbf{W}$ with $\mathbf{W} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I})$, the matched-filter statistic $T(\mathbf{y}) = \mathbf{s}^{\mathsf T}\mathbf{y}$ is sufficient for the hypothesis index.
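One way to sketch the argument is to apply the Fisher--Neyman factorisation directly. The index $\theta\in\{0,1\}$ for the hypothesis is notation introduced here for convenience, so that the mean under $\mathcal{H}_\theta$ is $\theta\mathbf{s}$:

```latex
\begin{align*}
f(\mathbf{y};\theta)
 &= (2\pi\sigma^2)^{-n/2}
    \exp\!\Big(-\tfrac{1}{2\sigma^2}\,\|\mathbf{y}-\theta\mathbf{s}\|^2\Big),
    \qquad \theta\in\{0,1\},\\
 &= \underbrace{\exp\!\Big(\tfrac{\theta}{\sigma^2}\,\mathbf{s}^{\mathsf T}\mathbf{y}
      -\tfrac{\theta^2}{2\sigma^2}\,\|\mathbf{s}\|^2\Big)}_{g(T(\mathbf{y}),\,\theta)}
    \;\underbrace{(2\pi\sigma^2)^{-n/2}
      \exp\!\Big(-\tfrac{1}{2\sigma^2}\,\|\mathbf{y}\|^2\Big)}_{h(\mathbf{y})}.
\end{align*}
```

The first factor depends on $\mathbf{y}$ only through $\mathbf{s}^{\mathsf T}\mathbf{y}$ and the second is free of $\theta$, which is the required form. Equivalently, the log-likelihood ratio is $\log\Lambda(\mathbf{y}) = \mathbf{s}^{\mathsf T}\mathbf{y}/\sigma^2 - \|\mathbf{s}\|^2/(2\sigma^2)$, a function of $T$ alone.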

Dimensionality reduction for free

In the preceding example, the observation lives in $\mathbb{R}^n$ but the sufficient statistic is a scalar. That collapse, from $n$ dimensions to $1$, is not a numerical trick; it is a structural fact about the problem. Sufficient statistics pinpoint the minimum dimensionality needed for optimal inference. When we move to $M$-ary hypothesis testing in Chapter 3, the sufficient statistic becomes a vector of $M-1$ projections. When we move to parameter estimation in Part II, sufficient statistics tell us how many numbers we need to keep from a dataset of size $n$.
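The collapse can be checked numerically. The sketch below is illustrative only: the dimension, noise level, and signal are arbitrary choices, and the helper names `llr_full` and `llr_from_T` are hypothetical. It builds two different observations that share the same value of $\mathbf{s}^{\mathsf T}\mathbf{y}$ and confirms that their log-likelihood ratios coincide.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 64, 1.5                      # illustrative sizes, not from the text
s = rng.standard_normal(n)              # hypothetical known signal

def llr_full(y, s, sigma):
    """Log-likelihood ratio log f1(y)/f0(y) computed from the full vector."""
    return (np.sum(y**2) - np.sum((y - s)**2)) / (2 * sigma**2)

def llr_from_T(T, s, sigma):
    """The same ratio computed from the scalar T = s^T y alone."""
    return (T - 0.5 * (s @ s)) / sigma**2

y1 = s + sigma * rng.standard_normal(n)

# Build a second observation with the same T but a different component
# orthogonal to s.
v = rng.standard_normal(n)
v -= (v @ s) / (s @ s) * s              # remove the part of v along s
y2 = y1 + 3.0 * v

T1, T2 = s @ y1, s @ y2
print(np.isclose(T1, T2))                                            # same statistic
print(np.isclose(llr_full(y1, s, sigma), llr_from_T(T1, s, sigma)))  # scalar suffices
print(np.isclose(llr_full(y1, s, sigma), llr_full(y2, s, sigma)))    # same LLR
```

Any threshold test on the likelihood ratio therefore makes the same decision for `y1` and `y2`, even though the two vectors differ in the $n-1$ directions orthogonal to $\mathbf{s}$.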

Theorem: Sufficiency of Signal-Space Projections for Waveform Detection

Consider the $M$-ary detection problem in continuous-time AWGN: $\mathcal{H}_m: y(t) = s_m(t) + w(t)$, $m=0,\ldots,M-1$, $t\in[0,T]$, with $w(t)$ white Gaussian noise of PSD $N_0/2$. Let $\{\phi_1,\ldots,\phi_N\}$ with $N \leq M$ be an orthonormal basis of $\operatorname{span}\{s_0,\ldots,s_{M-1}\}$ in $L^2[0,T]$. The vector of projections $\mathbf{Y}_{\mathrm{proj}} \in \mathbb{R}^N$ with components $Y_k = \int_0^T y(t)\phi_k(t)\,dt$ is a sufficient statistic for the hypothesis index.
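A discrete-time sketch of this statement, under several assumptions: waveforms are replaced by sampled vectors with i.i.d. Gaussian noise, the hypotheses are equiprobable (so the optimal rule is minimum distance), and the signal set is a made-up example. The orthonormal basis is obtained from an SVD rather than Gram--Schmidt; any orthonormal basis of the span gives an equivalent statistic.

```python
import numpy as np

rng = np.random.default_rng(1)
K, M = 200, 4                           # samples per waveform, number of hypotheses
t = np.arange(K) / K

# Hypothetical signal set: 4 waveforms that span only a 2-dimensional subspace.
c, q = np.cos(2 * np.pi * 3 * t), np.sin(2 * np.pi * 3 * t)
S = np.stack([c + q, c - q, -c + q, -c - q])        # shape (M, K)

# Orthonormal basis of span{s_m} via SVD (stand-in for Gram-Schmidt).
U, sv, _ = np.linalg.svd(S.T, full_matrices=False)
Phi = U[:, sv > 1e-9 * sv[0]]                       # shape (K, N), N <= M
N = Phi.shape[1]
S_proj = S @ Phi                                    # signal-space coordinates, (M, N)

def decide_full(y):
    """Minimum-distance decision using all K samples."""
    return int(np.argmin(np.sum((y - S) ** 2, axis=1)))

def decide_proj(y):
    """Minimum-distance decision using only the N projections."""
    y_proj = Phi.T @ y
    return int(np.argmin(np.sum((y_proj - S_proj) ** 2, axis=1)))

sigma, trials, agree = 8.0, 2000, 0
for _ in range(trials):
    m = rng.integers(M)
    y = S[m] + sigma * rng.standard_normal(K)
    agree += decide_full(y) == decide_proj(y)

print(N, agree == trials)
```

The two rules agree because $\|\mathbf{y}-\mathbf{s}_m\|^2$ splits into the distance inside the signal subspace plus the energy of the orthogonal component, and the latter is the same for every $m$.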


Key Takeaway

The Gram--Schmidt-constructed projections form a sufficient statistic for $M$-ary signal detection in AWGN. This is why every digital receiver in Chapters 8--10 of the telecom book is drawn as a correlator bank followed by a minimum-distance decoder: correlation extracts the sufficient statistic, and the remaining noise outside the signal subspace is discarded without loss.

Visualising the Sufficient-Statistic Collapse

A high-dimensional observation vector projected onto the signal subspace. The perpendicular component is noise-only and carries no information about the hypothesis.


Why This Matters: From Sufficiency to the MIMO Receiver

The sufficiency argument generalises directly to MIMO receivers: the matched-filter bank $\mathbf{H}^H \mathbf{y}$ collects sufficient statistics for detecting the symbol vector, and everything downstream (ZF, MMSE, sphere decoding) operates on these projected observations. Chapter 15 of the telecom book builds on this fact.
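As a rough illustration, the sketch below compares exhaustive ML detection from the raw observation with the same search driven only by $\mathbf{z} = \mathbf{H}^H\mathbf{y}$ and the Gram matrix $\mathbf{H}^H\mathbf{H}$. The antenna counts, the QPSK alphabet, the noise level, and the white-noise model are all assumptions made for the example.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
nr, nt = 6, 3                                   # illustrative antenna counts
H = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)

# All candidate symbol vectors for a hypothetical QPSK alphabet.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
X = np.array(list(product(qpsk, repeat=nt)))    # shape (4**nt, nt)

x_true = X[rng.integers(len(X))]
noise = 0.7 * (rng.standard_normal(nr) + 1j * rng.standard_normal(nr)) / np.sqrt(2)
y = H @ x_true + noise

# ML metric from the raw observation: ||y - Hx||^2 for every candidate x.
metric_full = np.sum(np.abs(y[None, :] - X @ H.T) ** 2, axis=1)

# Metric built from z = H^H y and G = H^H H only: x^H G x - 2 Re(x^H z).
z = H.conj().T @ y
G = H.conj().T @ H
quad = np.real(np.einsum("ki,ij,kj->k", X.conj(), G, X))
metric_suff = quad - 2 * np.real(X.conj() @ z)

# The two metrics differ by the constant ||y||^2, so they rank candidates identically.
print(np.allclose(metric_full - metric_suff, np.vdot(y, y).real))
print(np.argmin(metric_full) == np.argmin(metric_suff))
```

Once `z` and `G` are formed, `y` itself is never needed again, which is the sense in which the downstream detectors operate on the projected observation.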

Common Mistake: Sufficiency can fail when parameters are unknown

Mistake:

Assuming that the matched-filter output $T = \mathbf{s}^{\mathsf T}\mathbf{y}$ is still sufficient when the signal amplitude $A$ is unknown.

Correction:

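When $\mathbf{s}$ is replaced by $A\mathbf{s}$ with $A$ an unknown parameter, the sufficient statistic must carry enough information to infer the unknown parameters as well: typically $(\mathbf{s}^{\mathsf T}\mathbf{y},\|\mathbf{y}\|^2)$ for Gaussian noise once the noise variance $\sigma^2$ is unknown too. (With $\sigma^2$ known, the factorisation still goes through with $T$ alone, but the test applied to $T$ is no longer a simple one-sided threshold unless the sign of $A$ is known.) The GLRT from §2 is exactly the construction that uses this larger sufficient statistic correctly.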
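For reference, the factorisation behind the pair $(\mathbf{s}^{\mathsf T}\mathbf{y},\|\mathbf{y}\|^2)$ can be written out as follows. This assumes white Gaussian noise with both $A$ and $\sigma^2$ treated as unknown, which is one reading of the setting; the text does not spell out which parameters are unknown.

```latex
\begin{align*}
f(\mathbf{y};A,\sigma^2)
 &= (2\pi\sigma^2)^{-n/2}
    \exp\!\Big(-\tfrac{1}{2\sigma^2}\,\|\mathbf{y}-A\mathbf{s}\|^2\Big)\\
 &= (2\pi\sigma^2)^{-n/2}
    \exp\!\Big(-\tfrac{1}{2\sigma^2}\,\|\mathbf{y}\|^2
      +\tfrac{A}{\sigma^2}\,\mathbf{s}^{\mathsf T}\mathbf{y}
      -\tfrac{A^2}{2\sigma^2}\,\|\mathbf{s}\|^2\Big).
\end{align*}
```

Every $\mathbf{y}$-dependent term involves either $\mathbf{s}^{\mathsf T}\mathbf{y}$ or $\|\mathbf{y}\|^2$, so the pair is sufficient for $(A,\sigma^2)$ with $h(\mathbf{y}) = 1$. If $\sigma^2$ is known, the $\|\mathbf{y}\|^2$ factor no longer involves an unknown parameter and can be absorbed into $h(\mathbf{y})$.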

Quick Check

Why is the inner product $\mathbf{s}^{\mathsf T}\mathbf{y}$ sufficient for detecting a known signal in white Gaussian noise?

Because it has maximum variance among all linear statistics.

Because the likelihood ratio depends on $\mathbf{y}$ only through this inner product (Fisher--Neyman).

Because the component of $\mathbf{y}$ perpendicular to $\mathbf{s}$ is always zero.

Because the noise is Gaussian.

Sufficient statistic

A function $T(\mathbf{Y})$ of the observations such that the conditional distribution of $\mathbf{Y}$ given $T$ is free of the parameter being inferred. Sufficient statistics preserve all information about the parameter while reducing dimensionality.

Related: Fisher--Neyman Factorisation, Minimal Sufficient Statistic

CommIT Contribution (2023)

Subspace Matched Filters for Joint Sensing and Communication

G. Caire, S. Saur, A. Bazzi, IEEE Journal on Selected Areas in Information Theory

The sufficient-statistic view developed here extends naturally to integrated sensing and communication (ISAC) systems, where the same waveform must carry information and probe the environment. The CommIT group has shown that the optimal ISAC receiver decomposes the observation into orthogonal subspaces carrying (i) the communication payload and (ii) the sensing parameters, with each subspace admitting its own matched filter. The Fisher--Neyman machinery from this section is the formal underpinning of that decomposition.

Engineering Note

Sufficiency determines ADC and sampling requirements

In practice sufficiency guides system design: you only need to sample and digitise in the signal subspace. For a PAM/QAM receiver with $M$ symbols on pulse shape $p(t)$, a single matched-filter output per symbol (one complex number for QAM) is sufficient, provided symbol timing is known and the pulse is ISI-free after matched filtering; you do not need to oversample and then post-process. This is why receivers can use symbol-rate sampling after the matched filter, cutting the ADC sample rate to the symbol rate.
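A toy simulation of this point, with everything about the setup assumed for illustration: a 4-PAM alphabet, a unit-energy rectangular pulse (so there is no ISI), perfect symbol timing, and white Gaussian noise on an oversampled grid. It compares decisions made from one matched-filter sample per symbol with minimum-distance decisions made from the full oversampled waveform.

```python
import numpy as np

rng = np.random.default_rng(3)
sps, n_sym = 8, 500                         # samples per symbol, number of symbols
levels = np.array([-3.0, -1.0, 1.0, 3.0])   # hypothetical 4-PAM alphabet
p = np.ones(sps) / np.sqrt(sps)             # unit-energy rectangular pulse (assumed)

a = rng.choice(levels, size=n_sym)
tx = np.kron(a, p)                          # oversampled transmit waveform
rx = tx + 1.0 * rng.standard_normal(tx.size)

# Matched filter followed by symbol-rate sampling: one number per symbol.
mf_out = np.convolve(rx, p[::-1])           # convolve with the time-reversed pulse
z = mf_out[sps - 1 :: sps][:n_sym]          # sample at the end of each symbol

# Decisions from the symbol-rate samples.
dec_z = levels[np.argmin(np.abs(z[:, None] - levels[None, :]), axis=1)]

# Decisions from the full oversampled waveform (minimum distance per symbol block).
blocks = rx.reshape(n_sym, sps)
cands = levels[:, None] * p[None, :]        # candidate waveforms, shape (4, sps)
d2 = np.sum((blocks[:, None, :] - cands[None, :, :]) ** 2, axis=2)
dec_full = levels[np.argmin(d2, axis=1)]

print(np.array_equal(dec_z, dec_full))
```

Under these assumptions the `sps` raw samples per symbol add nothing to the decision beyond the single value in `z`. With timing uncertainty or ISI the picture changes, and this sketch does not cover those cases.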