CUSUM and Change Detection
Detecting When the World Changes
The SPRT assumes the hypothesis is fixed before the experiment begins. In many applications the hypothesis changes during the observation window. A user enters a cell, a jammer switches on, a link experiences a shadowing event, a machine begins to degrade. The statistician must detect the change as quickly as possible while avoiding false alarms.
Formally, observations are i.i.d.\ with density up to some unknown change-point , and with density afterwards. The goal is a stopping rule such that, roughly, (the detection delay) is small, while (the time to a false alarm when no change occurs) is large.
The CUSUM (cumulative sum) algorithm of Page (1954) is the sequential analogue of the likelihood-ratio test for change detection. Unlike the SPRT, CUSUM runs indefinitely without terminal decision, firing an alarm whenever evidence of a change accumulates beyond a threshold. Lorden (1971) and Moustakides (1986) showed that CUSUM is minimax optimal: it minimizes the worst-case expected detection delay among all procedures with a given false-alarm rate.
Definition: Page's CUSUM Statistic
Page's CUSUM Statistic
Let as before. The CUSUM statistic is the nonnegative process
Equivalently, : the current LLR minus its running minimum. The CUSUM stopping rule with threshold is
Intuitively, is the accumulated evidence in favor of since the last time the evidence favored most strongly. The reflection at zero is what distinguishes CUSUM from a plain running sum: it resets the statistic whenever past evidence points toward , so the detector is not distracted by ancient history.
CUSUM (Cumulative Sum)
A sequential change-detection algorithm that accumulates log-likelihood ratios with reflection at zero and raises an alarm when the running statistic exceeds a threshold. Optimal under Lorden's minimax criterion.
Related: ARL (Average Run Length), Change Point
ARL (Average Run Length)
is the mean time to a false alarm when no change ever occurs. is the mean detection delay when the change occurs at time zero. These play the roles of and detection delay in change detection.
Related: CUSUM (Cumulative Sum)
Change Point
The unknown time at which the distribution of the observations switches from to . means the change never happens.
Related: CUSUM (Cumulative Sum)
Theorem: CUSUM as a Sequence of One-Sided SPRTs
Let be the CUSUM stopping time, where . Then equals the first index at which some one-sided SPRT, started at some with upper boundary and lower boundary , crosses . Equivalently, CUSUM is the supremum over all hypothetical change-points of the LLR statistic.
The running minimum acts as a "restart clock." Every time the running LLR reaches a new low, the CUSUM forgets its past and begins a fresh one-sided SPRT. The alarm fires as soon as any of these imagined restarts crosses .
Write .
Interpret the increment as an SPRT statistic that started at time .
Express the CUSUM as a maximum
By definition, where the last sum is when .
Recognize the inner sum as an SPRT statistic
For each fixed "hypothetical start" , the inner sum is the cumulative LLR of a one-sided SPRT that began observing samples at time .
Conclude
The alarm fires when , i.e., when at least one of these hypothetical SPRTs has accumulated nats of evidence.
Theorem: CUSUM ARL Approximations
For the CUSUM with threshold and log-likelihood-ratio increments with pre- and post-change means and , neglecting excess over the boundary,
In particular, to a first approximation for large : the false-alarm ARL grows exponentially with the threshold, while the detection delay grows only linearly.
The asymmetry between linear detection delay and exponential false-alarm time is what makes CUSUM so effective: doubling quadruples while only doubling .
Detection delay via Wald's identity
Under (change at time ), is a random walk with positive drift reflected at . Since the drift is positive, reflection is rare and the walk hits in about steps on average, matching Wald's identity.
False alarm via renewal theory
Under , has negative drift . The walk spends most of its time near zero, and the hitting time of is the first success in a sequence of approximately independent "excursions" above zero, each of duration . Standard renewal analysis (see Siegmund 1985) yields the stated exponential scaling.
First-order approximation
For not too small, , and the quantity plays the role of in a fixed-sample test.
Key Takeaway
The CUSUM exchanges linear detection delay for exponential false-alarm time: versus . Every nat added to decuples at the cost of one additional sample of expected detection delay.
Example: CUSUM for a Gaussian Mean Shift
Observations are i.i.d.\ before the change and after, with known and . Design a CUSUM with and compute the expected detection delay.
LLR increment
\Delta = (\mu_1 - \mu_0)/\sigmaD(f_1 | f_0) = \Delta^2/2$.
Choose threshold
For , use such that , i.e., . For (1 standard deviation shift), nats.
Detection delay
samples. Interpretation: with a mean shift, the CUSUM raises a false alarm roughly once per samples and detects a real change within 17 samples of its onset.
CUSUM Change Detector
Complexity: memory and per-sample workThe two-sided CUSUM (for changes in either direction) runs two statistics in parallel with opposite signs and fires when either exceeds . Computational cost remains per sample.
CUSUM Statistic with Injected Change Point
Simulate a sample path of observations with a change from to at a chosen time . The upper panel shows the observations and the true change-point; the lower panel shows the CUSUM statistic and threshold. Adjust , , and to explore detection delay and false alarms.
Parameters
SPRT vs. CUSUM
| Property | SPRT | CUSUM |
|---|---|---|
| Problem | Fixed hypothesis, accept/reject | Detect change in distribution |
| Stopping | Terminal decision at | Alarm and reset |
| Thresholds | Two: , | One: |
| Reflection | No | Yes, at zero |
| Performance metric | and ASN | |
| Optimality | Wald-Wolfowitz (double ASN) | Lorden-Moustakides (minimax delay) |
| Typical use | Early-stop decoding, acceptance testing | Handover, jammer detection, quality control |
Why This Matters: Handover Triggering and Jammer Detection
Handover triggering. A UE continuously measures RSRP (reference signal received power) from serving and neighbor cells. The serving-cell RSRP is modeled as a Gaussian process centered on its slow-fading mean. When the UE moves into a new cell, the mean shifts. Detecting this shift quickly β but not so quickly that ping-pong handovers occur β is a classical change-detection problem. 3GPP events A1-A6 implement thresholded, hysteresis-based variants of CUSUM-like logic.
Jammer detection. A cognitive radio sensing receiver observes the noise-plus-interference power in its frequency band. When a jammer switches on, the mean power abruptly increases. A CUSUM on the log-power samples (or on the square of band-limited noise samples) declares the jammer within a few symbol periods, enabling fast frequency hopping or spectral notching.
Hysteresis vs. CUSUM in 3GPP Measurement Events
3GPP measurement events A3 ("neighbor becomes offset better than serving") employ a time-to-trigger (TTT) parameter that requires the condition to hold for a consecutive number of measurement periods before reporting. This is essentially a dwell-time test, not a CUSUM.
A strict CUSUM has the theoretical advantage of reacting earlier when the mean shift is large and later when the shift is small, adapting to the realized signal strength. Some recent handover algorithms exploit CUSUM-style adaptive dwell; they remain non-standardized but show 10-30% ping-pong reduction in simulation.
- β’
3GPP NR TTT values: 0, 40, 64, 80, 100, ..., 5120 ms
- β’
Measurement period: 200 ms (SSB-based L3 filtering)
- β’
Hysteresis margin: 0-30 dB in 0.5 dB steps
Common Mistake: CUSUM Requires Knowing the Post-Change Density
Mistake:
Designing a CUSUM for a Gaussian mean-shift detector by picking "because that seems reasonable," then wondering why the ARLs are way off when the real shift is .
Correction:
The CUSUM increment depends on the assumed . Performance is optimal only when the assumed post- change density matches reality. When the shift size is unknown, use the GLR-CUSUM (generalized likelihood ratio): replace the fixed LLR with its supremum over a parameter set, at the cost of storage. Alternatively, use a windowed GLR for bounded complexity.
Common Mistake: ARL Is Not a Probability
Mistake:
Confusing with the false-alarm probability, or reporting "the CUSUM has , so ."
Correction:
There is no fixed sample size, so there is no per-decision false-alarm probability. is the expected time to the first false alarm β with units of samples, not a probability. A useful interpretation: if the system runs indefinitely with no change, false alarms occur at rate per sample, so plays the role of a false-alarm rate in continuous-time analogues.
Historical Note: Page and the Birth of Change Detection
1954E. S. Page introduced CUSUM charts in 1954 for industrial quality control. Shewhart's chart (1931) would flag any point outside , but it had poor detection power for small persistent mean shifts. Page's insight was to accumulate the deviations rather than threshold each individually.
Page's 1954 paper is remarkably short (six pages) and contains neither the likelihood-ratio interpretation nor the optimality theory. Those would arrive 17 years later with Lorden's 1971 proof that CUSUM is minimax with respect to worst-case change points, refined by Moustakides in 1986 to exact minimax optimality. Page himself moved on to operations research and never returned to the topic.
Historical Note: From Asymptotic to Exact Optimality
1971-1986Lorden (1971) introduced the worst-case detection delay and proved that CUSUM is asymptotically minimax β it attains the best possible delay up to first order as . Moustakides (1986) closed the gap by showing CUSUM is exactly minimax for every finite , using a dynamic programming argument.
This result placed CUSUM on the same footing as the SPRT: an exactly optimal sequential procedure, not merely an asymptotic one. It remains one of the cleanest optimality statements in sequential statistics.
Quick Check
What is the role of the reflection at zero in the CUSUM statistic?
To prevent the statistic from overflowing
To restart the evidence count after periods consistent with
To guarantee the CUSUM is unbiased
To make the statistic Gaussian
Reflection ensures past -consistent evidence does not inflate the threshold-hitting time after a real change.
Quick Check
Doubling the CUSUM threshold approximately multiplies by:
2
(quadratic in the original ARL)
Since , doubling gives , i.e., squares the original ARL.