Non-Stationary Demand and Concept Drift

Real Demand Is Non-Stationary

The i.i.d. assumption underlying log-regret bounds (Β§20.2) is violated by real content demand:

  • Daily / weekly periodicity. Work-hour news vs evening entertainment; weekday vs weekend.
  • Viral spikes. A trending video's popularity rises and falls on ∼\sim 1-week timescales.
  • Long-tail drift. Catalog popularity distribution reshapes with new content uploads.
  • Seasonal shifts. Summer sports vs winter sports; movie release calendar.

Under non-stationary demand, standard log-regret bounds fail. Regret against a single static optimum becomes unbounded. The correct benchmark is dynamic regret: compare against a piecewise-constant or slowly-changing reference.

Definition:

Dynamic Regret

For a non-stationary demand process, dynamic regret is RTdynβ€…β€Š=β€…β€Šβˆ‘t=1TC(t)(A(t))βˆ’βˆ‘t=1Tmin⁑ZC(t)(Z).\mathcal{R}_T^\text{dyn} \;=\; \sum_{t=1}^T C^{(t)}(\mathcal{A}^{(t)}) - \sum_{t=1}^T \min_\mathbf{Z} C^{(t)}(\mathbf{Z}). The reference is the best per-round placement (knows d(t)\mathbf{d}^{(t)} before committing). Much stronger than static regret.

Under non-stationarity measured by total variation VTV_T (how much the optimal placement drifts over TT rounds), dynamic regret is bounded by O(T2/3VT1/3)O(T^{2/3} V_T^{1/3}) in general.

Dynamic regret can exceed static regret because the reference adapts. The bound T2/3VT1/3T^{2/3} V_T^{1/3} reveals the tradeoff: slow drift (VTV_T small) allows tight regret; fast drift (VT→TV_T \to T) makes learning impossible.

Theorem: Dynamic Regret for Coded Online Caching

For coded online caching under non-stationary demand with path length VT=βˆ‘tβˆ₯Ο€(t)βˆ’Ο€(tβˆ’1)βˆ₯V_T = \sum_t \|\pi^{(t)} - \pi^{(t-1)}\|, there exists an algorithm (with restart / tracking mechanism) achieving RTdynβ€…β€Š=β€…β€ŠO ⁣(T2/3VT1/3log⁑N).\mathcal{R}_T^\text{dyn} \;=\; O\!\left(T^{2/3} V_T^{1/3} \sqrt{\log N}\right). For slow drift (VT=TΞ³V_T = T^\gamma with Ξ³<1\gamma < 1), regret is sublinear.

The algorithm uses a sliding-window estimate of recent demands or a restart mechanism when drift is detected. Standard machinery from online learning with drifting environments. The T2/3T^{2/3} scaling is inherent: tracking a changing target is harder than tracking a fixed one.

Adaptive vs Static Placement Over Time

Delivery rate convergence of adaptive online caching vs static MAN placement. Adaptive approaches oracle-optimal as it learns. Static stays at MAN baseline forever.

Parameters
10
0.2

Drift Detection Techniques

Detecting non-stationarity in real-time enables adaptive schemes to restart or re-weight:

  • Sliding window. Use the last WW observations; drop older. Simple; requires tuning WW.
  • Exponential weighting. Weight observation tt by eβˆ’Ξ»(Tβˆ’t)e^{-\lambda (T - t)}. Continuous; no window boundary.
  • Change-point detection. Statistical tests (CUSUM, Shiryaev-Roberts) to identify drift events. Restart the estimator at detected change points.
  • ML-based predictors. Neural networks predicting Ο€(t+1)\pi^{(t+1)} from recent history. Powerful but harder to theoretically analyze.

Practical systems use combinations: windowed estimation + ML prediction for popular-shift forecasting.

πŸ”§Engineering Note

How Production CDNs Handle Drift

Real CDNs' approaches to non-stationary demand:

  1. Fast admission, slow eviction. Admit new items quickly; evict only when cache full. LRU does this implicitly.
  2. Popularity prediction. ML models (on publisher metadata: upload time, social signals) predict popularity. Cache forward-looking.
  3. Geographic specialization. Edge caches specialize per region; trending local content cached locally.
  4. Pre-positioning. For scheduled events (live sports, movie releases), preload caches.

Coded online caching could augment each with multicasting gain β€” but integration requires coded-delivery-aware admission logic, not yet commercialized.

Practical Constraints
  • β€’

    Akamai: regional pre-positioning for scheduled events

  • β€’

    Netflix: Popularity-based prediction + edge caching

  • β€’

    Meta/TikTok: Heavy ML-based popularity prediction

  • β€’

    Coded augmentation: research frontier

,

Common Mistake: Don't Assume Stationarity Blindly

Mistake:

Using i.i.d. log-regret bounds to claim performance in non-stationary regimes.

Correction:

Log-regret is proven under i.i.d. Under concept drift, log-regret may be infinite. Always verify the assumption matches the workload.

If drift is suspected: (i) measure drift rate VTV_T from history; (ii) use dynamic-regret bounds; (iii) allocate algorithmic budget for tracking.

Key Takeaway

Real demand drifts β€” \\mathcal{R}_T^\\text{dyn} = O(T^{2/3} V_T^{1/3}) is the right bound. Algorithmic responses include sliding windows, exponential weighting, and change-point detection. Production CDNs handle drift with LRU + ML prediction; coded augmentation is an open research opportunity.