Exercises
ex-ch20-e01
Easy: Define regret. What does "sublinear" mean for regret?
Regret compares the algorithm's cumulative cost to the best fixed decision in hindsight.
Sublinear: $R(T)$ grows slower than $T$.
Regret
$R(T) = \sum_{t=1}^{T} c_t(a_t) - \min_a \sum_{t=1}^{T} c_t(a)$: the extra cumulative cost versus the best-hindsight placement.
Sublinear
$R(T)/T \to 0$ as $T \to \infty$, i.e., $R(T) = o(T)$. E.g., $O(\sqrt{T})$ (adversarial), $O(\log T)$ (stochastic). The average regret per round vanishes: online matches offline asymptotically.
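The definition above can be checked on a toy instance. A minimal sketch, where the cost sequence and the played actions are made up for illustration:

```python
# Regret R(T) = sum_t c_t(a_t) - min_a sum_t c_t(a)

def regret(costs, chosen):
    """costs: one dict (action -> cost) per round; chosen: action played each round."""
    online = sum(costs[t][chosen[t]] for t in range(len(costs)))
    best_fixed = min(sum(c[a] for c in costs) for a in costs[0])
    return online - best_fixed

costs = [{"a": 1.0, "b": 0.5}, {"a": 0.2, "b": 0.9}, {"a": 1.0, "b": 0.4}]
chosen = ["a", "b", "a"]                 # online play: 2.9; best fixed ("b"): 1.8
print(regret(costs, chosen))
```

Note that the comparator is the best *fixed* action: a clairvoyant per-round comparator would give dynamic regret, a strictly harder benchmark.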
ex-ch20-e02
Easy: For the given $N$, $K$, $T$: compute the regret bound for adversarial online coded caching.
Formula
$R(T) = O(\sqrt{T \log |\mathcal{A}|})$, the standard adversarial rate for an action set $\mathcal{A}$ of placements.
Value
Substitute the given $N$, $K$, $T$ into the bound.
Per-round
Average extra cost per round is $R(T)/T$ files per use. About 3% over the MAN rate in this regime.
ex-ch20-e03
Medium: Show that LRU has $\Omega(T)$ regret against the offline coded MAN baseline.
LRU lacks the coded multicasting gain entirely.
The coded MAN rate is the uncoded rate divided by $1 + KM/N$.
LRU per-round cost
LRU delivers uncoded: $K(1-h)$ per round, where $h$ is the hit rate.
MAN per-round
$R_{\text{MAN}} = \dfrac{K(1 - M/N)}{1 + KM/N}$.
Gap
A constant gap of $K(1-h) - R_{\text{MAN}} = \Theta(1)$ per round $\Rightarrow$ $\Omega(T)$ cumulative regret.
Interpretation
LRU fundamentally cannot achieve the multicasting gain. The gap is structural, not tunable.
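The constant per-round gap can be made concrete numerically. A small sketch with illustrative parameters $N$, $K$, $M$ (not from the exercise):

```python
# Per-round delivery rates: uncoded vs. MAN coded (normalized to file size).

def uncoded_rate(N, K, M):
    # each of K users misses a (1 - M/N) fraction, served by unicast
    return K * (1 - M / N)

def man_rate(N, K, M):
    # coded multicasting reduces the uncoded rate by the gain 1 + K*M/N
    return uncoded_rate(N, K, M) / (1 + K * M / N)

N, K, M = 100, 10, 20
gap = uncoded_rate(N, K, M) - man_rate(N, K, M)
print(uncoded_rate(N, K, M), man_rate(N, K, M), gap)
# a positive constant gap per round -> cumulative regret grows linearly in T
```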
ex-ch20-e04
Medium: Derive the optimal FTPL learning rate for our setting.
Standard FTPL: $\eta^* = \sqrt{\log|\mathcal{A}| / T}$.
Standard FTPL
For a finite action set of size $|\mathcal{A}|$: $\eta^* = \sqrt{\log|\mathcal{A}| / T}$, balancing the perturbation and stability terms and giving regret $O(\sqrt{T \log|\mathcal{A}|})$.
Applied to CC
Action = placement choice; with per-user MAN placement, up to $\binom{N}{M}$ choices per user, so $|\mathcal{A}| \le \binom{N}{M}^K$ and $\log|\mathcal{A}| \le K \log\binom{N}{M}$.
Approximation
$\log\binom{N}{M} \approx N\,H_2(M/N)$, where $H_2$ is the binary entropy. Hence $\eta^* \approx \sqrt{K N H_2(M/N) / T}$.
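A quick numeric sketch of the entropy approximation and the resulting rate; the parameter values are illustrative, not the exercise's:

```python
import math

def binary_entropy(p):
    # H2(p) in bits
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def log_num_placements(N, M, K):
    # log2 |A| <= K * log2 C(N, M) ~= K * N * H2(M/N)
    return K * N * binary_entropy(M / N)

def ftpl_rate(N, M, K, T):
    # eta* = sqrt(log|A| / T)
    return math.sqrt(log_num_placements(N, M, K) / T)

N, M, K, T = 100, 20, 10, 10_000
print(round(log_num_placements(N, M, K), 1), round(ftpl_rate(N, M, K, T), 3))
```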
ex-ch20-e05
Medium: For i.i.d. Zipf-$\alpha$ demand, estimate the convergence time for the empirical popularity estimate to reach within 1% KL divergence.
Zipf concentration helps: the mass sits on a small effective support, lowering the effective dimension.
KL rate: $\mathbb{E}[D_{\mathrm{KL}}(\hat p_t \,\|\, p)] \approx (N_{\text{eff}} - 1)/(2t)$.
Convergence rate
Under Zipf-$\alpha$: the effective support size $N_{\text{eff}} \ll N$ (mass concentrated on the popular files).
For 1% KL
$\mathbb{E}[D_{\mathrm{KL}}] \approx (N_{\text{eff}} - 1)/(2t)$. For KL $= 0.01$: $t \approx 50\,(N_{\text{eff}} - 1)$.
Numerical
With $N_{\text{eff}} \ll N$, $t \approx 50\,(N_{\text{eff}} - 1)$ rounds suffice for 1% KL. Very fast.
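A sketch of these magnitudes, under two stated assumptions: perplexity $e^{H(p)}$ as one common notion of effective support, and the chi-square heuristic $\mathbb{E}[D_{\mathrm{KL}}] \approx (N_{\text{eff}}-1)/(2t)$. The catalog size and exponent are illustrative:

```python
import math

def zipf_probs(N, alpha):
    w = [1 / (i ** alpha) for i in range(1, N + 1)]
    Z = sum(w)
    return [x / Z for x in w]

def effective_support(p):
    # perplexity exp(H(p)): one notion of "effective" support size
    H = -sum(q * math.log(q) for q in p if q > 0)
    return math.exp(H)

def rounds_for_kl(p, eps=0.01):
    # chi-square heuristic: E[KL] ~= (N_eff - 1) / (2 t)  =>  t ~= (N_eff - 1) / (2 eps)
    return (effective_support(p) - 1) / (2 * eps)

p = zipf_probs(1000, 1.2)
print(round(effective_support(p), 1), round(rounds_for_kl(p)))
```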
ex-ch20-e06
Hard: Prove Theorem 20.2.2 (logarithmic regret under i.i.d. demand).
Empirical concentration + a plug-in estimate.
Per-round regret comes from the estimation error.
Setup
At round $t$, the cache is placed based on $\hat p_{t-1}$. Regret per round: the cost of the $\hat p$-optimal placement minus the cost of the $p$-optimal placement.
Taylor expansion
The cost is smooth in $p$; near the optimum the first-order term vanishes, so the per-round regret is $O(\|\hat p_t - p\|^2)$.
Concentration
$\mathbb{E}\|\hat p_t - p\|^2 = O(1/t)$.
Total
$\sum_{t=1}^{T} O(1/t) = O(\log T)$.
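The final summation step is just the harmonic series; a quick numeric check that the $O(1/t)$ per-round penalties sum to $O(\log T)$:

```python
import math

# sum_{t=1}^{T} 1/t = ln T + gamma + o(1), gamma ~ 0.5772 (Euler-Mascheroni)
def harmonic(T):
    return sum(1.0 / t for t in range(1, T + 1))

for T in (10, 1000, 100_000):
    print(T, round(harmonic(T) - math.log(T), 4))   # difference stabilizes near gamma
```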
ex-ch20-e07
Medium: Compare the adversarial and stochastic regret bounds at a common horizon $T$. Which is tighter? When would you expect each to dominate?
Adversarial
$R(T) = O(\sqrt{T})$, up to logarithmic factors in $|\mathcal{A}|$.
Stochastic
$R(T) = O(\log T)$. Vastly tighter.
When stochastic applies
Strictly i.i.d. demand (rare in practice), or piecewise-stationary demand where the within-piece bound applies.
When adversarial applies
Everything else: the worst-case bound always holds, but it is pessimistic for real workloads.
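To see how far apart the two shapes are, a magnitude comparison with all constants suppressed (so only the growth rates are meaningful):

```python
import math

def adversarial_bound(T):
    return math.sqrt(T)       # O(sqrt(T)), constants dropped

def stochastic_bound(T):
    return math.log(T)        # O(log T), constants dropped

for T in (10**3, 10**5, 10**7):
    print(T, round(adversarial_bound(T)), round(stochastic_bound(T)))
```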
ex-ch20-e08
Hard: Design an adaptive algorithm for piecewise-stationary demand with $S$ change points over $T$ rounds. What is the optimal regret?
Piecewise-stationary model
Demand is i.i.d. within each of the $S+1$ pieces. The change points are unknown.
Change-point detection
Restart the estimator when a change is detected. Cost: $O(\log(T/S))$ per piece.
Total regret
$O(S \log(T/S))$ from within-piece learning $+$ $O(S)$ from transitions and detection delay. Total $O(S \log T)$.
Interpretation
For $S = O(1)$: near-stochastic performance. For large $S$: degrades toward the adversarial $O(\sqrt{T})$ rate.
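A sketch of how the $O(S \log(T/S))$ budget behaves as $S$ grows; the constant $c$ and the parameter choices are illustrative:

```python
import math

def piecewise_regret(T, S, c=1.0):
    # c * S * log(T/S): within-piece log regret plus per-change restart cost
    return c * S * math.log(T / S)

T = 10**6
for S in (1, 100, 1000):
    print(S, round(piecewise_regret(T, S)))
```

For $S = 1$ the budget stays near $\log T$; by $S \sim 10^3$ it already exceeds $\sqrt{T} = 10^3$ in this toy scaling.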
ex-ch20-e09
Medium: Why does FTPL work, even though the action space is discrete?
FTPL adds noise to explore.
Sample complexity bounds.
Discrete action space
FTPL selects from a finite set of placements; one perturbed minimization per round is feasible.
Noise for exploration
The exponential perturbation introduces randomness: rarely-chosen placements are still selected occasionally, preventing the algorithm from getting stuck.
Regret analysis
Standard FTPL regret is $O(\sqrt{T \log|\mathcal{A}|})$. It scales only logarithmically with the action space $\Rightarrow$ it works for huge placement spaces.
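The selection rule is a one-line perturbed minimization. A minimal sketch over a hypothetical discrete set of placements; the action names, cumulative costs, and perturbation scale are made up for illustration:

```python
import random

def ftpl_choose(actions, cum_cost, eta, rng):
    # Follow the Perturbed Leader: argmin of (cumulative cost - exponential noise)
    perturbed = {a: cum_cost[a] - rng.expovariate(eta) for a in actions}
    return min(perturbed, key=perturbed.get)

rng = random.Random(0)
actions = ["p1", "p2", "p3"]                      # hypothetical placements
cum = {"p1": 5.0, "p2": 4.0, "p3": 9.0}           # hypothetical cumulative costs
picks = [ftpl_choose(actions, cum, eta=1.0, rng=rng) for _ in range(1000)]
print(picks.count("p1"), picks.count("p2"), picks.count("p3"))
```

The current leader `p2` wins most draws, but the noise occasionally selects the others, which is exactly the exploration the analysis needs.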
ex-ch20-e10
Hard: Design a coded-LRU hybrid: how would you combine LRU's admission policy with MAN-style delivery structure?
Admission via LRU
Standard LRU decides which files each user caches locally.
Delivery via MAN on the cached structure
On a miss: the server coordinates with the other users' caches and sends coded XOR combinations where possible.
Coordination complexity
Requires the server to know each cache's current LRU state. Metadata-heavy; not how real CDNs work. Alternative: decentralized caching + coded delivery.
Research status
Hybrid schemes have been studied in the literature, with variable gains depending on the workload. Active area.
ex-ch20-e11
Medium: The sliding window for non-stationary learning has a trade-off: narrow vs. wide. Derive the optimal window size.
Bias-variance tradeoff: a narrow window gives high variance; a wide one gives high bias.
Tradeoff
Narrow $W$: the empirical estimate has estimation error $O(1/\sqrt{W})$. Wide $W$: drift accumulates, giving bias $O(W\Delta)$, where $\Delta$ is the per-round drift.
Optimal W
Balance the two terms: $1/\sqrt{W} \asymp W\Delta \Rightarrow W^* = \Theta(\Delta^{-2/3})$.
Resulting regret
Per-round error $O(\Delta^{1/3})$, hence total regret $O(T^{2/3} V_T^{1/3})$ with path length $V_T = T\Delta$. Matches the general dynamic-regret bound.
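A numeric check of the balancing step, with all constants suppressed; the drift value is illustrative:

```python
# error(W) ~ 1/sqrt(W) + W * drift, minimized at W* = (2 * drift)**(-2/3)

def window_error(W, drift):
    return W ** -0.5 + W * drift

def optimal_window(drift):
    # set d/dW [W^(-1/2) + W*drift] = 0  =>  W* = (2*drift)**(-2/3)
    return (2 * drift) ** (-2 / 3)

drift = 1e-4
W = optimal_window(drift)
print(round(W, 1), round(window_error(W, drift), 4))
```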
ex-ch20-e12
Medium: Explain how change-point detection (CUSUM) can improve regret.
CUSUM
Cumulative sum of deviations; alarm when the sum exceeds a threshold. Detects abrupt shifts in the demand distribution.
Algorithmic integration
On alarm: restart the empirical estimator. Within a stable phase: the logarithmic regret bound applies.
Regret benefit
Restart cost: constant per change point. With $S$ change points: total regret $O(S \log T)$. Much better than the sliding-window dynamic-regret bound when the changes are abrupt rather than smooth drift.
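A minimal one-sided CUSUM sketch; the stream, reference level, and threshold below are illustrative, not from the text:

```python
def cusum(stream, ref, h):
    """Alarm whenever the running sum of positive deviations from ref exceeds h."""
    s, alarms = 0.0, []
    for t, x in enumerate(stream):
        s = max(0.0, s + (x - ref))   # accumulate positive deviations only
        if s > h:
            alarms.append(t)          # change detected -> restart the estimator
            s = 0.0
    return alarms

stream = [0.1] * 50 + [0.9] * 10      # abrupt shift at t = 50
print(cusum(stream, ref=0.3, h=1.0))  # first alarm shortly after the shift
```

A production detector would also use a slack parameter and a two-sided statistic, but the restart-on-alarm structure is the same.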
ex-ch20-e13
Hard: How does online coded caching interact with demand privacy (Ch. 12)? Can both be achieved simultaneously?
Online + privacy
Online: cache updates per round. Privacy: mask demands. Two orthogonal concerns.
Composition
Apply Wan-Caire demand masking per round. The server sees shuffled demands and updates the cache based on the empirical distribution of the shuffled demands.
Privacy preserved
If the masking is consistent, the server's empirical estimate lives in the shuffled index space, and the cache placement pattern is robust under a uniform permutation.
Regret bound
Unchanged from the non-private online setting up to constants: the same $O(\sqrt{T})$ (adversarial) and $O(\log T)$ (stochastic) rates. Privacy is free in the online setting too.
ex-ch20-e14
Hard: In a real CDN, can we bound the "coded-caching regret" if we use LRU for admission and coded delivery afterwards? Analyze.
Hybrid description
Admission: standard LRU (uncoordinated). Delivery: on a miss, the server picks the best coded combinations over the requesting users.
Best-case delivery
Coded delivery opportunities depend on cache overlap. LRU's per-user, demand-driven structure doesn't guarantee MAN's combinatorial overlap.
Regret
The coded gain is reduced: the effective multicasting gain is $1 + gKM/N$, where $g \in [0,1]$ is the overlap rate (not the full MAN gain $1 + KM/N$). Typically $g \ll 1$.
Recovery
The hybrid is better than uncoded LRU, but worse than coded MAN with properly coordinated placement.
ex-ch20-e15
Hard: State one important open problem in online coded caching.
Option A: Instance-dependent regret
Current bounds are worst-case. Can we tighten the regret based on the observed problem instance (e.g., "easy" traces)?
Option B: Coded + hierarchical
Online coded caching in a hierarchical network (edge + regional caches). Each level has its own online dynamics; a joint algorithm is unknown.
Option C: Fairness
Coded delivery favors users with overlapping demands. Users with rare requests may be systematically under-served. Fair online coded caching is open.