Quality-of-Experience and Chunked Delivery

Why QoE, Not Just Rate?

The MAN rate is a network-side metric: files per channel use. Users care about a different metric: Quality of Experience (QoE). QoE captures:

  • Startup latency. Time from play-click to first frame.
  • Rebuffering. Frequency and duration of pauses.
  • Average bitrate. Perceived video resolution.
  • Bitrate switching. Frequency and magnitude of quality jumps.

Production streaming optimizes for QoE; so should QoE-aware coded caching. This section develops the QoE-caching coupling, connecting the information-theoretic $1 + K\mu$ gain to user-perceived quality.

Definition:

Standard QoE Model

A canonical QoE metric (Yin-Jindal 2015):

$$\text{QoE} \;=\; \sum_{t=1}^T q^{(t)} - \lambda_s \sum_{t=1}^{T-1} \left| q^{(t+1)} - q^{(t)} \right| - \lambda_r \cdot T_{\text{rebuf}} - \lambda_d \cdot T_{\text{startup}},$$

where:

  • $q^{(t)}$: average bitrate watched at time $t$ (higher is better).
  • $\lambda_s$: smoothness penalty (bitrate switching is jarring).
  • $\lambda_r$: rebuffering penalty (dominant concern).
  • $T_{\text{rebuf}}$: total rebuffering duration.
  • $T_{\text{startup}}$: startup delay.

The coefficients $\lambda_{s,r,d}$ are calibrated to user studies: typically $\lambda_r \approx 1$ per second of rebuffering (a very strong penalty); $\lambda_s, \lambda_d \approx 0.1$.
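The metric above can be written as a small function (a sketch; the session bitrates and stall/startup times below are made-up illustrative values):

```python
def qoe(bitrates, t_rebuf, t_startup, lam_s=0.1, lam_r=1.0, lam_d=0.1):
    """Composite QoE: total bitrate minus smoothness, rebuffering, startup penalties."""
    quality = sum(bitrates)
    smoothness = sum(abs(b - a) for a, b in zip(bitrates, bitrates[1:]))
    return quality - lam_s * smoothness - lam_r * t_rebuf - lam_d * t_startup

# Two sessions with identical bitrates; one stalls for 4 seconds.
smooth = qoe([3.0, 3.0, 3.0, 3.0], t_rebuf=0.0, t_startup=0.5)
stalled = qoe([3.0, 3.0, 3.0, 3.0], t_rebuf=4.0, t_startup=0.5)
print(smooth, stalled)  # with lam_r = 1, the 4 s stall costs 4.0 QoE points
```

With the default coefficients, a single 4-second stall outweighs the startup penalty by a factor of 80, matching the observation that rebuffering dominates.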

QoE is highly sensitive to rebuffering: users tolerate lower bitrates but hate stalls. This drives caching design toward aggressive pre-fetch and chunk availability.

QoE vs Cache Size

Composite QoE (bitrate, startup, rebuffering) vs cache size. Coded caching boosts QoE via reduced miss rate, fewer rebuffers, and faster startup.


Definition:

Chunked Video Delivery

In chunked delivery (standard in HLS, DASH), each video is divided into chunks (typically 2-10 seconds each). Chunks are the units of caching and delivery. For a video of duration $T_v$ seconds with $n_c = T_v / \tau_c$ chunks (chunk length $\tau_c$):

  • Each chunk is encoded at $L$ quality levels.
  • Client fetches chunks sequentially.
  • Server can deliver any subset of chunks in any order.

Coded chunk caching: apply MAN at chunk granularity. Each cache stores a subset of chunks; coded XOR delivery runs over $(t+1)$-subsets of users requesting different chunks.

Chunked structure refines the MAN analysis: instead of file-granular subfiles, we work with chunk-granular placement. Subpacketization benefits from chunk structure: typical video chunks are large enough ($\sim 5$ MB) to support practical combinatorial splits.
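A quick subpacketization check (a sketch; the parameter values are illustrative, not from the text — MAN splits each cached unit into $\binom{K}{t}$ subfiles with $t = K\mu$):

```python
from math import comb

# Illustrative parameters: 20 caches, each holding a 1/4 fraction.
K, mu = 20, 0.25
t = round(K * mu)            # t = 5
subfiles = comb(K, t)        # C(20, 5) subfiles per cached unit
chunk_bytes = 5_000_000      # ~5 MB chunk, as in the text
print(subfiles, chunk_bytes // subfiles)  # subfile count and size in bytes
```

Even at this modest $K$, a 5 MB chunk splits into 15504 subfiles of a few hundred bytes each, so chunk sizes set a practical ceiling on how large $K$ can grow before subfiles become smaller than packet overhead.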

Theorem: Chunk-Level Coded Caching Rate

For chunk-level coded caching with $n_c$ chunks per video, per-user cache $M$ (in chunks), and uniform chunk demand, the achievable delivery rate is

$$R \;=\; \frac{K(1 - \mu)}{1 + K\mu} \cdot \text{chunks per second}.$$

Same as MAN, applied at chunk granularity.

Chunks play the role of "files" in the basic MAN analysis. The rate formula is unchanged. Advantage: chunks are smaller, subpacketization is more practical.
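The rate expression is easy to evaluate directly (a minimal sketch; the parameter values are illustrative):

```python
def man_rate(K, mu):
    """MAN delivery rate R = K(1 - mu) / (1 + K*mu), in chunk-transmission units."""
    return K * (1 - mu) / (1 + K * mu)

# Coded vs. uncoded delivery at K = 20 users, mu = 0.25:
print(man_rate(20, 0.25))   # coded: 20 * 0.75 / 6 = 2.5
print(20 * (1 - 0.25))      # uncoded unicast of the uncached fraction: 15.0
```

The $1 + K\mu$ denominator is the multicast gain: here it cuts the load by a factor of 6.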

Example: YouTube-Scale Chunk Caching

YouTube-style streaming: $K = 1000$ concurrent viewers in a geographic region, $N = 10^7$ videos, each with 300 chunks of 5 seconds. Per-user cache: 100 chunks. Analyze delivery rate and QoE.
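The numbers can be plugged in directly (a sketch; interpreting $\mu$ as the cached fraction of the full catalog under uniform demand is an assumption of this calculation):

```python
# Worked numbers for the YouTube-scale example; mu is taken as the
# cached fraction of the *entire* catalog, assuming uniform demand.
K = 1000                          # concurrent viewers
catalog_chunks = 10**7 * 300      # 10^7 videos x 300 chunks each
cached_chunks = 100               # per-user cache, in chunks
mu = cached_chunks / catalog_chunks          # ~3.3e-8
R = K * (1 - mu) / (1 + K * mu)              # chunk-level MAN rate
print(f"mu = {mu:.2e}, R = {R:.2f}")
```

Under uniform demand over the whole catalog, $K\mu \approx 3 \times 10^{-5}$ and the coded gain is negligible; this is one reason production systems pair caching with popularity prediction, which shrinks the effective catalog and raises $\mu$.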

πŸ”§ Engineering Note

Production QoE Optimization

How production services optimize QoE:

  1. Chunk prefetch. Client fetches 2-3 chunks ahead; the buffer smooths over network variation. Coded caching reduces per-chunk server load, easing prefetch.
  2. Adaptive bitrate switching. Client selects quality based on network estimation. Caching makes bitrate upgrades cheap (base layer already cached).
  3. Regional CDN tiers. Origin β†’ regional PoP β†’ ISP cache β†’ device. Chunks propagate down on first miss. Coded caching can be applied at each tier.
  4. ML-based popularity prediction. YouTube/Netflix use ML to forecast which videos to cache where. Combines with coded caching opportunistically.

Status: Coded caching at chunk granularity is research-stage; production CDNs use LRU/LFU on chunks. Integration of coded layer is a 3-5 year practical roadmap.
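The prefetch behavior in item 1 can be sketched with a toy buffer model (all numbers are illustrative; this is not a real player):

```python
# Toy chunk-prefetch model: each tick the client plays one chunk if buffered;
# the network delivers a bursty pattern with mean 1 chunk per tick.
def stalls(delivery, prefetch):
    buffer, stalled = prefetch, 0       # start with `prefetch` chunks buffered
    for arrived in delivery:
        buffer += arrived
        if buffer >= 1:
            buffer -= 1                 # play one chunk
        else:
            stalled += 1                # rebuffer: nothing to play
    return stalled

bursty = [0, 2] * 10                    # bursty arrivals, mean 1 chunk/tick
print(stalls(bursty, prefetch=0))       # no prefetch: stalls on the first gap
print(stalls(bursty, prefetch=2))       # a 2-chunk prefetch absorbs the bursts
```

A prefetch depth of 2-3 chunks is exactly the buffer that converts bursty delivery into stall-free playback, which is why item 1 is listed first.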

Practical Constraints

  • YouTube: ML prediction + LRU chunks
  • Netflix Open Connect: static predictive + LRU
  • Akamai: regional LRU + admission control
  • Coded chunk caching: research stage


Common Mistake: Rate Optimization β‰  QoE Optimization

Mistake:

Assuming minimizing $R$ automatically maximizes QoE.

Correction:

QoE has multiple components:

  • Bitrate $\sim (1 - \text{rate penalty})$. Lower rate β†’ higher bitrate.
  • Rebuffering $\sim$ variance of delivery. Lower rate doesn't help if delivery is bursty.
  • Startup $\sim$ first-chunk latency. Depends on cache hit, not delivery rate.

A caching scheme that reduces average rate but causes rebuffering spikes can reduce QoE. Practical schemes optimize (expected bitrate) $- \lambda_r \cdot$ (rebuffering variance), not just average rate.
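The rebuffering point can be demonstrated with two delivery traces of equal mean but different variance (a toy model; all numbers are illustrative):

```python
# Two traces delivering the same total (mean 1 chunk/tick), timed differently.
def rebuffer_time(trace):
    buffer, rebuf = 0, 0
    for arrived in trace:
        buffer += arrived
        if buffer >= 1:
            buffer -= 1        # play one chunk
        else:
            rebuf += 1         # stall for this tick
    return rebuf

trace_a = [1] * 16             # smooth: mean 1, zero variance
trace_b = [0, 0, 0, 4] * 4     # bursty: same mean, delayed bursts
print(rebuffer_time(trace_a), rebuffer_time(trace_b))
```

Both traces have identical average rate, yet the bursty one stalls for 3 ticks while the smooth one never stalls: average rate alone does not determine QoE.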

Key Takeaway

Video caching targets QoE, not just rate. QoE composites bitrate, startup, rebuffering, smoothness. Coded caching helps all four via reduced miss rate and multicast efficiency. Chunk-level coded caching is a practical refinement of MAN, suitable for integration with DASH. Production integration is a 3-5 year roadmap.