Exercises
ex-cc-ch01-01
Easy: A shared-link network has $K$ users and $N$ files, each of size $F$ bits. Each user has cache size $M$ files. Compute the uncoded worst-case delivery rate and the (preview) MAN delivery rate.
Uncoded: $R_u = K(1 - M/N)$.
MAN: $R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N}$.
Uncoded
$R_u = K(1 - M/N)$ files.
MAN
$R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N}$ files. The gain factor is $1 + KM/N$.
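As a sanity check, the two formulas can be evaluated numerically; the triple $(N, K, M) = (10, 10, 2)$ below is an illustrative assumption, not values fixed by the exercise:

```python
# Uncoded vs. MAN delivery rate on a shared-link network.
# N, K, M are illustrative assumptions, not the exercise's original values.
N, K, M = 10, 10, 2                       # files, users, cache size (in files)

R_uncoded = K * (1 - M / N)               # R_u = K(1 - M/N)
R_man = R_uncoded / (1 + K * M / N)       # divide by the coded gain 1 + KM/N
gain = R_uncoded / R_man                  # = 1 + KM/N

print(R_uncoded, R_man, gain)             # ≈ 8.0, 2.67, 3.0
```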
ex-cc-ch01-02
Easy: Prove that for any caching scheme on a shared-link network, $R(M) \le K$ for every $M \ge 0$, and $R(N) = 0$.
Consider the trivial delivery strategy.
If $M = N$, every user holds a copy of the library.
Upper bound $R \le K$
The server can always unicast one file per user, requiring at most $K$ file units on the shared link.
$R(N) = 0$
If each user caches the entire library, every requested file is a hit; no delivery is needed.
ex-cc-ch01-03
Easy: Under Zipf demand with exponent $\alpha$ and library size $N$, show that the cache-hit ratio of popularity caching equals $M/N$ only when $\alpha = 0$ (uniform demand). Compute the hit ratio at $\alpha = 1$.
Uniform demand means every $p_n = 1/N$.
Harmonic number $H_N = \sum_{n=1}^{N} 1/n$ for $\alpha = 1$.
Uniform case
$\mathrm{hit}(0) = \sum_{n=1}^{M} \frac{1}{N} = \frac{M}{N}$.
α = 1 case
$\mathrm{hit}(1) = \frac{\sum_{n=1}^{M} 1/n}{\sum_{n=1}^{N} 1/n} = \frac{H_M}{H_N} > \frac{M}{N}$ for $M < N$.
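The hit-ratio formula can be checked numerically; the values $N = 100$, $M = 10$ below are illustrative assumptions:

```python
# Cache-hit ratio of popularity caching (cache the M most popular files)
# under Zipf(alpha) demand: hit = (sum_{n<=M} n^-alpha) / (sum_{n<=N} n^-alpha).
def zipf_hit_ratio(N, M, alpha):
    H_N = sum(n ** -alpha for n in range(1, N + 1))
    H_M = sum(n ** -alpha for n in range(1, M + 1))
    return H_M / H_N

N, M = 100, 10                            # assumed library and cache sizes
print(zipf_hit_ratio(N, M, 0.0))          # 0.1 == M/N  (uniform demand)
print(zipf_hit_ratio(N, M, 1.0))          # H_10 / H_100 ≈ 0.565 > M/N
```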
ex-cc-ch01-04
Easy: For the shared-link network, explain in one paragraph why the placement phase cannot depend on the demand vector $\mathbf{d} = (d_1, \dots, d_K)$.
The placement happens off-peak, before demands are revealed.
Answer
The placement phase populates caches during off-peak periods, before user requests have been issued. The server does not know $\mathbf{d}$ at placement time. A scheme whose placement depended on $\mathbf{d}$ would require the server to anticipate every possible demand pattern, which defeats the purpose of caching: the whole point is to pre-position content robustly.
ex-cc-ch01-05
Easy: Compute the coded caching gain $g = 1 + KM/N$, and the number of users each XOR message serves, for two instances (a) and (b) of the parameters $(N, K, M)$.
Plug into the formula $g = 1 + KM/N$.
Compute
In each instance the gain is $g = 1 + KM/N$, and each XOR message serves $t + 1 = KM/N + 1$ users; the gain equals the number of users served per multicast message.
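A short sketch that evaluates the gain; the two triples $(N, K, M)$ below are illustrative assumptions, not the exercise's own instances:

```python
# Coded caching gain g = 1 + KM/N; each XOR message serves KM/N + 1 users,
# so the gain equals the number of users served per multicast message.
# The two triples below are illustrative assumptions.
def coded_gain(N, K, M):
    return 1 + K * M / N

for N, K, M in [(10, 10, 5), (100, 20, 25)]:
    print((N, K, M), coded_gain(N, K, M))   # both give gain 6.0
```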
ex-cc-ch01-06
Medium: The scaling gap. Consider two regimes with fixed memory ratio $M/N = 0.1$. (a) Compute $R_u$ and $R_{\text{MAN}}$ for $K = 10$, and for $K = 10^4$. (b) What is the asymptotic ratio $R_u / R_{\text{MAN}}$ as $K \to \infty$?
Plug in the formulas $R_u = K(1 - M/N)$ and $R_{\text{MAN}} = R_u / (1 + KM/N)$.
For the limit: $1 + KM/N \sim KM/N$ as $K \to \infty$.
K = 10
$R_u = 10 \times 0.9 = 9$. $R_{\text{MAN}} = 9/2 = 4.5$. Ratio: 2.
K = 10^4
$R_u = 10^4 \times 0.9 = 9000$. $R_{\text{MAN}} = 9000/1001 \approx 8.99$. Ratio: $1001$.
Asymptotic ratio
$R_u / R_{\text{MAN}} = 1 + KM/N = 1 + 0.1K$. The gap scales linearly in $K$: the exact statement of why coded caching matters at CDN scale.
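The two regimes can be reproduced numerically with the fixed ratio $M/N = 0.1$:

```python
# The scaling gap at fixed memory ratio M/N = 0.1.
gamma = 0.1                               # M/N

def rates(K, gamma):
    R_u = K * (1 - gamma)                 # uncoded worst case
    R_man = R_u / (1 + K * gamma)         # MAN rate
    return R_u, R_man

for K in (10, 10 ** 4):
    R_u, R_man = rates(K, gamma)
    print(K, R_u, R_man, R_u / R_man)     # ratio = 1 + K * gamma
```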
ex-cc-ch01-07
Medium: A cache is refreshed over a 24-hour day. Peak demand occupies 10% of the day (2.4 hours); off-peak the link is otherwise idle. In 24 hours, the server must deliver the library once (library refresh), plus the peak-hour delivery rounds. Let $R_{\text{peak}}$ and $R_{\text{off}}$ be the time-averaged rates in the two periods. Derive the total traffic cost in files per user per day under popularity caching with cache size $M$, $K$ users, $N$ files, and 60-second delivery rounds.
Placement cost per day: the library refresh, a total of $N$ file-units broadcast once.
Delivery cost per peak round: $K(1 - M/N)$ files on the link.
Number of peak rounds per day: $2.4 \times 3600 / 60 = 144$.
Placement traffic
One library refresh per day: $N$ files broadcast during off-peak. This costs the link $N$ file units per day, but off-peak capacity is not the bottleneck.
Delivery traffic
Peak period: $0.1 \times 86400 = 8640$ s, i.e. 144 rounds. Per round: $K(1 - M/N)$ files. Total peak-period traffic: $144\,K(1 - M/N)$ file-units on the bottleneck link.
Comparison with MAN
MAN: per round, $\frac{K(1 - M/N)}{1 + KM/N}$ files. Total peak traffic: $\frac{144\,K(1 - M/N)}{1 + KM/N}$ file-units. MAN reduces peak-bottleneck traffic by a factor of $1 + KM/N$, but placement traffic is the same. This is why the engineering metric is peak load, not total traffic.
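A sketch of the daily accounting; the parameter values $(N, K, M) = (100, 100, 10)$ are assumptions for illustration:

```python
# Daily peak-link traffic: popularity caching vs. MAN delivery.
# N, K, M are assumed for illustration; the structure is what matters.
N, K, M = 100, 100, 10
peak_seconds = 0.10 * 24 * 3600           # 10% of the day = 8640 s
rounds = int(peak_seconds // 60)          # 60 s delivery rounds -> 144

pop_per_round = K * (1 - M / N)           # popularity caching, worst case
man_per_round = pop_per_round / (1 + K * M / N)

pop_total = rounds * pop_per_round        # peak file-units per day
man_total = rounds * man_per_round        # smaller by the factor 1 + KM/N
print(rounds, pop_total, man_total)
```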
ex-cc-ch01-08
Medium: Show that the worst-case delivery rate under popularity caching satisfies $R = K$ when $M \le N - K$, regardless of the demand distribution. In particular, popularity caching does not help in the worst case.
Consider demands in which every user requests a file not in the cache.
With $N - M \ge K$, such a demand vector exists.
Construct worst demand
Because popularity caching places the same files in every user's cache (namely, files $1, \dots, M$), the worst case is when every user requests a distinct uncached file, say user $k$ requests file $M + k$ for $k = 1, \dots, K$. This is feasible whenever $M + K \le N$.
Compute rate
Under uncoded delivery, each miss costs one full unicast. All $K$ requests miss, so $R = K$ even though a fraction $M/N$ of the files are cached: the requested files are all among the uncached ones. For this demand, the $(1 - M/N)$ local caching factor in the formula $R_u = K(1 - M/N)$ does not apply; the effective gain is $1$, and caching did not help at all.
Interpretation
Popularity caching optimizes the expected rate under a prior distribution, but provides no worst-case robustness. Coded caching (Ch 2) has provably better worst-case performance.
ex-cc-ch01-09
Medium: Derive the expected rate under popularity caching for i.i.d. Zipf demands: $\mathbb{E}[R] = K \sum_{n=M+1}^{N} p_n$, where $p_n = n^{-\alpha} / H_{N,\alpha}$. Show that for fixed $M/N$, $\mathbb{E}[R] \to K(1 - M/N)$ as $\alpha \to 0$.
$\alpha = 0$: uniform demand, $p_n = 1/N$.
Uniform case: $P_{\text{miss}} = 1 - M/N$, so $\mathbb{E}[R] = K(1 - M/N)$.
General formula
Each user independently misses with probability $P_{\text{miss}} = \sum_{n=M+1}^{N} p_n$, and each miss costs one unicast. Summing over users: $\mathbb{E}[R] = K P_{\text{miss}} = K \sum_{n=M+1}^{N} \frac{n^{-\alpha}}{H_{N,\alpha}}$, where $H_{N,\alpha} = \sum_{n=1}^{N} n^{-\alpha}$.
Uniform limit
As $\alpha \to 0$, $p_n \to 1/N$ for every $n$. Then $P_{\text{miss}} \to (N - M)/N$, and $\mathbb{E}[R] \to K(1 - M/N)$. So uniform demand is the worst case for popularity caching: there is nothing to exploit.
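A numeric check of the limit, with assumed parameters $N = K = 100$, $M = 10$:

```python
# E[R] = K * P_miss under i.i.d. Zipf(alpha) demand with the M most popular
# files cached; as alpha -> 0 this approaches K(1 - M/N). N, K, M assumed.
def expected_rate(N, K, M, alpha):
    weights = [n ** -alpha for n in range(1, N + 1)]
    p_miss = sum(weights[M:]) / sum(weights)   # files M+1..N are uncached
    return K * p_miss

N, K, M = 100, 100, 10
for alpha in (2.0, 1.0, 0.5, 0.0):
    print(alpha, expected_rate(N, K, M, alpha))
# alpha = 0.0 gives exactly K(1 - M/N) = 90.0
```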
ex-cc-ch01-10
Medium: Consider a network where each user has a different-sized cache: user $k$ has cache $M_k$, with the average $\bar{M} = \frac{1}{K} \sum_{k=1}^{K} M_k$ fixed. Show that the minimum worst-case rate under any uncoded scheme is still $K(1 - \bar{M}/N)$. That is, heterogeneous caches do not help the uncoded worst case.
Worst-case demands are still a distinct file per user.
User $k$'s cache covers a fraction $M_k/N$ of its demand.
Sum local gains
Worst case: all users request distinct files. User $k$'s rate is at least $1 - M_k/N$ (the local miss rate). Total: $\sum_{k=1}^{K} \left(1 - \frac{M_k}{N}\right) = K\left(1 - \frac{\bar{M}}{N}\right)$.
Interpretation
Under uncoded delivery, only local gains count, and the total local gain is determined by the average cache size $\bar{M}$, not by its distribution. Coded caching (Chapter 13) breaks this: heterogeneous caches do help coded delivery, because coded messages can serve users with different caches differently.
ex-cc-ch01-11
Medium: Consider $N = 2$, $K = 2$, $M = 1$, and files $A, B$ of $2$ bits each: $A = (A_1, A_2)$, $B = (B_1, B_2)$. Propose a coded placement + delivery that achieves the MAN rate for the demand $(1, 2)$.
Split: user 1 caches the 'user-1 subfiles' of every file; user 2 caches the 'user-2 subfiles'.
Send the XOR of what user 1 lacks and what user 2 lacks.
Placement
User 1 caches $\{A_1, B_1\}$ (the 'user-1-indexed' halves of each file). User 2 caches $\{A_2, B_2\}$. Each cache holds $2$ bits, exactly the allowed budget of $M = 1$ file.
Delivery for $(1, 2)$
User 1 wants $A$; user 1 already has $A_1$, needs $A_2$. User 2 wants $B$; user 2 already has $B_2$, needs $B_1$. Server sends one XOR bit: $A_2 \oplus B_1$. User 1 recovers $A_2$ using the $B_1$ in its cache; user 2 recovers $B_1$ similarly. Total: 1 bit $= 1/2$ file. This matches $R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N} = \frac{2 \cdot (1/2)}{2} = \frac{1}{2}$.
Check worst case
The same placement works for the other demand patterns; the worst-case rate is $1/2$ file. This is the MAN scheme in miniature.
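The miniature scheme can be simulated directly on 2-bit files:

```python
# N = K = 2, M = 1 MAN scheme on 2-bit files A = (A1, A2), B = (B1, B2).
import random

A = [random.randint(0, 1), random.randint(0, 1)]
B = [random.randint(0, 1), random.randint(0, 1)]

cache1 = {"A1": A[0], "B1": B[0]}     # user 1 stores the index-1 subfiles
cache2 = {"A2": A[1], "B2": B[1]}     # user 2 stores the index-2 subfiles

# Demand (1, 2): user 1 wants A, user 2 wants B. One XOR bit serves both.
xor_bit = A[1] ^ B[0]                 # A2 xor B1

decoded_A = [cache1["A1"], xor_bit ^ cache1["B1"]]   # user 1 recovers A2
decoded_B = [xor_bit ^ cache2["A2"], cache2["B2"]]   # user 2 recovers B1
assert decoded_A == A and decoded_B == B             # rate: 1 bit = 1/2 file
```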
ex-cc-ch01-12
Hard: Information-theoretic cut-set lower bound (preview). Prove that for any shared-link scheme, $R(M) \ge s - \frac{sM}{\lfloor N/s \rfloor}$ for every integer $s \in \{1, \dots, \min(N, K)\}$. This is the classical cut-set bound of Maddah-Ali–Niesen 2014, which we will develop fully in Chapter 3.
Pick $s$ users and consider the cut separating them from the rest.
Use Fano's inequality and entropies of subsets of files.
The key is that the caches + delivery messages together must determine $s\lfloor N/s \rfloor$ files.
Choose a cut
Fix $s$ users, say $1, \dots, s$. Consider $\lfloor N/s \rfloor$ demand vectors, each requesting $s$ distinct files from the library, such that the union of requested files covers a subset $\mathcal{F}$ with $|\mathcal{F}| = s\lfloor N/s \rfloor$.
Apply Fano
The $s$ caches plus the $\lfloor N/s \rfloor$ delivery messages across these rounds must determine all files in $\mathcal{F}$, of total size $s\lfloor N/s \rfloor F$ bits.
Bound entropies
Cache entropy: $\sum_{k=1}^{s} H(Z_k) \le sMF$. The total delivery bit budget is $\lfloor N/s \rfloor R F$. Together they must support $s\lfloor N/s \rfloor F$ bits of file: $sMF + \lfloor N/s \rfloor R F \ge s\lfloor N/s \rfloor F$. Rearranging: $R \ge s - \frac{sM}{\lfloor N/s \rfloor}$.
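The bound is easy to evaluate numerically and compare against the achievable MAN rate; the instance $(N, K, M) = (10, 10, 2)$ is an assumption for illustration:

```python
# Cut-set lower bound R(M) >= max_s [ s - s*M / floor(N/s) ], s = 1..min(N, K),
# compared against the achievable MAN rate. (N, K, M) assumed for illustration.
def cutset_bound(N, K, M):
    return max(s - s * M / (N // s) for s in range(1, min(N, K) + 1))

def man_rate(N, K, M):
    return K * (1 - M / N) / (1 + K * M / N)

N, K, M = 10, 10, 2
print(cutset_bound(N, K, M), man_rate(N, K, M))   # bound <= achievable rate
```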
ex-cc-ch01-13
Hard: Non-uniform demand and the end of MAN. Suppose demand is Zipf with $\alpha = 2$ (very concentrated). Show by example that popularity caching can achieve an expected rate strictly lower than the MAN worst-case rate for $N = 100$, $K = 100$, $M = 10$.
MAN is a worst-case result; popularity is average-case under a prior.
Compute numerically.
MAN worst-case
$R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N} = \frac{100 \times 0.9}{11} \approx 8.18$.
Popularity expected
$H_{100,2} = \sum_{n=1}^{100} n^{-2} \approx 1.635$; $p_n = n^{-2}/H_{100,2}$; $P_{\text{miss}} = \sum_{n=11}^{100} p_n \approx 0.052$. Serving each distinct uncached requested file with one broadcast, $\mathbb{E}[R] = \sum_{n=11}^{100} \bigl(1 - (1 - p_n)^{100}\bigr) \approx 4.7$.
Conclusion
Under heavy Zipf, popularity's expected rate (4.7) beats MAN's worst-case rate (8.18). This is not a contradiction: MAN is worst-case optimal, but most demand vectors are not worst case. The right comparison is average-case MAN vs. popularity, which is subtle and chapter-13 material. The broader lesson: the "right" scheme depends on whether you want worst-case robustness (MAN) or average-case efficiency (popularity, or a decentralized hybrid).
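The two numbers can be reproduced as follows, counting one broadcast per distinct uncached requested file for the popularity scheme:

```python
# Heavy Zipf (alpha = 2), N = 100 files, K = 100 users, M = 10:
# MAN worst-case rate vs. expected rate of popularity caching, where each
# distinct uncached requested file is broadcast once.
N, K, M, alpha = 100, 100, 10, 2.0

R_man = K * (1 - M / N) / (1 + K * M / N)              # 90/11 ≈ 8.18

H = sum(n ** -alpha for n in range(1, N + 1))
p = [n ** -alpha / H for n in range(1, N + 1)]         # Zipf probabilities
E_R_pop = sum(1 - (1 - p[n]) ** K for n in range(M, N))
print(round(R_man, 2), round(E_R_pop, 1))              # ≈ 8.18 vs ≈ 4.7
```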
ex-cc-ch01-14
Challenge: Consider a shared-link network where users have independent caches but the library changes over time (new files arrive at rate $\lambda$ files per round; old files leave at the same rate). Model the effective popularity distribution as the steady state of this library evolution and analyze the tradeoff between placement refresh rate and expected delivery rate under popularity caching. Discuss what this implies for hybrid popularity/coded schemes in real CDNs.
Model library turnover as a birth–death process.
The cache must be updated when popular files are replaced.
Consider the tradeoff between placement refresh overhead and delivery savings.
Model turnover
Let $p_n$ be the steady-state popularity of file $n$ in a library that turns over at rate $\lambda$. If placement refreshes at rate $\mu$ and demand is i.i.d. $p$, the cache always contains the most popular $M$ files up to a lag of order $1/\mu$.
Expected rate
$\mathbb{E}[R] \approx K\bigl(P_{\text{miss}} + \epsilon(\lambda/\mu)\bigr)$, where $P_{\text{miss}} = \sum_{n > M} p_n$ and $\epsilon(\lambda/\mu)$ is the extra miss probability due to stale placement. The last term accounts for popular files that have changed but not yet been refreshed into caches.
Cost accounting
Placement traffic: $\mu M$ file-units per round. Delivery traffic: $K\bigl(P_{\text{miss}} + \epsilon(\lambda/\mu)\bigr)$ per round. Minimize the sum over $\mu$: the optimal refresh rate balances refresh overhead against staleness misses. Real CDNs run refresh windows of minutes to hours, matching this optimum for typical libraries.
Hybrid implications
For a coded-caching system, the refresh overhead is even higher — the subfiles must all be kept consistent. This is why production CDNs still use popularity caching: the operational complexity of coded refresh exceeds the delivery savings for high-churn libraries. The research question is: can we design a coded scheme whose subfiles are stable under library turnover? Chapter 20 (online coded caching) treats this.
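A toy numeric sketch of the refresh-rate tradeoff; the exponential staleness penalty $\epsilon$ and all parameter values below are modeling assumptions, not results from the text:

```python
# Toy refresh-rate tradeoff: per-round link cost = placement + delivery.
# The staleness penalty eps(lam/mu) is an ASSUMED exponential model, chosen
# only to make the tradeoff concrete; all parameter values are illustrative.
import math

def total_cost(mu, lam, K, M, p_miss):
    eps = p_miss * (1 - math.exp(-lam / mu))   # assumed staleness penalty
    return mu * M + K * (p_miss + eps)         # placement + delivery

K, M, lam, p_miss = 100, 10, 0.05, 0.05
best_mu = min((mu / 100 for mu in range(1, 200)),
              key=lambda mu: total_cost(mu, lam, K, M, p_miss))
print(best_mu)   # an interior optimum: refresh neither constantly nor never
```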
ex-cc-ch01-15
Challenge: A fundamental question. Prove or disprove: for any demand distribution $p$ (not just uniform), the expected delivery rate of any (possibly coded) scheme with per-user cache $M$ is bounded below by $c_p \cdot \frac{K(1 - M/N)}{1 + KM/N}$, where $c_p$ is a concentration correction specific to distribution $p$.
This is an open problem in coded caching — file it as a research direction.
The difficulty is that popularity structure can both help coded placement (by focusing caching on popular files) and hurt (by concentrating demand on a subset of the library).
State of the art
This question is open in full generality. For two-class demand (popular + unpopular), characterization was obtained by Hachem et al. 2017. For general Zipf demand, only order-optimal schemes are known (Zhang–Pedarsani–Ji 2015, Ji–Tulino–Llorca–Caire 2017). See Chapter 13 for the state of the art.
Why this is hard
Coded caching's gain comes from multicasting across users with different demands. Concentrated popularity reduces the diversity of demands, narrowing the opportunity for coded gain. But concentrated popularity also makes uncoded popularity caching better. The crossover is non-trivial and depends jointly on $(p, M, K, N)$.
Research direction
Open problem: characterize $c_p$ exactly. Progress has been made in the two extreme cases ($p$ uniform, $p$ a delta) and in scaling regimes. This is an excellent PhD project for a student who enjoys both combinatorics and optimization.