Exercises
ex-cc-ch01-01
Easy: A shared-link network has $K$ users and $N$ files, each of size $F$ bits. Each user has cache size $M$ files. Compute the uncoded worst-case delivery rate and the (preview) MAN delivery rate.
Uncoded: $R_u = K(1 - M/N)$.
MAN: $R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N}$.
Uncoded
$R_u = K(1 - M/N)$ files.
MAN
$R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N}$ files. The gain factor is $1 + KM/N$.
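As a sanity check, the two formulas can be evaluated numerically; the triple $(N, K, M) = (10, 10, 2)$ below is an illustrative assumption, not values fixed by the exercise:

```python
# Uncoded vs. MAN delivery rate on a shared-link network.
# N, K, M are illustrative assumptions, not the exercise's original values.
N, K, M = 10, 10, 2                       # files, users, cache size (in files)

R_uncoded = K * (1 - M / N)               # R_u = K(1 - M/N)
R_man = R_uncoded / (1 + K * M / N)       # divide by the coded gain 1 + KM/N
gain = R_uncoded / R_man                  # = 1 + KM/N

print(R_uncoded, R_man, gain)             # ≈ 8.0, 2.67, 3.0
```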
ex-cc-ch01-02
Easy: Prove that for any caching scheme on a shared-link network, $R(M) \le K$ for every $M \ge 0$, and $R(N) = 0$.
Consider the trivial delivery strategy.
If $M = N$, every user holds a copy of the library.
Upper bound $R \le K$
The server can always unicast one file per user, requiring at most $K$ file units on the shared link.
$R(N) = 0$
If each user caches the entire library, every requested file is a hit; no delivery is needed.
ex-cc-ch01-03
Easy: Under Zipf demand with exponent $\alpha$ and library size $N$, show that the cache-hit ratio of popularity caching equals $M/N$ only when $\alpha = 0$ (uniform demand). Compute the hit ratio at $\alpha = 1$.
Uniform demand means every $p_n = 1/N$.
Harmonic number $H_N = \sum_{n=1}^{N} 1/n$ for $\alpha = 1$.
Uniform case
$\mathrm{hit}(0) = \sum_{n=1}^{M} \frac{1}{N} = \frac{M}{N}$.
α = 1 case
$\mathrm{hit}(1) = \frac{\sum_{n=1}^{M} 1/n}{\sum_{n=1}^{N} 1/n} = \frac{H_M}{H_N} > \frac{M}{N}$ for $M < N$.
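The hit-ratio formula can be checked numerically; the values $N = 100$, $M = 10$ below are illustrative assumptions:

```python
# Cache-hit ratio of popularity caching (cache the M most popular files)
# under Zipf(alpha) demand: hit = (sum_{n<=M} n^-alpha) / (sum_{n<=N} n^-alpha).
def zipf_hit_ratio(N, M, alpha):
    H_N = sum(n ** -alpha for n in range(1, N + 1))
    H_M = sum(n ** -alpha for n in range(1, M + 1))
    return H_M / H_N

N, M = 100, 10                            # assumed library and cache sizes
print(zipf_hit_ratio(N, M, 0.0))          # 0.1 == M/N  (uniform demand)
print(zipf_hit_ratio(N, M, 1.0))          # H_10 / H_100 ≈ 0.565 > M/N
```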
ex-cc-ch01-04
Easy: For the shared-link network, explain in one paragraph why the placement phase cannot depend on the demand vector $\mathbf{d} = (d_1, \dots, d_K)$.
The placement happens off-peak, before demands are revealed.
Answer
The placement phase populates caches during off-peak periods, before user requests have been issued. The server does not know $\mathbf{d}$ at placement time. A scheme whose placement depended on $\mathbf{d}$ would require the server to anticipate every possible demand pattern, which defeats the purpose of caching: the whole point is to pre-position content robustly.
ex-cc-ch01-05
Easy: Compute the coded caching gain $g = 1 + KM/N$, and the number of users each XOR message serves, for two instances (a) and (b) of the parameters $(N, K, M)$.
Plug into the formula $g = 1 + KM/N$.
Compute
In each instance the gain is $g = 1 + KM/N$, and each XOR message serves $t + 1 = KM/N + 1$ users; the gain equals the number of users served per multicast message.
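A short sketch that evaluates the gain; the two triples $(N, K, M)$ below are illustrative assumptions, not the exercise's own instances:

```python
# Coded caching gain g = 1 + KM/N; each XOR message serves KM/N + 1 users,
# so the gain equals the number of users served per multicast message.
# The two triples below are illustrative assumptions.
def coded_gain(N, K, M):
    return 1 + K * M / N

for N, K, M in [(10, 10, 5), (100, 20, 25)]:
    print((N, K, M), coded_gain(N, K, M))   # both give gain 6.0
```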
ex-cc-ch01-06
Medium: The scaling gap. Consider two regimes with fixed memory ratio $M/N = 0.1$. (a) Compute $R_u$ and $R_{\text{MAN}}$ for $K = 10$, and for $K = 10^4$. (b) What is the asymptotic ratio $R_u / R_{\text{MAN}}$ as $K \to \infty$?
Plug in the formulas $R_u = K(1 - M/N)$ and $R_{\text{MAN}} = R_u / (1 + KM/N)$.
For the limit: $1 + KM/N \sim KM/N$ as $K \to \infty$.
K = 10
$R_u = 10 \times 0.9 = 9$. $R_{\text{MAN}} = 9/2 = 4.5$. Ratio: 2.
K = 10^4
$R_u = 10^4 \times 0.9 = 9000$. $R_{\text{MAN}} = 9000/1001 \approx 8.99$. Ratio: $1001$.
Asymptotic ratio
$R_u / R_{\text{MAN}} = 1 + KM/N = 1 + 0.1K$. The gap scales linearly in $K$: the exact statement of why coded caching matters at CDN scale.
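The two regimes can be reproduced numerically with the fixed ratio $M/N = 0.1$:

```python
# The scaling gap at fixed memory ratio M/N = 0.1.
gamma = 0.1                               # M/N

def rates(K, gamma):
    R_u = K * (1 - gamma)                 # uncoded worst case
    R_man = R_u / (1 + K * gamma)         # MAN rate
    return R_u, R_man

for K in (10, 10 ** 4):
    R_u, R_man = rates(K, gamma)
    print(K, R_u, R_man, R_u / R_man)     # ratio = 1 + K * gamma
```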
ex-cc-ch01-07
Medium: A cache is refreshed over a 24-hour day. Peak demand occupies 10% of the day (2.4 hours); off-peak the link is otherwise idle. In 24 hours, the server must deliver the library once (library refresh), plus the peak-hour delivery rounds. Let $R_{\text{peak}}$ and $R_{\text{off}}$ be the time-averaged rates in the two periods. Derive the total traffic cost in files per user per day under popularity caching with cache size $M$, $K$ users, $N$ files, and 60-second delivery rounds.
Placement cost per day: the library refresh, a total of $N$ file-units broadcast once.
Delivery cost per peak round: $K(1 - M/N)$ files on the link.
Number of peak rounds per day: $2.4 \times 3600 / 60 = 144$.
Placement traffic
One library refresh per day: $N$ files broadcast during off-peak. This costs the link $N$ file units per day, but off-peak capacity is not the bottleneck.
Delivery traffic
Peak period: $0.1 \times 86400 = 8640$ s, i.e. 144 rounds. Per round: $K(1 - M/N)$ files. Total peak-period traffic: $144\,K(1 - M/N)$ file-units on the bottleneck link.
Comparison with MAN
MAN: per round, $\frac{K(1 - M/N)}{1 + KM/N}$ files. Total peak traffic: $\frac{144\,K(1 - M/N)}{1 + KM/N}$ file-units. MAN reduces peak-bottleneck traffic by a factor of $1 + KM/N$, but placement traffic is the same. This is why the engineering metric is peak load, not total traffic.
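A sketch of the daily accounting; the parameter values $(N, K, M) = (100, 100, 10)$ are assumptions for illustration:

```python
# Daily peak-link traffic: popularity caching vs. MAN delivery.
# N, K, M are assumed for illustration; the structure is what matters.
N, K, M = 100, 100, 10
peak_seconds = 0.10 * 24 * 3600           # 10% of the day = 8640 s
rounds = int(peak_seconds // 60)          # 60 s delivery rounds -> 144

pop_per_round = K * (1 - M / N)           # popularity caching, worst case
man_per_round = pop_per_round / (1 + K * M / N)

pop_total = rounds * pop_per_round        # peak file-units per day
man_total = rounds * man_per_round        # smaller by the factor 1 + KM/N
print(rounds, pop_total, man_total)
```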
ex-cc-ch01-08
Medium: Show that the worst-case delivery rate under popularity caching satisfies $R = K$ when $M \le N - K$, regardless of the demand distribution. In particular, popularity caching does not help in the worst case.
Consider demands in which every user requests a file not in the cache.
With $N - M \ge K$, such a demand vector exists.
Construct worst demand
Because popularity caching places the same files in every user's cache (namely, files $1, \dots, M$), the worst case is when every user requests a distinct uncached file, say user $k$ requests file $M + k$ for $k = 1, \dots, K$. This is feasible whenever $M + K \le N$.
Compute rate
Under uncoded delivery, each miss costs one full unicast. All $K$ requests miss, so $R = K$ even though a fraction $M/N$ of the files are cached: the requested files are all among the uncached ones. For this demand, the $(1 - M/N)$ local caching factor in the formula $R_u = K(1 - M/N)$ does not apply; the effective gain is $1$, and caching did not help at all.
Interpretation
Popularity caching optimizes the expected rate under a prior distribution, but provides no worst-case robustness. Coded caching (Ch 2) has provably better worst-case performance.
ex-cc-ch01-09
Medium: Derive the expected rate under popularity caching for i.i.d. Zipf demands: $\mathbb{E}[R] = K \sum_{n=M+1}^{N} p_n$, where $p_n = n^{-\alpha} / H_{N,\alpha}$. Show that for fixed $M/N$, $\mathbb{E}[R] \to K(1 - M/N)$ as $\alpha \to 0$.
$\alpha = 0$: uniform demand, $p_n = 1/N$.
Uniform case: $P_{\text{miss}} = 1 - M/N$, so $\mathbb{E}[R] = K(1 - M/N)$.
General formula
Each user independently misses with probability $P_{\text{miss}} = \sum_{n=M+1}^{N} p_n$, and each miss costs one unicast. Summing over users: $\mathbb{E}[R] = K P_{\text{miss}} = K \sum_{n=M+1}^{N} \frac{n^{-\alpha}}{H_{N,\alpha}}$, where $H_{N,\alpha} = \sum_{n=1}^{N} n^{-\alpha}$.
Uniform limit
As $\alpha \to 0$, $p_n \to 1/N$ for every $n$. Then $P_{\text{miss}} \to (N - M)/N$, and $\mathbb{E}[R] \to K(1 - M/N)$. So uniform demand is the worst case for popularity caching: there is nothing to exploit.
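A numeric check of the limit, with assumed parameters $N = K = 100$, $M = 10$:

```python
# E[R] = K * P_miss under i.i.d. Zipf(alpha) demand with the M most popular
# files cached; as alpha -> 0 this approaches K(1 - M/N). N, K, M assumed.
def expected_rate(N, K, M, alpha):
    weights = [n ** -alpha for n in range(1, N + 1)]
    p_miss = sum(weights[M:]) / sum(weights)   # files M+1..N are uncached
    return K * p_miss

N, K, M = 100, 100, 10
for alpha in (2.0, 1.0, 0.5, 0.0):
    print(alpha, expected_rate(N, K, M, alpha))
# alpha = 0.0 gives exactly K(1 - M/N) = 90.0
```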
ex-cc-ch01-10
Medium: Consider a network where each user has a different-sized cache: user $k$ has cache $M_k$, with the average $\bar{M} = \frac{1}{K} \sum_{k=1}^{K} M_k$ fixed. Show that the minimum worst-case rate under any uncoded scheme is still $K(1 - \bar{M}/N)$. That is, heterogeneous caches do not help the uncoded worst case.
Worst-case demands are still a distinct file per user.
User $k$'s cache covers a fraction $M_k/N$ of its demand.
Sum local gains
Worst case: all users request distinct files. User $k$'s rate is at least $1 - M_k/N$ (the local miss rate). Total: $\sum_{k=1}^{K} \left(1 - \frac{M_k}{N}\right) = K\left(1 - \frac{\bar{M}}{N}\right)$.
Interpretation
Under uncoded delivery, only local gains count, and the total local gain is determined by the average cache size $\bar{M}$, not by its distribution. Coded caching (Chapter 13) breaks this: heterogeneous caches do help coded delivery, because coded messages can serve users with different caches differently.
ex-cc-ch01-11
Medium: Consider $N = 2$, $K = 2$, $M = 1$, and files $A, B$ of $2$ bits each: $A = (A_1, A_2)$, $B = (B_1, B_2)$. Propose a coded placement + delivery that achieves the MAN rate for the demand $(1, 2)$.
Split: user 1 caches the 'user-1 subfiles' of every file; user 2 caches the 'user-2 subfiles'.
Send the XOR of what user 1 lacks and what user 2 lacks.
Placement
User 1 caches $\{A_1, B_1\}$ (the 'user-1-indexed' halves of each file). User 2 caches $\{A_2, B_2\}$. Each cache holds $2$ bits, exactly the allowed budget of $M = 1$ file.
Delivery for $(1, 2)$
User 1 wants $A$; user 1 already has $A_1$, needs $A_2$. User 2 wants $B$; user 2 already has $B_2$, needs $B_1$. Server sends one XOR bit: $A_2 \oplus B_1$. User 1 recovers $A_2$ using the $B_1$ in its cache; user 2 recovers $B_1$ similarly. Total: 1 bit $= 1/2$ file. This matches $R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N} = \frac{2 \cdot (1/2)}{2} = \frac{1}{2}$.
Check worst case
The same placement works for the other demand patterns; the worst-case rate is $1/2$ file. This is the MAN scheme in miniature.
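The miniature scheme can be simulated directly on 2-bit files:

```python
# N = K = 2, M = 1 MAN scheme on 2-bit files A = (A1, A2), B = (B1, B2).
import random

A = [random.randint(0, 1), random.randint(0, 1)]
B = [random.randint(0, 1), random.randint(0, 1)]

cache1 = {"A1": A[0], "B1": B[0]}     # user 1 stores the index-1 subfiles
cache2 = {"A2": A[1], "B2": B[1]}     # user 2 stores the index-2 subfiles

# Demand (1, 2): user 1 wants A, user 2 wants B. One XOR bit serves both.
xor_bit = A[1] ^ B[0]                 # A2 xor B1

decoded_A = [cache1["A1"], xor_bit ^ cache1["B1"]]   # user 1 recovers A2
decoded_B = [xor_bit ^ cache2["A2"], cache2["B2"]]   # user 2 recovers B1
assert decoded_A == A and decoded_B == B             # rate: 1 bit = 1/2 file
```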
ex-cc-ch01-12
Hard: Information-theoretic cut-set lower bound (preview). Prove that for any shared-link scheme, $R(M) \ge s - \frac{sM}{\lfloor N/s \rfloor}$ for every integer $s \in \{1, \dots, \min(N, K)\}$. This is the classical cut-set bound of Maddah-Ali–Niesen 2014, which we will develop fully in Chapter 3.
Pick $s$ users and consider the cut separating them from the rest.
Use Fano's inequality and entropies of subsets of files.
The key is that the caches + delivery messages together must determine $s\lfloor N/s \rfloor$ files.
Choose a cut
Fix $s$ users, say $1, \dots, s$. Consider $\lfloor N/s \rfloor$ demand vectors, each requesting $s$ distinct files from the library, such that the union of requested files covers a subset $\mathcal{F}$ with $|\mathcal{F}| = s\lfloor N/s \rfloor$.
Apply Fano
The $s$ caches plus the $\lfloor N/s \rfloor$ delivery messages across these rounds must determine all files in $\mathcal{F}$, of total size $s\lfloor N/s \rfloor F$ bits.
Bound entropies
Cache entropy: $\sum_{k=1}^{s} H(Z_k) \le sMF$. The total delivery bit budget is $\lfloor N/s \rfloor R F$. Together they must support $s\lfloor N/s \rfloor F$ bits of file: $sMF + \lfloor N/s \rfloor R F \ge s\lfloor N/s \rfloor F$. Rearranging: $R \ge s - \frac{sM}{\lfloor N/s \rfloor}$.
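The bound is easy to evaluate numerically and compare against the achievable MAN rate; the instance $(N, K, M) = (10, 10, 2)$ is an assumption for illustration:

```python
# Cut-set lower bound R(M) >= max_s [ s - s*M / floor(N/s) ], s = 1..min(N, K),
# compared against the achievable MAN rate. (N, K, M) assumed for illustration.
def cutset_bound(N, K, M):
    return max(s - s * M / (N // s) for s in range(1, min(N, K) + 1))

def man_rate(N, K, M):
    return K * (1 - M / N) / (1 + K * M / N)

N, K, M = 10, 10, 2
print(cutset_bound(N, K, M), man_rate(N, K, M))   # bound <= achievable rate
```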
ex-cc-ch01-13
Hard: Non-uniform demand and the end of MAN. Suppose demand is Zipf with $\alpha = 2$ (very concentrated). Show by example that popularity caching can achieve an expected rate strictly lower than the MAN worst-case rate for $N = 100$, $K = 100$, $M = 10$.
MAN is a worst-case result; popularity is average-case under a prior.
Compute numerically.
MAN worst-case
$R_{\text{MAN}} = \frac{K(1 - M/N)}{1 + KM/N} = \frac{100 \times 0.9}{11} \approx 8.18$.
Popularity expected
$H_{100,2} = \sum_{n=1}^{100} n^{-2} \approx 1.635$; $p_n = n^{-2}/H_{100,2}$; $P_{\text{miss}} = \sum_{n=11}^{100} p_n \approx 0.052$. Serving each distinct uncached requested file with one broadcast, $\mathbb{E}[R] = \sum_{n=11}^{100} \bigl(1 - (1 - p_n)^{100}\bigr) \approx 4.7$.
Conclusion
Under heavy Zipf, popularity's expected rate (4.7) beats MAN's worst-case rate (8.18). This is not a contradiction: MAN is worst-case optimal, but most demand vectors are not worst case. The right comparison is average-case MAN vs. popularity, which is subtle and chapter-13 material. The broader lesson: the "right" scheme depends on whether you want worst-case robustness (MAN) or average-case efficiency (popularity, or a decentralized hybrid).
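The two numbers can be reproduced as follows, counting one broadcast per distinct uncached requested file for the popularity scheme:

```python
# Heavy Zipf (alpha = 2), N = 100 files, K = 100 users, M = 10:
# MAN worst-case rate vs. expected rate of popularity caching, where each
# distinct uncached requested file is broadcast once.
N, K, M, alpha = 100, 100, 10, 2.0

R_man = K * (1 - M / N) / (1 + K * M / N)              # 90/11 ≈ 8.18

H = sum(n ** -alpha for n in range(1, N + 1))
p = [n ** -alpha / H for n in range(1, N + 1)]         # Zipf probabilities
E_R_pop = sum(1 - (1 - p[n]) ** K for n in range(M, N))
print(round(R_man, 2), round(E_R_pop, 1))              # ≈ 8.18 vs ≈ 4.7
```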
ex-cc-ch01-14
Challenge: Consider a shared-link network where users have independent caches but the library changes over time (new files arrive at rate $\lambda$ files per round; old files leave at the same rate). Model the effective popularity distribution as the steady state of this library evolution and analyze the tradeoff between placement refresh rate and expected delivery rate under popularity caching. Discuss what this implies for hybrid popularity/coded schemes in real CDNs.
Model library turnover as a birth–death process.
The cache must be updated when popular files are replaced.
Consider the tradeoff between placement refresh overhead and delivery savings.
Model turnover
Let $p_n$ be the steady-state popularity of file $n$ in a library that turns over at rate $\lambda$. If placement refreshes at rate $\mu$ and demand is i.i.d. $p$, the cache always contains the most popular $M$ files up to a lag of order $1/\mu$.
Expected rate
$\mathbb{E}[R] \approx K\bigl(P_{\text{miss}} + \epsilon(\lambda/\mu)\bigr)$, where $P_{\text{miss}} = \sum_{n > M} p_n$ and $\epsilon(\lambda/\mu)$ is the extra miss probability due to stale placement. The last term accounts for popular files that have changed but not yet been refreshed into caches.
Cost accounting
Placement traffic: $\mu M$ file-units per round. Delivery traffic: $K\bigl(P_{\text{miss}} + \epsilon(\lambda/\mu)\bigr)$ per round. Minimize the sum over $\mu$: the optimal refresh rate balances refresh overhead against staleness misses. Real CDNs run refresh windows of minutes to hours, matching this optimum for typical libraries.
Hybrid implications
For a coded-caching system, the refresh overhead is even higher — the subfiles must all be kept consistent. This is why production CDNs still use popularity caching: the operational complexity of coded refresh exceeds the delivery savings for high-churn libraries. The research question is: can we design a coded scheme whose subfiles are stable under library turnover? Chapter 20 (online coded caching) treats this.
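A toy numeric sketch of the refresh-rate tradeoff; the exponential staleness penalty $\epsilon$ and all parameter values below are modeling assumptions, not results from the text:

```python
# Toy refresh-rate tradeoff: per-round link cost = placement + delivery.
# The staleness penalty eps(lam/mu) is an ASSUMED exponential model, chosen
# only to make the tradeoff concrete; all parameter values are illustrative.
import math

def total_cost(mu, lam, K, M, p_miss):
    eps = p_miss * (1 - math.exp(-lam / mu))   # assumed staleness penalty
    return mu * M + K * (p_miss + eps)         # placement + delivery

K, M, lam, p_miss = 100, 10, 0.05, 0.05
best_mu = min((mu / 100 for mu in range(1, 200)),
              key=lambda mu: total_cost(mu, lam, K, M, p_miss))
print(best_mu)   # an interior optimum: refresh neither constantly nor never
```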
ex-cc-ch01-15
Challenge: A fundamental question. Prove or disprove: for any demand distribution $p$ (not just uniform), the expected delivery rate of any (possibly coded) scheme with per-user cache $M$ is bounded below by $c_p \cdot \frac{K(1 - M/N)}{1 + KM/N}$, where $c_p$ is a concentration correction specific to distribution $p$.
This is an open problem in coded caching — file it as a research direction.
The difficulty is that popularity structure can both help coded placement (by focusing caching on popular files) and hurt (by concentrating demand on a subset of the library).
State of the art
This question is open in full generality. For two-class demand (popular + unpopular), characterization was obtained by Hachem et al. 2017. For general Zipf demand, only order-optimal schemes are known (Zhang–Pedarsani–Ji 2015, Ji–Tulino–Llorca–Caire 2017). See Chapter 13 for the state of the art.
Why this is hard
Coded caching's gain comes from multicasting across users with different demands. Concentrated popularity reduces the diversity of demands, narrowing the opportunity for coded gain. But concentrated popularity also makes uncoded popularity caching better. The crossover is non-trivial and depends jointly on $(p, M, K, N)$.
Research direction
Open problem: characterize $c_p$ exactly. Progress has been made in the two extreme cases ($p$ uniform, $p$ a delta) and in scaling regimes. This is an excellent PhD project for a student who enjoys both combinatorics and optimization.