Ferkans — Interactive Telecom Tutor

ex-ch10-01

Easy

State the threat model for the Bonawitz secure-aggregation protocol. Specify the adversary, what it observes, and what it must not learn.

Solution

Adversary

The server (honest-but-curious) plus up to $T$ colluding users. They follow the protocol but pool all observed messages and inputs.

Observations

Server: all masked uploads plus every protocol message routed through it. Colluders: their own gradients, random coins, and received messages.

Must not learn

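Anything about the honest users' gradients $\{\mathbf{g}_k : k \notin \mathcal{U}\}$, where $\mathcal{U}$ is the set of colluding users, beyond what is implied by the aggregate $\mathbf{G} = \sum_k \mathbf{g}_k$ and the colluders' own inputs. Information-theoretic equality: $I(\{\mathbf{g}_k\}_{k \notin \mathcal{U}}; \text{view}) = I(\{\mathbf{g}_k\}_{k \notin \mathcal{U}}; \mathbf{G}, \{\mathbf{g}_k\}_{k \in \mathcal{U}})$.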

ex-ch10-02

Easy

Why do pairwise masks $\mathbf{r}_{ij}$ cancel in the server's aggregate?

Solution

Antisymmetry

Convention: $\mathbf{r}_{ji} = -\mathbf{r}_{ij}$. For each pair $(i, j)$, user $i$ adds $+\mathbf{r}_{ij}$ and user $j$ adds $-\mathbf{r}_{ij}$. The pair's contribution is zero.

Aggregate

Summing over all users: $\sum_k \tilde{\mathbf{g}}_k = \sum_k \mathbf{g}_k + \sum_k \sum_{j \neq k} \mathbf{r}_{kj} = \sum_k \mathbf{g}_k + \sum_{i<j} (\mathbf{r}_{ij} + \mathbf{r}_{ji}) = \mathbf{G}$. All pairs cancel. ✓
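The cancellation can be checked numerically. The following is a minimal sketch, assuming toy sizes and an illustrative prime modulus `Q` (neither comes from the exercise); `pair_mask`, `signed_mask`, and `masked_upload` are hypothetical helper names.

```python
import random

Q = 2**31 - 1   # illustrative prime modulus (assumption)
N, D = 5, 4     # number of users and gradient dimension (toy sizes)

rng = random.Random(0)
grads = [[rng.randrange(1000) for _ in range(D)] for _ in range(N)]

# One shared mask per unordered pair i < j; r_ji is defined as -r_ij.
pair_mask = {(i, j): [rng.randrange(Q) for _ in range(D)]
             for i in range(N) for j in range(i + 1, N)}

def signed_mask(k, j):
    """Return r_kj under the antisymmetry convention r_jk = -r_kj."""
    if k < j:
        return pair_mask[(k, j)]
    return [(-x) % Q for x in pair_mask[(j, k)]]

def masked_upload(k):
    """Compute g~_k = g_k + sum over j != k of r_kj (mod Q)."""
    out = list(grads[k])
    for j in range(N):
        if j != k:
            out = [(a + b) % Q for a, b in zip(out, signed_mask(k, j))]
    return out

uploads = [masked_upload(k) for k in range(N)]
aggregate = [sum(col) % Q for col in zip(*uploads)]   # what the server computes
true_sum = [sum(col) % Q for col in zip(*grads)]      # G, never seen per user
```

Here `aggregate` equals `true_sum` coordinate by coordinate, because every `pair_mask` entry is added once and subtracted once.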

ex-ch10-03

Easy

For the Bonawitz protocol with $n = 50$ users, compute the per-round number of pairwise mask exchanges and the per-user upload size for $d = 10^7$ at 32-bit precision.

Solution

Pairwise exchanges

$\binom{n}{2} = \binom{50}{2} = 1225$ pairs. Each requires one DH exchange.

Per-user upload

$d \cdot 32 = 3.2 \cdot 10^8$ bits $= 40$ MB per user.

Total per-round

Aggregate uplink: $50 \cdot 40$ MB $= 2$ GB. Plus 1225 DH messages of $\sim 100$ bits each (about 15 kB in total): negligible compared to the gradient uplink.
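The arithmetic above can be reproduced in a few lines. This is only a sketch; `dh_bits` takes the approximate message size quoted in the solution, and megabytes are decimal ($10^6$ bytes).

```python
from math import comb

n, d, bits = 50, 10**7, 32
dh_bits = 100                      # approximate DH message size from the solution

pairs = comb(n, 2)                 # number of pairwise exchanges
upload_bits = d * bits             # per-user gradient upload in bits
upload_mb = upload_bits / 8 / 1e6  # per-user upload in decimal megabytes
total_gb = n * upload_mb / 1e3     # aggregate uplink in decimal gigabytes
dh_total_kb = pairs * dh_bits / 8 / 1e3  # all DH messages in kilobytes
```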

ex-ch10-04

Easy

State the feasibility constraint for the Bonawitz dropout-resilient protocol with collusion threshold $T$ and dropout rate $\delta$ on $n$ users.

Solution

Constraint

$T + \delta n \leq n - 1$, equivalently $T + 1 + \delta n \leq n$. The Shamir threshold $t$ must satisfy $T + 1 \leq t \leq n - \delta n$.

Interpretation

The number of colluders ($T$) plus the tolerated dropouts ($\delta n$) cannot exceed $n - 1$: the $n - \delta n$ survivors must be able to supply $t$ shares, while any $T$ colluders must hold fewer than $t$. This is a structural feasibility limit, not adjustable by clever engineering.
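The role of the Shamir threshold $t$ can be made concrete with a toy $t$-out-of-$n$ sharing over a prime field. This is a sketch under assumptions, not the protocol's actual implementation: the modulus `P`, the parameter values, and the helpers `share` and `reconstruct` are all illustrative.

```python
import random

P = 2**61 - 1  # Mersenne prime used as the field modulus (assumption)

def share(secret, t, n, rng):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(t - 1)]
    def poly(x):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation of the sharing polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

rng = random.Random(1)
n, T, dropouts = 10, 3, 4          # toy values with T + dropouts <= n - 1
t = 5
assert T + 1 <= t <= n - dropouts  # feasibility window for t

shares = share(123456789, t, n, rng)
survivors = shares[dropouts:]      # the 6 users that did not drop
recovered = reconstruct(survivors[:t])
```

With these values the six survivors hold enough shares to recover the secret, while any $T = 3$ colluders hold fewer than $t$ shares and cannot interpolate it.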

ex-ch10-05

Medium

Why is each user's upload $\tilde{\mathbf{g}}_k$ uniform over $\mathbb{F}_q^d$ to the server, given that pairwise seeds are independent?

Hint

Each upload is a sum of one gradient and $n-1$ independent uniform random vectors.

Solution

Sum of uniform random variables

$\tilde{\mathbf{g}}_k = \mathbf{g}_k + \sum_{j \neq k} \pm \mathbf{r}_{kj}$. The server has no access to any $\mathbf{r}_{kj}$ (those are shared between $k$ and $j$ only). The sum of $n - 1$ independent uniform random vectors over $\mathbb{F}_q^d$ is itself uniform.

Conclusion

Adding $\mathbf{g}_k$ to a uniform random vector gives a uniform random vector. Hence $\tilde{\mathbf{g}}_k$ looks uniform to the server, revealing nothing about $\mathbf{g}_k$ in the marginal distribution.

Joint constraint

The joint distribution of all uploads is constrained by $\sum_k \tilde{\mathbf{g}}_k = \mathbf{G}$: the aggregate is the only piece of information that leaks.
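For a small field the marginal uniformity can be verified by brute force. The sketch below assumes scalar gradients over $\mathbb{Z}_5$ and a user with two pairwise masks; `upload_distribution` is a hypothetical helper that enumerates every mask choice.

```python
from collections import Counter
from itertools import product

Q = 5          # tiny field so that all mask choices can be enumerated
N_MASKS = 2    # user k has n - 1 = 2 pairwise masks (n = 3)

def upload_distribution(g):
    """Count the values of g + r_1 + r_2 (mod Q) over all mask choices."""
    counts = Counter((g + sum(r)) % Q
                     for r in product(range(Q), repeat=N_MASKS))
    return dict(counts)

# One distribution per possible gradient value g.
dists = {g: upload_distribution(g) for g in range(Q)}
```

Every entry of `dists` is the same flat histogram, so the marginal distribution of a single upload carries no information about `g`.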

ex-ch10-06

Medium

Explain why the Bonawitz protocol needs both pairwise seeds and self-mask seeds (for dropout handling).

Hint

Consider what happens if a user does not drop and the server reconstructs its pairwise seeds.

Solution

Without self-masks

Suppose user $k$ has not actually dropped (its upload is merely delayed), but the server treats it as dropped and reconstructs all of $k$'s pairwise seeds in order to cancel $k$'s masks from the other users' uploads. The server can then recompute every $\mathbf{r}_{kj}$. When $k$'s upload arrives, subtracting $\sum_{j \neq k} \mathbf{r}_{kj}$ from $\tilde{\mathbf{g}}_k$ reveals $\mathbf{g}_k$ exactly.

With self-masks

Each user adds an extra self-mask $\mathbf{m}_k$, expanded from a seed $b_k$ that is Shamir-shared like the pairwise seeds. Even if all of $k$'s pairwise seeds are reconstructed, $\mathbf{m}_k$ remains private (as long as fewer than $t$ shares of $b_k$ are revealed). Hence $\mathbf{g}_k$ stays masked.

Symmetry

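For each user the server reconstructs exactly one of the two secrets. If $k$ survives, the other surviving users provide their shares of $b_k$ so the server can remove $\mathbf{m}_k$ from the aggregate; $k$'s pairwise masks stay secret and still hide $\mathbf{g}_k$. If $k$ drops, they provide shares of $k$'s pairwise seeds instead, and $\mathbf{m}_k$ remains private. Honest users never release both kinds of share for the same $k$.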
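A toy illustration of the failure mode that the self-mask prevents, assuming scalar gradients, a hypothetical modulus `Q`, and pairwise masks already carrying their signs. If the server holds all of user $k$'s pairwise masks together with $k$'s upload, a single-masked upload unmasks exactly, while a self-masked one does not.

```python
import random

Q = 2**31 - 1                      # illustrative modulus (assumption)
rng = random.Random(7)

n, k = 4, 0                        # user k is slow but has not dropped
g_k = 4242                         # scalar stand-in for the gradient
pair_masks = {j: rng.randrange(Q) for j in range(n) if j != k}  # signed r_kj
m_k = rng.randrange(Q)             # self-mask

sum_pair = sum(pair_masks.values()) % Q

# Single masking: upload = g_k + sum_j r_kj
upload_single = (g_k + sum_pair) % Q
# Double masking: upload = g_k + m_k + sum_j r_kj
upload_double = (g_k + m_k + sum_pair) % Q

# The server, believing k dropped, has reconstructed every r_kj.
leak_single = (upload_single - sum_pair) % Q   # equals g_k
leak_double = (upload_double - sum_pair) % Q   # equals g_k + m_k, still masked
```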

ex-ch10-07

Medium

For $n = 200$ users, $T = 30$ collusion threshold, and $\delta = 0.25$ dropout rate, find a valid Shamir threshold $t$.

Hint

$T + 1 \leq t \leq n - \delta n$.

Solution

Compute bounds

Lower: $T + 1 = 31$. Upper: $n - \delta n = 200 - 50 = 150$.

Choose $t$

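Any $t \in [31, 150]$ is valid. Production choice: $t = \lceil 2n/3 \rceil = 134$. This leaves margin on both sides: reconstruction needs $134$ shares, far more than the $T = 30$ colluders of the design assumption, and up to $n - t = 66$ dropouts are tolerated instead of the nominal $50$.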
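The bound computation generalizes to other parameter choices. A minimal sketch follows; `shamir_threshold_range` is a hypothetical helper, and rounding the number of tolerated dropouts up is an assumption for the case where $\delta n$ is not an integer.

```python
from math import ceil

def shamir_threshold_range(n, T, delta):
    """Return (lower, upper) with T + 1 <= t <= n - delta*n, or None if infeasible."""
    dropouts = ceil(delta * n)       # round tolerated dropouts up (assumption)
    lower, upper = T + 1, n - dropouts
    return (lower, upper) if lower <= upper else None

bounds = shamir_threshold_range(200, 30, 0.25)   # the exercise's parameters
t_two_thirds = ceil(2 * 200 / 3)                 # the 2n/3 choice
infeasible = shamir_threshold_range(10, 8, 0.5)  # T + delta*n exceeds n - 1
```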

ex-ch10-08

Medium

State the Caire et al. (2022) optimality theorem precisely and identify its scheme class.

Hint

Within the uncoded-groupwise-key class.

Solution

Statement

For any secure-aggregation scheme with uncoded groupwise keys, tolerating $T$ colluders and $\delta n$ dropouts, the per-round communication $R$ satisfies $R \geq nd + \binom{n - \delta n}{2} / (n - \delta n - T)$. For $\delta = 0$ and $T = O(1)$: $R \geq nd + \Omega(n^2)$.

Scheme class

Uncoded groupwise keys: each user's upload is a linear combination of subsetwise random keys (no coding of gradients). Pairwise masking (Bonawitz) is the canonical instance.

Achievability

Bonawitz's pairwise-masking protocol achieves the bound. Within the uncoded class, the $\Theta(n^2)$ key overhead is tight.

ex-ch10-09

Medium

Why does the Caire et al. (2022) optimality result not apply to CCESA (Chapter 12)?

Hint

CCESA uses sparse random graphs, not complete pairwise.

Solution

CCESA uses sparse random graph

CCESA's mask structure is a sparse random graph (Erdős–Rényi), not a complete pairwise graph. The masking still works because the pairwise cancellations sum out probabilistically, but the construction is coded: the masks are chosen to cancel under specific subset conditions.

Outside the class

The Caire et al. theorem is restricted to protocols where keys are unmodified (uncoded). CCESA's mask structure is coded via the random graph topology, putting it outside the class.

Achieved cost

CCESA achieves $O(n\sqrt{n/\log n})$ — strictly below the Caire et al. lower bound for the uncoded class. The CCESA paper (Chapter 12) provides the matching converse for the sparse-graph class.

ex-ch10-10

Medium

Compose the Bonawitz protocol with $b$-bit gradient quantization (Chapter 9 §9.3). What is the per-round communication cost?

Solution

Composition

Each user quantizes its gradient to $b$ bits per scalar before applying pairwise masks. The masks are over the same field, so the quantization is compatible with the protocol.

Per-round cost

Per-user upload: $b \cdot d$ bits (gradient) + $O(n)$ bits (DH messages). Aggregate: $n \cdot b \cdot d + O(n^2)$ bits.

Comparison

At $b = 8$ vs. $b = 32$: $4\times$ savings on the gradient component. The $O(n^2)$ key overhead is unchanged. For $n = 100$, $d = 10^7$, $b = 8$: $8 \cdot 10^9 + 10^4 \approx 8 \cdot 10^9$ bits, so the gradient term dominates. Quantization helps when $nd \gg n^2$, i.e. $d \gg n$.
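The comparison can be tabulated with a short helper. This is a sketch: `round_cost_bits` uses a constant of one bit per pair for the $O(n^2)$ term, which is an assumption chosen to reproduce the $10^4$ figure above. The second helper, `field_bits`, goes slightly beyond the solution: it computes how many bits per scalar the modulus needs if the sum of $n$ quantized values must not wrap around, a refinement that implementations typically account for.

```python
from math import ceil, log2

def round_cost_bits(n, d, b, key_bits_per_pair=1):
    """Aggregate uplink n*b*d plus an n^2 key term (the constant is an assumption)."""
    gradient = n * b * d
    keys = key_bits_per_pair * n * n
    return gradient, keys

def field_bits(n, b):
    """Bits per scalar so that a sum of n b-bit values never wraps around."""
    return ceil(log2(n * (2**b - 1) + 1))

grad8, keys = round_cost_bits(100, 10**7, 8)
grad32, _ = round_cost_bits(100, 10**7, 32)
savings = grad32 / grad8             # gradient-component savings factor
bits_needed = field_bits(100, 8)     # modulus width for a wrap-free sum
```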

ex-ch10-11

Hard

Prove (sketch) that Bonawitz's pairwise-masking protocol satisfies the privacy guarantee $I (\mathbf{g}_k; \text{server view}) = I (\mathbf{g}_k; \mathbf{G})$ in the ideal information-theoretic model (uniform random seeds).

Hint

Use the joint distribution argument.

Solution

Setup

Server sees $\{\tilde{\mathbf{g}}_k\}_{k=1}^n$ with $\tilde{\mathbf{g}}_k = \mathbf{g}_k + \sum_{j \neq k} \pm \mathbf{r}_{kj}$. Pairwise seeds $\{\mathbf{r}_{ij}\}$ are uniform random over $\mathbb{F}_q^d$, independent.

Conditional distribution

Given the aggregate constraint $\sum_k \tilde{\mathbf{g}}_k = \mathbf{G}$, the joint distribution of $(\tilde{\mathbf{g}}_1, \ldots, \tilde{\mathbf{g}}_n)$ is uniform over $\{(\mathbf{x}_1, \ldots, \mathbf{x}_n) : \sum_k \mathbf{x}_k = \mathbf{G}\}$.

Information-theoretic equality

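Conditional on $\mathbf{G}$, the server's view is uniform on the constraint set, independent of which specific $(\mathbf{g}_1, \ldots, \mathbf{g}_n)$ summed to $\mathbf{G}$; hence $I(\mathbf{g}_k; \text{view} \mid \mathbf{G}) = 0$. Since $\mathbf{G}$ is a function of the view, the chain rule gives $I(\mathbf{g}_k; \text{view}) = I(\mathbf{g}_k; \mathbf{G}) + I(\mathbf{g}_k; \text{view} \mid \mathbf{G}) = I(\mathbf{g}_k; \mathbf{G})$. The view adds nothing beyond the aggregate. $\blacksquare$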
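The conditional-uniformity claim can be confirmed exhaustively for a tiny instance. The sketch below assumes $n = 3$ users with scalar gradients over $\mathbb{Z}_3$ and enumerates all 27 mask triples; the names `view_distribution` and `by_aggregate` are illustrative.

```python
from collections import Counter
from itertools import product

Q = 3  # tiny field: 3 users, scalar gradients, all 27 mask triples enumerated

def view_distribution(g):
    """Count each upload triple the server can see, over all pairwise masks."""
    g0, g1, g2 = g
    views = Counter()
    for r01, r02, r12 in product(range(Q), repeat=3):
        x0 = (g0 + r01 + r02) % Q    # user 0 adds +r01, +r02
        x1 = (g1 - r01 + r12) % Q    # user 1 adds -r01, +r12
        x2 = (g2 - r02 - r12) % Q    # user 2 adds -r02, -r12
        views[(x0, x1, x2)] += 1
    return views

# Group the view distributions by the aggregate G they correspond to.
by_aggregate = {}
for g in product(range(Q), repeat=3):
    by_aggregate.setdefault(sum(g) % Q, []).append(view_distribution(g))

# Same G implies the same view distribution, whatever the individual inputs.
same_view_given_G = all(
    all(v == dists[0] for v in dists) for dists in by_aggregate.values()
)
# Each distribution is flat on the Q^2 triples that sum to G.
uniform_on_constraint = all(
    set(d.values()) == {Q} and len(d) == Q**2
    and all(sum(x) % Q == G for x in d)
    for G, dists in by_aggregate.items() for d in dists
)
```

Both flags evaluate to true: gradient tuples with the same aggregate induce identical view distributions, and each distribution is flat on the constraint set.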

ex-ch10-12

Hard

Sketch the proof of Caire et al.'s optimality theorem for $T = 0$ (no collusion) and $\delta = 0$ (no dropouts). What is the lower bound and how does it arise?

Hint

Cut-set / DOF argument over the key-sharing structure.

Solution

Cut-set setup

Consider the cut between any single user and the rest of the system. The user's upload must contribute its gradient to the aggregate, but the aggregate must reveal nothing else.

Key-counting

The server's view consists of $n$ uploads. For privacy, each user's upload must be masked by sufficient unique keys (in the uncoded class, this means subsetwise keys). Counting the constraints gives at least $\binom{n}{2}$ distinct keys for $T = 0, \delta = 0$ .

Aggregate cost

$\binom{n}{2} = \Theta(n^2)$ keys, each contributing $\Omega(1)$ bits to the protocol's communication. Total: $\Omega(n^2)$ key-related cost, plus $nd$ gradient cost, giving the bound $R \geq nd + \Omega(n^2)$. Bonawitz matches this scaling. $\blacksquare$

ex-ch10-13

Hard

Discuss the relationship between Bonawitz's secure-aggregation protocol and Maddah-Ali / Niesen coded caching (Chapter 4 §4.3). Identify the structural parallels.

Hint

Both involve subset-keyed broadcasts/uploads with cancellation properties.

Solution

Parallel

Both protocols use subset-indexed keys: in coded caching, each subset of users has a common cached chunk; in secure aggregation, each pair (or subset) shares a common mask. Both use the cancellation / constraint structure of the keys to achieve the protocol's goal (deliver content / aggregate gradient).

Difference

Coded caching delivers requests across subsets (one broadcast satisfies many users); secure aggregation conceals individual contributions (sum across subsets cancels masks). The algebraic mechanics are dual: caching uses XOR-aligned broadcasts; SecAgg uses subset-cancelled masks.

Analytical tools

Both fields use similar cut-set / interference-alignment (IA) arguments. The Caire et al. optimality proof (§10.4) reuses the Maddah-Ali / Niesen cut-set machinery. This algebraic kinship is one of the unifying themes of Part II.

ex-ch10-14

Hard

Compose Bonawitz's secure aggregation with $\epsilon$-differential privacy (DP). What is the resulting privacy guarantee?

Solution

Composition

Each user adds Gaussian noise $\mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I})$ to its gradient before applying Bonawitz masks. The server receives masked noisy gradients; because the $n$ noise terms are independent, aggregating gives $\mathbf{G} + \mathcal{N}(\mathbf{0}, n \sigma^2 \mathbf{I})$, a noisy aggregate.

Privacy guarantee

Information-theoretic (from Bonawitz): server learns only the (noisy) aggregate, not individual gradients.
Differential (from DP noise): the noisy aggregate satisfies $(\epsilon, \delta_{\mathrm{DP}})$-DP (Gaussian noise yields approximate rather than pure $\epsilon$-DP), with $\epsilon$ depending on $\sigma$, $n$, and the $L_2$ sensitivity of gradients.

Combined strength

Stronger than either alone: even if the information-theoretic guarantee were broken (e.g., by a powerful adversary), the DP noise bounds the aggregate-level leakage. Production FL deployments commonly use this composition.
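One way to size the per-user noise is to calibrate the aggregate noise with the classical Gaussian-mechanism bound and split it evenly across users. The sketch below rests on several assumptions: the calibration $\sigma \geq \sqrt{2 \ln(1.25/\delta_{\mathrm{DP}})}\, \Delta / \epsilon$ (valid for $0 < \epsilon < 1$), all $n$ users contributing noise with no dropouts, and no discretization to the finite field. The function names are hypothetical.

```python
from math import log, sqrt

def gaussian_sigma(eps, delta_dp, sensitivity):
    """Classical Gaussian-mechanism calibration (valid for 0 < eps < 1)."""
    return sqrt(2 * log(1.25 / delta_dp)) * sensitivity / eps

def per_user_sigma(eps, delta_dp, sensitivity, n):
    """Split the aggregate noise across n users: n * sigma_user^2 = sigma_total^2."""
    return gaussian_sigma(eps, delta_dp, sensitivity) / sqrt(n)

# Example parameters (illustrative only): L2 sensitivity 1.0, 100 users.
sigma_total = gaussian_sigma(0.5, 1e-5, 1.0)
sigma_user = per_user_sigma(0.5, 1e-5, 1.0, 100)
```

Because independent Gaussian variances add, each of the 100 users needs only one tenth of the aggregate standard deviation.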

ex-ch10-15

Challenge

Open problem. Bonawitz's protocol is information-theoretically optimal within uncoded groupwise keys (§10.4). CCESA (Chapter 12) achieves $O(n\sqrt{n/\log n})$ via sparse random graphs, outside this class. Is there a deterministic sparse-key construction that achieves $o(n^2)$ communication while preserving Bonawitz-style information-theoretic privacy?

Hint

Combinatorial designs (Steiner systems, expander graphs) might help.

Solution

Status

The CCESA construction (random sparse graph) gives the privacy guarantee with high probability — not deterministically. The natural question is whether deterministic sparse constructions can match this.

Candidate

Steiner systems, expander graphs, and other combinatorial structures have the right "any subset of $T$ users misses some key" property. Constructing such graphs with appropriate density would give a deterministic CCESA-like protocol.

Status (open)

No known explicit construction matches CCESA's $O(n\sqrt{n/\log n})$ bound deterministically. Some partial results exist (Kadhe et al. 2020 FastSecAgg) but with weaker scaling. The full characterization is open: a research-grade problem at the intersection of combinatorics and cryptography.
