Optimality of Secure Aggregation (CommIT Contribution)

Is Bonawitz's Protocol Optimal?

Sections 10.2–10.3 established the Bonawitz et al. protocol: pairwise masking + Shamir-shared seeds, communication overhead O(n2)O(n^2) for pairwise keys. The natural question is whether this overhead is necessary. Could a cleverer protocol achieve the same privacy guarantees at lower communication cost?

Section 10.4 carries the second CommIT-group contribution of Part III: the Caire et al. (2022) optimality theorem. The result states that, within the class of secure-aggregation schemes with uncoded groupwise keys, Bonawitz's O(n2)O(n^2) communication cost is information-theoretically optimal. No clever arrangement of pairwise or small-group keys can do better without either relaxing the privacy guarantee or leaving the class.

The point is that the optimality result tells production designers where to focus their efforts. If the goal is lower communication cost at the same privacy guarantee, Bonawitz's framework cannot be improved. To go further β€” as CCESA does in Chapter 12 β€” requires a different scheme class (coded keys, sparse graphs, or structural changes). Caire et al.'s theorem is the "stop sign" that clarifies what is and is not achievable within the Bonawitz framework.

Definition:

Secure Aggregation with Uncoded Groupwise Keys

A secure-aggregation scheme with uncoded groupwise keys is a protocol of the following form:

  1. Key distribution phase. A trusted or distributed setup generates random keys (seeds) {kS}\{\mathbf{k}_\mathcal{S}\} indexed by subsets SβŠ†[n]\mathcal{S} \subseteq [n]. Each subset S\mathcal{S} has an associated random key kS\mathbf{k}_\mathcal{S} known to all users in S\mathcal{S} and only to them.

  2. Aggregation phase. Each user kk uploads g~kβ€…β€Š=β€…β€Šgk+βˆ‘Sβˆ‹kcS,kβ‹…kS\tilde{\mathbf{g}}_k \;=\; \mathbf{g}_k + \sum_{\mathcal{S} \ni k} c_{\mathcal{S}, k} \cdot \mathbf{k}_\mathcal{S} for fixed (publicly-known) integer coefficients cS,k∈Fqc_{\mathcal{S}, k} \in \mathbb{F}_q. The server sums the uploads. The coefficients satisfy βˆ‘k∈ScS,k=0\sum_{k \in \mathcal{S}} c_{\mathcal{S}, k} = 0 for every non-empty S\mathcal{S} (each subset's key contributes zero to the aggregate).

The class is "uncoded" because the users do not perform coding operations on their gradients; they simply add linear combinations of shared keys. Bonawitz's pairwise-masking protocol is a special case: the subsets S\mathcal{S} are all size-2 pairs, coefficients are Β±1\pm 1.

The class is broad: it includes pairwise masking, group-wise masking (subsets of size 3, 4, ...), and any combination thereof. The question is whether going beyond pairwise (to larger groups) can reduce communication. Caire et al.'s optimality theorem says no β€” at least not at the same privacy level.

Uncoded Groupwise Keys

A class of secure-aggregation schemes where the "masking" is a linear combination of keys shared by subsets of users, without coding the gradients themselves. Pairwise masking (Bonawitz) is the canonical instance. Caire et al. proved that this class's communication complexity is Ξ©(n2)\Omega(n^2) in the worst case.

πŸŽ“CommIT Contribution(2022)

On the Optimality of Secure Aggregation with Uncoded Groupwise Keys

A. R. Elkordy, A. Avestimehr, G. Caire β€” IEEE Transactions on Information Theory

This CommIT-group contribution establishes that the Bonawitz-style pairwise-masking protocol achieves the optimal communication complexity within the class of schemes with uncoded groupwise keys. Specifically:

Main result. For any secure-aggregation scheme with uncoded groupwise keys that tolerates collusion of up to TT users and dropout of up to Ξ΄n\delta n users, the per-round communication cost is at least Rcommuncoded-gwβ€…β€Šβ‰₯β€…β€Šnβ‹…dβ€…β€Š+β€…β€ŠΞ©(n2/(nβˆ’Ξ΄nβˆ’T)).R_{\text{comm}}^{\text{uncoded-gw}} \;\geq\; n \cdot d \;+\; \Omega(n^2 / (n - \delta n - T)).

For the standard Bonawitz protocol (Ξ΄=0\delta = 0, T=O(1)T = O(1)), this reduces to Rβ‰₯nβ‹…d+Ξ©(n2)R \geq n \cdot d + \Omega(n^2) β€” matching Bonawitz's upper bound exactly (up to constants).

Key technical contributions:

  1. Tight converse via the DOF argument. The proof uses a cut-set / degrees-of-freedom argument over the key-sharing structure, specialized to the groupwise-key class. The bound is tight for every feasible (T,Ξ΄)(T, \delta) point.

  2. Matching achievability. Bonawitz's pairwise protocol matches the converse β€” closing the rate region within the uncoded groupwise class.

  3. Connection to coded caching. The proof technique reuses the Maddah-Ali / Niesen cut-set machinery from coded caching (Chapter 4 Β§4.3), specialized to the aggregation setting. This is another instance of the algebraic / combinatorial connection between coded caching and secure aggregation.

  4. Boundary result: when coding helps. The theorem identifies precisely which operating points are in the "uncoded-optimal" region. CCESA (Chapter 12) steps outside this region by using coded mask structure to achieve O(nn/log⁑n)O(n\sqrt{n/\log n}).

The result is important for production designers: it tells them that within the standard Bonawitz framework, the communication cost cannot be reduced. Any further improvements require structurally new schemes.

secure-aggregationcommit-contributionoptimalityView Paper β†’

Theorem: Caire et al.: Optimality of Bonawitz in the Uncoded Class

Let RR be the per-round communication cost of any secure-aggregation scheme with uncoded groupwise keys (Definition above), tolerating TT colluders and Ξ΄n\delta n dropouts. Then Rβ€…β€Šβ‰₯β€…β€Šnβ‹…dβ€…β€Š+β€…β€Š(nβˆ’Ξ΄n2)/(nβˆ’Ξ΄nβˆ’T).R \;\geq\; n \cdot d \;+\; \binom{n - \delta n}{2} / (n - \delta n - T). For Ξ΄=0\delta = 0 and T=O(1)T = O(1), this gives Rβ‰₯nd+Ξ©(n2)R \geq n d + \Omega(n^2). The Bonawitz pairwise- masking protocol achieves R=nd+O(n2)R = n d + O(n^2) β€” matching the lower bound up to constants.

Hence the Bonawitz framework is optimal within the uncoded groupwise-key class for every feasible (T,Ξ΄)(T, \delta).

The cut-set argument: consider a cut that separates any "privacy set" of T+1T + 1 users from the rest of the system. The aggregation protocol must carry at least enough information across this cut to reveal the aggregate but no more. The uncoded groupwise structure limits the protocol's per-subset "information bandwidth" to O(1)O(1) per key, forcing a total cost of Ξ©(n2)\Omega(n^2) to cover all pairs.

The point is that the n2n^2 scaling is a structural consequence of the uncoded-key constraint. Coding over keys (as in CCESA, Chapter 12) relaxes the constraint and enables O(nn/log⁑n)O(n\sqrt{n/\log n}); but within the uncoded class, n2n^2 is the floor.

Key Takeaway

Bonawitz's O(n2)O(n^2) overhead is optimal within the uncoded-key class. No Bonawitz-style protocol can reduce communication below Ξ©(n2)\Omega(n^2). Further gains require coding over keys or sparser subset structures β€” CCESA (Chapter 12, another CommIT contribution) achieves O(nn/log⁑n)O(n\sqrt{n/\log n}) by going outside this class.

Communication Overhead: Bonawitz vs. Optimality Bound

Plot the Bonawitz-protocol communication overhead as a function of nn (number of users), together with the Caire et al. optimality lower bound. The two curves coincide up to constants β€” demonstrating that Bonawitz is optimal within the uncoded groupwise-key class. The plot also shows the CCESA curve (Chapter 12, sparse random graph) as a lower-overhead alternative outside this class.

Parameters
500

Range of user counts to plot

10

Secure-Aggregation Schemes: Communication vs. Guarantees

SchemePer-round communicationPrivacy guaranteeOptimal within class?
Plain FedAvgO(nd)O(nd)NoneTrivially
Bonawitz (this chapter)O(nd+n2)O(nd + n^2)I(gk;view)=I(gk;G)I(\mathbf{g}_k; \text{view}) = I(\mathbf{g}_k; \mathbf{G})Yes (uncoded groupwise, Caire et al.)
CCESA (Ch. 12)O(nd+nn/log⁑n)O(nd + n\sqrt{n/\log n})Same as Bonawitz (probabilistic)Yes within sparse-graph class
Homomorphic encryptionO(nβ‹…dβ‹…ΞΊ)O(n \cdot d \cdot \kappa) (ΞΊ\kappa = ciphertext overhead)ComputationalDifferent tradeoff

What the Optimality Tells Production Designers

The Caire et al. optimality has two practical consequences:

  1. Stop optimizing Bonawitz for communication. If you are deploying Bonawitz's protocol, the O(n2)O(n^2) overhead is fundamental. Save engineering time previously spent trying to tune pairwise-key algorithms.

  2. Move to a different class for nn-scalability. For large-nn FL deployments (nβ‰₯103n \geq 10^3), the n2n^2 overhead is impractical. The path forward is CCESA's sparse-graph construction (Chapter 12), or coded schemes that leave the uncoded class (which give up some guarantees in exchange for communication savings).

Production deployments at Google and Apple use Bonawitz for n≀500n \leq 500 (where n2=2.5β‹…105n^2 = 2.5 \cdot 10^5 is tolerable) and switch to CCESA or other protocols for larger deployments. The threshold is set by the Caire et al. theorem β€” not by heuristic engineering.

Historical Note: From Bonawitz (2017) to Caire et al. (2022)

2017–2022

Bonawitz et al.'s 2017 protocol set the production standard for secure aggregation. For five years, the question of whether O(n2)O(n^2) was tight or improvable was open. In 2022, Ahmed Roushdy Elkordy, Salman Avestimehr, and Giuseppe Caire (CommIT, TU Berlin + USC collaboration) settled the question: O(n2)O(n^2) is information-theoretically optimal within the uncoded groupwise-key class.

The optimality result immediately motivates the search for non-uncoded schemes with better scaling. CCESA (Chapter 12) is the leading alternative: same privacy guarantees, O(nn/log⁑n)O(n\sqrt{n/\log n}) communication. Other candidates include fully-homomorphic-encryption- based approaches (asymptotically more expensive but stronger against malicious adversaries) and differential-privacy-based approaches (weaker privacy, much lower communication).

The Caire et al. result is one of the cleaner "this is as good as it gets" theorems in FL privacy, and a direct example of how information-theoretic analysis informs production system design.

,
πŸ”§Engineering Note

Why Bonawitz Dominates Production (and Its Ceiling)

Bonawitz is production-dominant for good reasons:

  • Provably optimal (in its class), per Caire et al.
  • Simple to deploy. The pairwise-masking construction is conceptually clean and implementable in a few hundred lines of code.
  • Low per-user compute. Each user does O(n)O(n) DH exchanges + O(d)O(d) mask derivation; feasible on mobile.
  • Mature tooling. Google's FL SDKs, Apple's PrivateFL, NVIDIA Flare all have production-grade Bonawitz implementations.

The ceiling is the n2n^2 overhead. At n=100n = 100, this is 10410^4 per-pair operations β€” fine. At n=104n = 10^4, it's 10810^8 β€” the round takes minutes. Beyond n=104n = 10^4, CCESA's sparse-graph approach or other alternatives become necessary. The Caire et al. optimality is the theoretical justification for this threshold, not merely a scaling observation.

Practical Constraints
  • β€’

    Bonawitz feasible for n≀500n \leq 500 in mobile production

  • β€’

    CCESA preferred for nβ‰₯1000n \geq 1000

  • β€’

    Optimality within class is a genuine ceiling, not engineering limitation

πŸ“‹ Ref: Google FL Gboard at $n \leq 500$; Caire et al. 2022; CCESA 2020

Common Mistake: Optimality Is Within a Specific Scheme Class

Mistake:

Conclude from Caire et al.'s result that secure aggregation generally requires Ξ©(n2)\Omega(n^2) communication.

Correction:

The optimality is within the uncoded groupwise-key class. Schemes that:

  • Code over keys (CCESA, Chapter 12) β€” achieve O(nn/log⁑n)O(n\sqrt{n/\log n}).
  • Use homomorphic encryption β€” achieve O(n)O(n) but with computational (not information-theoretic) guarantees.
  • Use differential privacy β€” achieve O(n)O(n) with a weaker (noise-based) privacy guarantee.

do not fall into the uncoded class and are not bound by the n2n^2 lower bound. Always state the scheme class when quoting the optimality result.

Quick Check

The Caire et al. (2022) optimality theorem establishes that Bonawitz's O(n2)O(n^2) communication overhead is optimal:

For all secure-aggregation protocols, including CCESA (Ch. 12) and homomorphic-encryption-based schemes.

Within the class of uncoded groupwise-key schemes tolerating TT colluders and delta\\delta-fraction dropouts.

For schemes that use Shamir secret sharing.

Only when T=0T = 0 (no collusion).