Nomographic Functions and Pre-Processing
Beyond the Sum
Section §16.2 computed . Many federated learning, distributed sensing, and consensus tasks want other aggregates: the arithmetic mean, the geometric mean, the max, the empirical variance. At first glance these are different problems — the wireless MAC only adds. Which functions can AirComp compute in one channel use?
The answer is all nomographic functions: those admitting the decomposition with pre-processing per user and post-processing at the receiver. The receiver gets a sum, then un-does the pre-processing. Kolmogorov-Arnold and Sprecher established that every continuous function of variables admits a nomographic representation with universal — so the AirComp class is, in principle, all continuous aggregates. The practical question is whether , are nice (e.g., Lipschitz, monotone) and whether the noise is amplified by . This section builds the catalog.
Definition: Nomographic Function
Nomographic Function
A function is nomographic if there exist pre-processing maps and a post-processing map such that
The wireless channel computes the inner sum via MAC superposition. Each user transmits (possibly with different transmit powers across users because has different dynamic range), the receiver recovers (via the analysis of §16.2), and applies .
Nomographic functions are AirComp's native computation class.
Example: A Catalog of Nomographic Aggregates
Express each of the following aggregates in nomographic form, identifying and .
(a) Arithmetic mean:
(b) Weighted mean: with known weights
(c) Geometric mean: for
(d) Empirical second moment:
(e) Maximum (approximate):
(a) Arithmetic mean
, . The receiver divides by — trivial.
(b) Weighted mean
, . Users pre-multiply by their weight; the MAC adds the weighted terms.
(c) Geometric mean
, . Users transmit ; MAC accumulates ; receiver exponentiates and divides exponent by .
(d) Second moment
, . Users send their squared values scaled by .
(e) Approximate maximum
For large , (smooth-max approximation). , . Higher gives a sharper max at the cost of numerical dynamic range — amplifies outliers.
Operational interpretation
AirComp's native aggregates cover all moments, weighted sums, and products (via logarithm). Combined with post-processing, this spans almost every FL/sensing aggregate. The non-nomographic operations — e.g., sorting, median — require multiple channel uses.
Theorem: Kolmogorov–Arnold Representation
Every continuous function admits a nomographic representation: there exist continuous functions and , and constants , such that In particular, decomposes as a finite sum of nomographic terms, each of which can be computed over the MAC in one channel use. The whole computation takes at most channel uses.
Kolmogorov 1957
Kolmogorov's original superposition theorem showed existence of the representation; the can be chosen universal (independent of ) and Hölder-continuous.
Sprecher's constructive form
Sprecher 1965 made the representation explicit: the universal functions are piecewise linear, computable, and Lipschitz; the absorb the specific .
Consequence for AirComp
Every continuous aggregate can be computed in AirComp rounds. For nomographic aggregates (one sum), a single round suffices. The theorem guarantees universal pre-processing — no need to re-design per function .
Practical caveat
Kolmogorov's are Hölder-continuous but not smooth. They amplify perturbations non-uniformly. Practical AirComp systems use smooth tailored to specific aggregates (mean, log, power) — trading universality for noise tolerance.
Theorem: Post-Processing Noise Amplification
Let the AirComp receiver estimate with MSE (Theorem 16.2.1), and let the final estimate be . Assume is differentiable at the true with derivative . For small noise (high SNR), the first-order MSE on is
Linearization of $\psi$
Write with . Taylor-expand: .
MSE
. The first-order term dominates for small .
Operational interpretation
Nomographic aggregates with steep amplify noise: the geometric mean uses ; its derivative at the true value is . For small source values, this derivative is small — noise is attenuated. For large values, the derivative is large — noise is amplified. Pre-processing choice matters.
Effective MSE for Common Nomographic Aggregates
Compare the end-to-end AirComp MSE for three nomographic aggregates: the arithmetic mean (), the geometric mean ( at the true value ), and the second moment (). Sweep transmit power; observe how the choice of pre/post-processing shapes the MSE.
Parameters
Designing in Practice
Practical nomographic AirComp design:
-
Match dynamic range. If varies by orders of magnitude across users, the weak-user bottleneck (§16.2) worsens. Center and scale to approximately match across users before applying power control.
-
Log pre-processing for multiplicative aggregates. Geometric mean, product, likelihood aggregation all use . Numerical care: clamp to avoid .
-
Choose to be Lipschitz at operating point. Smooth-max (, ) has — large when is small and is large. The approximation sharpens the max but amplifies noise. Balance empirically.
-
Universality vs. tailoring. Kolmogorov's universal works for every but is only Hölder-continuous — noise amplification is uncontrolled. Bespoke for specific aggregates (FL gradient averaging, for example) is nearly always better in practice.
-
Dither for debiasing. For non-linear , AirComp is biased (Jensen's gap). Small dither at the transmitters can reduce this bias at modest MSE cost.
- •
Dynamic range equalization across users
- •
Log pre-processing for multiplicative aggregates; clamp
- •
Choose Lipschitz at operating point
- •
Dither for bias reduction
Common Mistake: AirComp of Non-Linear Aggregates Is Biased
Mistake:
Treat the post-processed estimate as unbiased when is non-linear.
Correction:
For non-linear , (Jensen's second-order gap). The second term is the bias. For quadratic (e.g., variance estimation) the bias is exactly and does not vanish as grows. Only linear post-processing ( affine) produces strictly unbiased AirComp. Design around this: (i) dither with known statistics and debias analytically; (ii) use linear when possible; (iii) accept a small bias in exchange for noise-efficient non-linear AirComp.
Key Takeaway
AirComp natively computes any nomographic function in one channel use. This spans means, weighted sums, geometric means, second moments, and smooth maxima. Kolmogorov-Arnold guarantees every continuous aggregate has a nomographic representation (possibly multi-channel-use). Noise is amplified by at the operating point; non-linear introduces an bias. The function class is wide; the engineering is in matching pre-processing to aggregate.
Historical Note: From Kolmogorov's 13th Problem to the Wireless MAC
Kolmogorov's 1957 theorem on function superposition — originally a resolution of Hilbert's 13th problem on solving polynomial equations via compositions of continuous functions — lay dormant in pure math for four decades. Nazer and Gastpar 2007 noticed that the wireless MAC's natural summation makes it a physical computer for the nomographic form, coining "computation over multiple-access channels." Goldenbaum 2013 made the connection explicit for engineering practice: every continuous FL aggregate decomposes into at most nomographic terms, each computable in one AirComp channel use. The bridge between classical mathematics and wireless systems turned a 55-year-old existence theorem into a practical aggregation framework.
Quick Check
Which of the following aggregates can AirComp compute in a single channel use?
The median of .
The empirical variance .
The geometric mean , .
Sorting into increasing order.
, . One MAC use suffices.