Centralized vs. Distributed Processing
The Central Question of Cell-Free Processing
Chapters 11 and 12 established that cell-free massive MIMO eliminates cell boundaries and that user-centric clustering makes the architecture scalable. But we left a fundamental question unanswered: how should the distributed APs process the received signals? At one extreme, every AP forwards its raw baseband samples to the CPU, which performs centralized MMSE combining: optimal, but demanding enormous fronthaul capacity. At the other extreme, each AP applies local combining and forwards only a scalar estimate per user: minimal fronthaul, but suboptimal performance. This chapter develops the full spectrum between these extremes and identifies when each operating point is appropriate.
Definition: Centralized Processing
Consider a cell-free network with $L$ APs, each equipped with $N$ antennas, serving $K$ single-antenna users. Let $\mathbf{y}_l \in \mathbb{C}^{N}$ denote the received signal at AP $l$. In centralized processing, the CPU collects $\mathbf{y}_1, \ldots, \mathbf{y}_L$ and forms the network-wide received vector

$$\mathbf{y} = \begin{bmatrix} \mathbf{y}_1 \\ \vdots \\ \mathbf{y}_L \end{bmatrix} \in \mathbb{C}^{LN}.$$

The CPU then applies a centralized combining vector $\mathbf{v}_k \in \mathbb{C}^{LN}$ to detect user $k$:

$$\hat{s}_k = \mathbf{v}_k^{\mathsf{H}} \mathbf{y}.$$

The centralized MMSE combining vector is

$$\mathbf{v}_k = p_k \left( \sum_{i=1}^{K} p_i \left( \hat{\mathbf{h}}_i \hat{\mathbf{h}}_i^{\mathsf{H}} + \mathbf{C}_i \right) + \sigma^2 \mathbf{I}_{LN} \right)^{-1} \hat{\mathbf{h}}_k,$$

where $\hat{\mathbf{h}}_k = [\hat{\mathbf{h}}_{k1}^{\mathsf{T}} \; \cdots \; \hat{\mathbf{h}}_{kL}^{\mathsf{T}}]^{\mathsf{T}} \in \mathbb{C}^{LN}$ is the stacked channel estimate, and $\mathbf{C}_i$ accounts for estimation error covariance.
Centralized MMSE is optimal in the sense that it maximizes the per-user SINR under the UatF framework. The price is that every AP must forward $N$ complex samples per channel use to the CPU.
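The definition above can be sketched numerically. This is a minimal NumPy illustration, not the chapter's reference implementation: the dimensions, powers, and the scaled-identity error covariance are assumed for concreteness, and the variable names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, K = 4, 2, 3          # APs, antennas per AP, users (illustrative values)
p = np.ones(K)             # uplink transmit powers
sigma2 = 0.1               # noise variance

# Stacked channel estimates: column k is h_hat_k in C^{LN}
H_hat = (rng.standard_normal((L * N, K)) + 1j * rng.standard_normal((L * N, K))) / np.sqrt(2)
# Block-diagonal estimation-error covariance (here a scaled identity for simplicity)
C = [0.05 * np.eye(L * N) for _ in range(K)]

def centralized_mmse_combiner(k):
    """v_k = p_k (sum_i p_i (h_i h_i^H + C_i) + sigma2 I)^{-1} h_k."""
    Psi = sigma2 * np.eye(L * N, dtype=complex)
    for i in range(K):
        h = H_hat[:, i:i + 1]
        Psi += p[i] * (h @ h.conj().T + C[i])
    return p[k] * np.linalg.solve(Psi, H_hat[:, k])

v0 = centralized_mmse_combiner(0)   # combining vector for user 0, dimension LN = 8
```

The `np.linalg.solve` call avoids forming the explicit inverse, which is how such combiners are computed in practice.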
Definition: Distributed Processing
In distributed processing, each AP applies a local combining vector $\mathbf{v}_{kl} \in \mathbb{C}^{N}$ to its own received signal:

$$\check{s}_{kl} = \mathbf{v}_{kl}^{\mathsf{H}} \mathbf{y}_l.$$

AP $l$ then forwards the scalar $\check{s}_{kl}$ (one complex number per user) to the CPU. The CPU forms the final estimate by linearly combining the local estimates:

$$\hat{s}_k = \sum_{l \in \mathcal{M}_k} a_{kl}^{*} \, \check{s}_{kl},$$

where $a_{kl}$ are weighting coefficients and $\mathcal{M}_k$ is the set of APs serving user $k$.
Distributed processing reduces the fronthaul load from $N$ complex samples per channel use to one complex scalar per served user per channel use. The question is how much SINR we lose.
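A minimal sketch of the distributed pipeline for one channel use, assuming perfect local CSI, maximum-ratio local combining, and uniform CPU weights (LSFD would optimize the weights instead); all parameter values and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N, K = 4, 2, 3          # APs, antennas per AP, users (illustrative values)
sigma2 = 0.1

# Per-AP channels and received signals for one channel use
H = [(rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
     for _ in range(L)]
s = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
y = [H[l] @ s + np.sqrt(sigma2 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
     for l in range(L)]

k = 0
M_k = [0, 1, 2]            # user-centric cluster: APs serving user k (assumed)

# Local maximum-ratio combining at each serving AP: v_kl = h_kl
s_check = {l: H[l][:, k].conj() @ y[l] for l in M_k}   # one complex scalar per AP

# CPU fuses the scalars with weights a_kl (uniform here for simplicity)
a = {l: 1.0 / len(M_k) for l in M_k}
s_hat = sum(np.conj(a[l]) * s_check[l] for l in M_k)   # final estimate of s[k]
```

Note that only the `len(M_k)` scalars in `s_check` cross the fronthaul, instead of the `L * N` raw samples a centralized CPU would need.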
Centralized Processing
A cell-free processing architecture where all APs forward raw received signals (or sufficient statistics) to a central processing unit, which applies network-wide combining. Achieves the best SINR but requires the highest fronthaul capacity.
Related: Distributed Processing, Fronthaul
Distributed Processing
A cell-free processing architecture where each AP applies local combining and forwards only scalar estimates to the CPU. Minimizes fronthaul load at the cost of suboptimal interference suppression.
Related: Centralized Processing, Local Combining
Theorem: Centralized MMSE SINR (UatF Bound)
Under centralized MMSE combining with the UatF bound, the uplink SINR of user $k$ is

$$\mathrm{SINR}_k = p_k \, \hat{\mathbf{h}}_k^{\mathsf{H}} \mathbf{Z}_k^{-1} \hat{\mathbf{h}}_k, \qquad \mathbf{Z}_k = \sum_{i \neq k} p_i \hat{\mathbf{h}}_i \hat{\mathbf{h}}_i^{\mathsf{H}} + \sum_{i=1}^{K} p_i \mathbf{C}_i + \sigma^2 \mathbf{I}_{LN},$$

where $\mathbf{C}_i = \mathbb{E}\{\tilde{\mathbf{h}}_i \tilde{\mathbf{h}}_i^{\mathsf{H}}\} = \mathrm{diag}(\mathbf{C}_{i1}, \ldots, \mathbf{C}_{iL})$ is the block-diagonal estimation error covariance, $\tilde{\mathbf{h}}_i = \mathbf{h}_i - \hat{\mathbf{h}}_i$, and the estimate quality depends on the pilot scheme.
The centralized MMSE receiver sees the entire $LN$-dimensional signal space and can optimally balance desired signal amplification against interference suppression across all APs simultaneously. This is the same MMSE receiver as in co-located massive MIMO, but now the antenna elements are distributed across the coverage area.
Network-wide signal model
Stack the received signals: $\mathbf{y} = \sum_{i=1}^{K} \mathbf{h}_i s_i + \mathbf{n}$, where $\mathbf{h}_i = [\mathbf{h}_{i1}^{\mathsf{T}} \; \cdots \; \mathbf{h}_{iL}^{\mathsf{T}}]^{\mathsf{T}} \in \mathbb{C}^{LN}$ and $\mathbf{n} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma^2 \mathbf{I}_{LN})$.
Decompose via UatF
Write $\mathbf{h}_i = \hat{\mathbf{h}}_i + \tilde{\mathbf{h}}_i$, where $\hat{\mathbf{h}}_i$ is the MMSE estimate and $\tilde{\mathbf{h}}_i$ is the estimation error, uncorrelated with $\hat{\mathbf{h}}_i$. The UatF bound treats $\hat{\mathbf{h}}_i$ as the true channel and $\tilde{\mathbf{h}}_i s_i$ as additional noise.
Apply MMSE combining
The MMSE combining vector maximizes the SINR under the effective noise model:

$$\mathbf{v}_k = p_k \left( \sum_{i=1}^{K} p_i \left( \hat{\mathbf{h}}_i \hat{\mathbf{h}}_i^{\mathsf{H}} + \mathbf{C}_i \right) + \sigma^2 \mathbf{I}_{LN} \right)^{-1} \hat{\mathbf{h}}_k.$$
Compute the SINR
Substituting the MMSE combiner into the SINR expression and applying the matrix inversion lemma yields the stated result. The key identity is $\mathrm{SINR}_k = p_k \, \hat{\mathbf{h}}_k^{\mathsf{H}} \mathbf{Z}_k^{-1} \hat{\mathbf{h}}_k$, where $\mathbf{Z}_k$ excludes user $k$'s desired signal term.
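The closing identity can be checked numerically: the SINR achieved by the combiner built from the full covariance equals the quadratic form $p_k \hat{\mathbf{h}}_k^{\mathsf{H}} \mathbf{Z}_k^{-1} \hat{\mathbf{h}}_k$. A sketch with assumed dimensions, powers, and error covariances:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 8, 3                       # n = LN, signal-space dimension (assumed)
p = np.array([1.0, 0.8, 1.2])     # transmit powers (illustrative)
sigma2 = 0.1

H_hat = (rng.standard_normal((n, K)) + 1j * rng.standard_normal((n, K))) / np.sqrt(2)
C = [0.05 * np.eye(n) for _ in range(K)]
h_k = H_hat[:, 0]

# Z_k: interference-plus-noise covariance, excluding user k's desired-signal term
Z = sigma2 * np.eye(n, dtype=complex)
for i in range(K):
    Z += p[i] * C[i]
    if i != 0:
        Z += p[i] * np.outer(H_hat[:, i], H_hat[:, i].conj())

# MMSE combiner built from the FULL covariance (Z_k plus user k's own term)
v = np.linalg.solve(Z + p[0] * np.outer(h_k, h_k.conj()), h_k)

sinr_via_combiner = p[0] * abs(v.conj() @ h_k) ** 2 / np.real(v.conj() @ Z @ v)
sinr_closed_form = p[0] * np.real(h_k.conj() @ np.linalg.solve(Z, h_k))
print(np.isclose(sinr_via_combiner, sinr_closed_form))  # True
```

The agreement is exactly the matrix inversion (Sherman-Morrison) step in the proof: adding user $k$'s rank-one term to the covariance only rescales the combiner, and SINR is scale-invariant.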
Theorem: Distributed Processing SINR with Weighted Combining
Under distributed processing with local combining vectors $\mathbf{v}_{kl}$ and CPU weights $a_{kl}$, the UatF SINR of user $k$ is

$$\mathrm{SINR}_k = \frac{p_k \left| \sum_{l \in \mathcal{M}_k} a_{kl}^{*} \, \mathbb{E}\{\mathbf{v}_{kl}^{\mathsf{H}} \mathbf{h}_{kl}\} \right|^2}{\displaystyle \sum_{i=1}^{K} p_i \, \mathbb{E}\left\{ \left| \sum_{l \in \mathcal{M}_k} a_{kl}^{*} \mathbf{v}_{kl}^{\mathsf{H}} \mathbf{h}_{il} \right|^2 \right\} - p_k \left| \sum_{l \in \mathcal{M}_k} a_{kl}^{*} \, \mathbb{E}\{\mathbf{v}_{kl}^{\mathsf{H}} \mathbf{h}_{kl}\} \right|^2 + \sigma^2 \sum_{l \in \mathcal{M}_k} |a_{kl}|^2 \, \mathbb{E}\{\|\mathbf{v}_{kl}\|^2\}},$$

where the expectation is over small-scale fading. The denominator separates into beamforming gain uncertainty, inter-user interference, and noise amplification.
Each AP provides a noisy estimate of user $k$'s symbol. The quality of these estimates varies across APs due to different path losses and interference environments. The CPU weights should emphasize APs with strong, reliable estimates and de-emphasize APs with weak or interference-dominated signals.
Local estimate statistics
The local estimate at AP $l$ is $\check{s}_{kl} = \mathbf{v}_{kl}^{\mathsf{H}} \mathbf{y}_l = \sum_{i=1}^{K} \mathbf{v}_{kl}^{\mathsf{H}} \mathbf{h}_{il} \, s_i + \mathbf{v}_{kl}^{\mathsf{H}} \mathbf{n}_l$. Its desired-signal component has mean $\mathbb{E}\{\mathbf{v}_{kl}^{\mathsf{H}} \mathbf{h}_{kl}\} \, s_k$, where we used the UatF approach of treating the average effective channel $\mathbb{E}\{\mathbf{v}_{kl}^{\mathsf{H}} \mathbf{h}_{kl}\}$ as deterministic.
SINR derivation
The CPU forms $\hat{s}_k = \sum_{l \in \mathcal{M}_k} a_{kl}^{*} \check{s}_{kl}$. The desired signal power is $p_k \bigl| \sum_{l \in \mathcal{M}_k} a_{kl}^{*} \, \mathbb{E}\{\mathbf{v}_{kl}^{\mathsf{H}} \mathbf{h}_{kl}\} \bigr|^2$. The interference-plus-noise power is the variance of the remaining terms, which decomposes AP-by-AP because the noise across APs is independent.
Conclude
Collecting terms yields the stated SINR. The block-diagonal structure of the interference covariance (across APs) is what makes distributed processing tractable: the interference terms factorize.
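The resulting SINR is a generalized Rayleigh quotient in the weight vector, so the optimal (LSFD-style) weights follow from one small linear solve per user. A sketch with synthetic second-order statistics; the values below are placeholders, not derived from a channel model.

```python
import numpy as np

rng = np.random.default_rng(3)
Lk = 3                            # |M_k|: number of APs serving user k (assumed)

# Synthetic second-order statistics; in practice these follow from the
# large-scale fading and the choice of local combiners:
b = np.array([1.0, 0.6, 0.3])     # b_l = E{v_kl^H h_kl}, average effective channels
A = rng.standard_normal((Lk, Lk))
Lam = A @ A.T + np.eye(Lk)        # SPD interference-plus-noise matrix of the local estimates

# SINR(a) = |a^T b|^2 / (a^T Lam a) is maximized by a = Lam^{-1} b,
# with optimal value b^T Lam^{-1} b
a_opt = np.linalg.solve(Lam, b)
sinr_opt = b @ np.linalg.solve(Lam, b)

# Any other weighting (e.g., uniform) can only do worse:
a_uni = np.ones(Lk)
sinr_uni = (a_uni @ b) ** 2 / (a_uni @ Lam @ a_uni)
print(sinr_uni <= sinr_opt)  # True
```

Only the `Lk`-dimensional statistics cross the fronthaul to the CPU, which is exactly why LSFD requires no instantaneous CSI there.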
Centralized vs. Distributed Processing
| Aspect | Centralized (Level 4) | Distributed (Levels 1–3) |
|---|---|---|
| Fronthaul per AP | $N$ complex samples per channel use | $\lvert\mathcal{D}_l\rvert$ complex scalars per channel use (one per served user) |
| CPU computation | $\mathcal{O}((LN)^3)$ for MMSE | $\mathcal{O}(\lvert\mathcal{M}_k\rvert)$ for weighted sum |
| Interference suppression | Network-wide: suppresses all inter-user interference jointly | Local: suppresses only intra-AP interference |
| CSI requirement at CPU | $\hat{\mathbf{h}}_{kl}$ for all $k, l$ | Only large-scale statistics (for LSFD weights) |
| Scalability | Poor: matrix inversion scales with $LN$ | Good: per-AP computation bounded |
| Performance | Optimal (MMSE bound) | Depends on cooperation level (L1 < L2 < L3 < L4) |
Example: Fronthaul Load: Centralized vs. Distributed
Consider a cell-free network with $L$ APs, each with $N$ antennas, serving $K$ users. Each AP serves a user-centric cluster of $\lvert\mathcal{D}_l\rvert$ users on average. Compare the fronthaul load per coherence block of $\tau_c$ samples for centralized and distributed processing. Assume 32-bit floating point for real and imaginary parts (64 bits per complex sample).
Centralized fronthaul
Each AP forwards its full received signal: $N$ complex samples per channel use. Per coherence block: $\tau_c N$ complex samples per AP. Total across all APs: $\tau_c N L$ complex samples, i.e., $64\,\tau_c N L$ bits per coherence block.
Distributed fronthaul
Each AP forwards one scalar estimate per served user: $\lvert\mathcal{D}_l\rvert$ scalars per channel use. Per coherence block: $\tau_c \lvert\mathcal{D}_l\rvert$ complex samples per AP. Total: $\tau_c \sum_{l} \lvert\mathcal{D}_l\rvert$ complex samples.
Wait, can this exceed the centralized load? It can: the naive count charges one scalar per served user for all $\tau_c$ channel uses, so whenever $\lvert\mathcal{D}_l\rvert > N$ the scalar stream is wider than the raw antenna stream. In practice, the distributed approach forwards only the data portion ($\tau_c - \tau_p$ samples), and the scalar estimates can be quantized with fewer bits. With $\tau_p$ pilot samples and 16-bit quantization of each real dimension (32 bits per complex scalar): Distributed: $32\,(\tau_c - \tau_p)\,\lvert\mathcal{D}_l\rvert$ bits per AP. Centralized: $64\,\tau_c N$ bits per AP.
The real comparison
The fronthaul advantage of distributed processing becomes decisive when $N$ is large (many antennas per AP) or when aggressive quantization is used. For $N$ antennas, the per-AP loads are $64\,\tau_c N$ bits (centralized) versus $32\,(\tau_c - \tau_p)\,\lvert\mathcal{D}_l\rvert$ bits (distributed, 16-bit). The ratio grows linearly with $N$: distributed processing saves a factor of roughly $2N / \lvert\mathcal{D}_l\rvert$ in fronthaul when $\tau_p \ll \tau_c$.
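This bookkeeping is easy to package as a helper. The function below encodes the assumptions stated above (64 bits per centralized complex sample, 32 bits per distributed complex scalar, data samples only for the distributed case); the function name, signature, and the sample parameter values are ours.

```python
def fronthaul_bits_per_block(N, D_l, tau_c, tau_p,
                             bits_centralized=64, bits_distributed=32):
    """Per-AP fronthaul load per coherence block, in bits.

    Centralized: all N antennas' samples for the whole block.
    Distributed: one quantized complex scalar per served user (D_l users),
    data samples only (tau_c - tau_p of tau_c).
    """
    centralized = bits_centralized * tau_c * N
    distributed = bits_distributed * (tau_c - tau_p) * D_l
    return centralized, distributed

# Illustrative instantiation (not the chapter's numbers):
c, d = fronthaul_bits_per_block(N=8, D_l=5, tau_c=200, tau_p=10)
print(c, d)  # 102400 30400
```

With these assumed values the distributed fronthaul is already about 3.4x lighter, and the gap widens linearly as `N` grows.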
Centralized vs. Local MMSE: Per-User SINR CDF
Compare the cumulative distribution of per-user SINR under centralized MMSE (Level 4) and local MMSE (Level 2) combining. Adjust the number of APs and antennas per AP to observe how the performance gap changes with network density.
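The comparison behind this experiment can be reproduced in a few lines. The sketch below assumes i.i.d. Rayleigh fading, perfect CSI, and optimal instantaneous fusion of the local estimates (an upper bound on what statistics-based LSFD weights achieve), so the absolute numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)
L, N, K = 8, 4, 8            # illustrative network size
sigma2, trials = 0.1, 50

def centralized_sinr(H, k):
    """SINR of user k under network-wide MMSE combining on H (LN x K)."""
    Z = sigma2 * np.eye(H.shape[0], dtype=complex)
    for i in range(K):
        if i != k:
            Z += np.outer(H[:, i], H[:, i].conj())
    return np.real(H[:, k].conj() @ np.linalg.solve(Z, H[:, k]))

def local_mmse_then_fuse_sinr(H, k):
    """Local MMSE per AP, then optimal fusion of the L scalars at the CPU."""
    G = np.zeros((L, K), dtype=complex)   # effective scalar channels g_il = v_l^H h_il
    noise = np.zeros(L)
    for l in range(L):
        Hl = H[l * N:(l + 1) * N]
        Zl = sigma2 * np.eye(N, dtype=complex)
        for i in range(K):
            Zl += np.outer(Hl[:, i], Hl[:, i].conj())
        v = np.linalg.solve(Zl, Hl[:, k])  # local MMSE combiner (up to scale)
        G[l] = v.conj() @ Hl
        noise[l] = sigma2 * np.real(v.conj() @ v)
    Z = np.diag(noise).astype(complex)     # noise is independent across APs
    for i in range(K):
        if i != k:
            Z += np.outer(G[:, i], G[:, i].conj())
    return np.real(G[:, k].conj() @ np.linalg.solve(Z, G[:, k]))

sinr_central, sinr_local = [], []
for _ in range(trials):
    H = (rng.standard_normal((L * N, K)) + 1j * rng.standard_normal((L * N, K))) / np.sqrt(2)
    sinr_central.append(centralized_sinr(H, 0))
    sinr_local.append(local_mmse_then_fuse_sinr(H, 0))

print(np.median(sinr_central) >= np.median(sinr_local))  # True
```

Because the fused receiver is one particular linear receiver on the stacked signal, its SINR can never exceed the centralized MMSE SINR in any realization, which is exactly the ordering the CDF plot displays.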
Common Mistake: Centralized Processing Is Not Always Worth the Cost
Mistake:
Assuming that centralized MMSE (Level 4) is always the right choice because it maximizes SINR.
Correction:
The SINR gain from centralized processing shrinks as the number of antennas per AP increases. With many antennas per AP ($N \gg 1$), local MMSE at each AP already suppresses most intra-cluster interference. The remaining gap to centralized MMSE may not justify the large increase in fronthaul load. The system designer must evaluate the performance-fronthaul tradeoff for the specific deployment scenario.
Historical Note: From Cloud-RAN to Cell-Free: The Distributed Processing Journey
2010–2020: The idea of centralizing baseband processing appeared in the Cloud-RAN (C-RAN) architecture proposed by China Mobile Research Institute around 2010. In C-RAN, remote radio heads (RRHs) forward digitized baseband signals to a centralized baseband unit (BBU) pool via high-capacity fronthaul links (typically CPRI over fiber). The cell-free massive MIMO paradigm, introduced by Ngo et al. in 2017, can be viewed as a Cloud-RAN where the RRHs are single-antenna (or few-antenna) APs deployed at very high density. The evolution from Cloud-RAN to cell-free to user-centric cell-free mirrors the engineering community's gradual recognition that full centralization does not scale: the question has always been how much to centralize. The four cooperation levels formalized by Björnson and Sanguinetti in 2020 provide the definitive answer to this question.
Quick Check
In a cell-free network with $L$ APs, $N = 4$ antennas per AP, and $K$ users, what is the per-AP fronthaul dimension for centralized processing (complex samples per channel use)?
$K$ (one per user)
$LN$ (one per antenna in the network)
$1$ (a single combined scalar)
$N = 4$ (full dimension)
Correct. In centralized processing, AP $l$ forwards its entire received vector $\mathbf{y}_l \in \mathbb{C}^{4}$, so the fronthaul carries 4 complex samples per channel use.
Why This Matters: O-RAN Functional Splits and Cooperation Levels
The cooperation levels map directly to O-RAN functional split options. Level 4 (centralized MMSE) corresponds to Split 7.2x where the O-RU forwards frequency-domain IQ samples, the analog of our $\mathbf{y}_l$. Level 2 (local MMSE) corresponds to a higher split (e.g., Split 6) where the O-RU performs local equalization and forwards soft bits or symbol estimates. The O-RAN Alliance's ongoing work on cell-free RAN explicitly addresses these tradeoffs. In practice, the fronthaul technology (fiber, millimeter-wave wireless, or Ethernet) determines which split is feasible, and therefore which cooperation level is achievable.
Key Takeaway
Centralized vs. distributed processing is not a binary choice. The four cooperation levels (L1–L4) provide a continuum from fully local to fully centralized processing. The optimal operating point depends on the fronthaul capacity, the number of antennas per AP, and the interference environment. As a rule of thumb: invest in centralized processing when APs have few antennas ($N$ small) and the network is interference-limited; use distributed processing when APs are well-equipped ($N$ large) and fronthaul is the bottleneck.