Centralized vs. Distributed Processing

The Central Question of Cell-Free Processing

Chapters 11 and 12 established that cell-free massive MIMO eliminates cell boundaries and that user-centric clustering makes the architecture scalable. But we left a fundamental question unanswered: how should the distributed APs process the received signals? At one extreme, every AP forwards its raw baseband samples to the CPU, which performs centralized MMSE combining: optimal, but demanding enormous fronthaul capacity. At the other extreme, each AP applies local combining and forwards only a scalar estimate per user: minimal fronthaul, but suboptimal performance. This chapter develops the full spectrum between these extremes and identifies when each operating point is appropriate.

Definition: Centralized Processing

Consider a cell-free network with $M$ APs, each equipped with $N$ antennas, serving $K$ single-antenna users. Let $\mathbf{y}_m \in \mathbb{C}^N$ denote the received signal at AP $m$. In centralized processing, the CPU collects $\{\mathbf{y}_1, \ldots, \mathbf{y}_M\}$ and forms the network-wide received vector

$$\mathbf{y} = \begin{bmatrix} \mathbf{y}_1 \\ \vdots \\ \mathbf{y}_M \end{bmatrix} \in \mathbb{C}^{MN}$$

The CPU then applies a centralized combining vector $\mathbf{v}_k \in \mathbb{C}^{MN}$ to detect user $k$:

$$\hat{s}_k = \mathbf{v}_{k}^{H} \mathbf{y}$$

The centralized MMSE combining vector is

$$\mathbf{v}_{k}^{\text{c-MMSE}} = \left( \sum_{j=1}^{K} p_j \hat{\mathbf{g}}_j \hat{\mathbf{g}}_j^H + \mathbf{C}_{\mathbf{y}} + \sigma^2 \mathbf{I}_{MN} \right)^{-1} \hat{\mathbf{g}}_k$$

where $\hat{\mathbf{g}}_k = [\hat{\mathbf{g}}_{1k}^T, \ldots, \hat{\mathbf{g}}_{Mk}^T]^T$ is the stacked channel estimate and $\mathbf{C}_{\mathbf{y}} = \sum_{j=1}^{K} p_j \mathbf{C}_j$ is the total estimation-error covariance, with $\mathbf{C}_j$ the per-user error covariance defined in the theorem below.

Centralized MMSE is optimal in the sense that it maximizes the per-user SINR under the UatF framework. The price is that every AP must forward $N$ complex samples per channel use to the CPU.
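As a concrete sketch of this combiner, the snippet below forms $\mathbf{v}_k^{\text{c-MMSE}}$ with a single linear solve over the $MN$-dimensional space. The dimensions, i.i.d. channel estimates, and identity-scaled error covariance are illustrative stand-ins, not the chapter's channel model:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 10, 4, 5        # toy sizes: APs, antennas per AP, users
MN = M * N
p = np.ones(K)            # uplink transmit powers
sigma2 = 1.0              # noise power

# Stand-in channel estimates: column k stacks the per-AP estimates g_hat_{mk}.
g_hat = (rng.standard_normal((MN, K)) + 1j * rng.standard_normal((MN, K))) / np.sqrt(2)
C_y = 0.1 * np.eye(MN)    # stand-in for the total estimation-error covariance

def c_mmse_combiner(k):
    """Centralized MMSE combining vector for user k (one weight per network antenna)."""
    A = C_y + sigma2 * np.eye(MN, dtype=complex)
    for j in range(K):
        A += p[j] * np.outer(g_hat[:, j], g_hat[:, j].conj())
    return np.linalg.solve(A, g_hat[:, k])

v0 = c_mmse_combiner(0)
print(v0.shape)           # (40,): the combiner spans all M*N = 40 network antennas
```

The solve replaces an explicit $MN \times MN$ matrix inversion, which is the dominant cost that makes this level of cooperation scale poorly with network size.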

Definition: Distributed Processing

In distributed processing, each AP $m$ applies a local combining vector $\mathbf{a}_{mk} \in \mathbb{C}^N$ to its own received signal:

$$\hat{s}_{mk} = \mathbf{a}_{mk}^H \mathbf{y}_m$$

AP $m$ then forwards the scalar $\hat{s}_{mk}$ (one complex number per user) to the CPU. The CPU forms the final estimate by linearly combining the local estimates:

$$\hat{s}_k = \sum_{m \in \mathcal{M}_k} \alpha_{mk} \, \hat{s}_{mk}$$

where $\alpha_{mk}$ are weighting coefficients and $\mathcal{M}_k$ is the set of APs serving user $k$.

Distributed processing reduces the fronthaul load from $MN$ complex samples to $\lvert\mathcal{M}_k\rvert$ complex scalars per user per channel use. The question is how much SINR we lose.
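The two-stage structure (local combining at each AP, scalar forwarding, linear recombination at the CPU) can be sketched as follows. The i.i.d. channels, normalized maximum-ratio local combiner, and equal CPU weights are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, K = 10, 4, 5        # toy sizes: APs, antennas per AP, users
sigma2 = 0.1
k = 0                      # user of interest

# Illustrative i.i.d. channels: g[m] is the N x K channel matrix at AP m.
g = (rng.standard_normal((M, N, K)) + 1j * rng.standard_normal((M, N, K))) / np.sqrt(2)
s = np.exp(2j * np.pi * rng.random(K))          # unit-modulus user symbols
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
y = np.einsum('mnk,k->mn', g, s) + noise        # received signal at each AP

# Local combining: normalized maximum ratio as a simple stand-in for a_mk,
# so each local estimate is unbiased for s_k. Only ONE complex scalar per
# served user crosses the fronthaul, instead of N raw samples.
a = g[:, :, k] / (np.abs(g[:, :, k]) ** 2).sum(axis=1, keepdims=True)
s_local = np.einsum('mn,mn->m', a.conj(), y)

# CPU stage: equal weights alpha_mk = 1/M (LSFD would optimize these instead).
alpha = np.ones(M) / M
s_hat = alpha @ s_local
print(abs(s_hat - s[k]))                        # residual estimation error
```

Note that only `s_local` (one scalar per AP) reaches the CPU; the channel realizations `g` stay local to each AP.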

Centralized Processing

A cell-free processing architecture where all APs forward raw received signals (or sufficient statistics) to a central processing unit, which applies network-wide combining. Achieves the best SINR but requires the highest fronthaul capacity.

Related: Distributed Processing, Fronthaul

Distributed Processing

A cell-free processing architecture where each AP applies local combining and forwards only scalar estimates to the CPU. Minimizes fronthaul load at the cost of suboptimal interference suppression.

Related: Centralized Processing, Local Combining

Theorem: Centralized MMSE SINR (UatF Bound)

Under centralized MMSE combining with the UatF bound, the uplink SINR of user $k$ is

$$\text{SINR}_k^{(4)} = p_k \hat{\mathbf{g}}_k^H \left( \sum_{j \neq k} p_j \hat{\mathbf{g}}_j \hat{\mathbf{g}}_j^H + \sum_{j=1}^{K} p_j \mathbf{C}_j + \sigma^2 \mathbf{I}_{MN} \right)^{-1} \hat{\mathbf{g}}_k$$

where $\mathbf{C}_j = \mathrm{diag}(\mathbf{C}_{1j}, \ldots, \mathbf{C}_{Mj})$ is the block-diagonal estimation error covariance, $\mathbf{C}_{mj} = \beta_{mj} \mathbf{R}_{mj} - \gamma_{mj} \hat{\mathbf{R}}_{mj}$, and $\hat{\mathbf{R}}_{mj}$ depends on the pilot scheme.

The centralized MMSE receiver sees the entire $MN$-dimensional signal space and can optimally balance desired signal amplification against interference suppression across all APs simultaneously. This is the same MMSE receiver as in co-located massive MIMO, but now the antenna elements are distributed across the coverage area.
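The theorem can be checked numerically: the closed form equals the SINR achieved by the centralized MMSE combiner. The sketch below uses toy dimensions and identity-scaled error covariances (illustrative assumptions, not the chapter's channel model):

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, K = 10, 4, 5
MN = M * N
p = np.ones(K)
sigma2 = 1.0
k = 0

# Illustrative stand-ins: i.i.d. channel estimates; each p_j*C_j term is 0.1*I.
g_hat = (rng.standard_normal((MN, K)) + 1j * rng.standard_normal((MN, K))) / np.sqrt(2)
sum_pC = K * 0.1 * np.eye(MN)     # sum_j p_j C_j with p_j = 1, C_j = 0.1*I

# B is the matrix inside the inverse: interference + estimation error + noise.
B = sum_pC + sigma2 * np.eye(MN, dtype=complex)
for j in range(K):
    if j != k:
        B += p[j] * np.outer(g_hat[:, j], g_hat[:, j].conj())

# Closed-form SINR of the theorem.
sinr_closed = (p[k] * g_hat[:, k].conj() @ np.linalg.solve(B, g_hat[:, k])).real

# The MMSE combiner v = (B + p_k g_k g_k^H)^{-1} g_k attains exactly this SINR
# (by the matrix inversion lemma it is proportional to B^{-1} g_k).
A = B + p[k] * np.outer(g_hat[:, k], g_hat[:, k].conj())
v = np.linalg.solve(A, g_hat[:, k])
sinr_v = (p[k] * abs(v.conj() @ g_hat[:, k]) ** 2 / (v.conj() @ B @ v)).real
print(sinr_closed, sinr_v)        # the two values agree
```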

Theorem: Distributed Processing SINR with Weighted Combining

Under distributed processing with local combining vectors $\{\mathbf{a}_{mk}\}$ and CPU weights $\{\alpha_{mk}\}$, the UatF SINR of user $k$ is

$$\text{SINR}_k^{\text{dist}} = \frac{p_k \left| \sum_{m \in \mathcal{M}_k} \alpha_{mk} \, \mathbb{E}[\mathbf{a}_{mk}^H \mathbf{g}_{mk}] \right|^2}{\sum_{j=1}^{K} p_j \sum_{m \in \mathcal{M}_k} |\alpha_{mk}|^2 \, \mathrm{Var}(\mathbf{a}_{mk}^H \mathbf{g}_{mj}) + \sigma^2 \sum_{m \in \mathcal{M}_k} |\alpha_{mk}|^2 \, \mathbb{E}[\|\mathbf{a}_{mk}\|^2]}$$

where the expectation is over small-scale fading. The denominator separates into beamforming gain uncertainty, inter-user interference, and noise amplification.

Each AP provides a noisy estimate of user $k$'s symbol. The quality of these estimates varies across APs due to different path losses and interference environments. The CPU weights $\alpha_{mk}$ should emphasize APs with strong, reliable estimates and de-emphasize APs with weak or interference-dominated signals.
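The moments in this bound can be estimated by Monte Carlo. The sketch below assumes a deliberately symmetric toy model (i.i.d. $\mathcal{CN}(0,1)$ channels, local maximum-ratio combining $\mathbf{a}_{mk} = \mathbf{g}_{mk}$, equal weights $\alpha_{mk} = 1/M$, all APs serving user $k$); these are my assumptions, chosen so the bound reduces to the closed form $pMN/(pK + \sigma^2)$:

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, K = 20, 4, 5
p, sigma2 = 1.0, 1.0
T = 100_000            # Monte Carlo channel realizations

# Symmetric toy model: i.i.d. CN(0,1) channels, local MR combining
# a_mk = g_mk, equal CPU weights alpha_mk = 1/M, all M APs serve user k = 0.
g = (rng.standard_normal((T, N, K)) + 1j * rng.standard_normal((T, N, K))) / np.sqrt(2)
a = g[:, :, 0]                                   # MR combiner for user 0

inner = np.einsum('tn,tnk->tk', a.conj(), g)     # a^H g_j per realization
mean_des = inner[:, 0].mean()                    # E[a^H g_k], approx N
var_j = inner.var(axis=0)                        # Var(a^H g_j) for each user j
e_norm = (np.abs(a) ** 2).sum(axis=1).mean()     # E[||a||^2], approx N

# Plug the estimated moments into the theorem (identical across APs here).
alpha = 1.0 / M
num = p * abs(M * alpha * mean_des) ** 2
den = p * M * alpha**2 * var_j.sum() + sigma2 * M * alpha**2 * e_norm
sinr_mc = num / den

# This symmetric model admits the closed form p*M*N / (p*K + sigma2).
print(sinr_mc, p * M * N / (p * K + sigma2))
```

The $MN$ factor in the numerator is the coherent array gain that survives even without instantaneous CSI at the CPU, which is why distributed MR remains viable at high AP density.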

Centralized vs. Distributed Processing

| Aspect | Centralized (Level 4) | Distributed (Levels 1–3) |
| --- | --- | --- |
| Fronthaul per AP | $N$ complex samples per channel use | $\lvert\mathcal{D}_m\rvert$ complex scalars per channel use |
| CPU computation | $O((MN)^2 K)$ for MMSE | $O(MK)$ for weighted sum |
| Interference suppression | Network-wide: suppresses all inter-user interference jointly | Local: suppresses only intra-AP interference |
| CSI requirement at CPU | $\hat{\mathbf{g}}_{mk}$ for all $m, k$ | Only large-scale statistics (for LSFD weights) |
| Scalability | Poor: matrix inversion scales with $MN$ | Good: per-AP computation bounded |
| Performance | Optimal (MMSE bound) | Depends on cooperation level (L1 < L2 < L3 < L4) |

Example: Fronthaul Load: Centralized vs. Distributed

Consider a cell-free network with $M = 100$ APs, each with $N = 4$ antennas, serving $K = 20$ users. Each AP serves a user-centric cluster of size $\lvert\mathcal{D}_m\rvert = 10$ users on average. Compare the fronthaul load per coherence block of $\tau_c = 200$ samples for centralized and distributed processing. Assume 32-bit floating point for real and imaginary parts (64 bits per complex sample).
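A direct way to work the example is to count bits (a sketch; since the pilot length $\tau_p$ is not specified, all $\tau_c$ samples are counted for both schemes):

```python
# Fronthaul bits per coherence block for the example's numbers.
M, N = 100, 4          # APs and antennas per AP
tau_c = 200            # samples per coherence block
D_m = 10               # average user-centric cluster size per AP
bits = 64              # bits per complex sample (2 x 32-bit floats)

# Centralized: each AP forwards N raw complex samples per channel use.
per_ap_central = N * tau_c * bits          # bits per AP per block
total_central = M * per_ap_central

# Distributed: each AP forwards one complex scalar per served user per
# channel use (pilot samples need not be forwarded; ignored here since
# tau_p is not given in the example).
per_ap_dist = D_m * tau_c * bits           # bits per AP per block
total_dist = M * per_ap_dist

print(per_ap_central, total_central)       # 51200 5120000
print(per_ap_dist, total_dist)             # 128000 12800000
```

With these parameters the centralized count per block is actually the smaller one, because the average cluster size $\lvert\mathcal{D}_m\rvert = 10$ exceeds $N = 4$: distributed fronthaul scales with how many users an AP serves, not with its antenna count, and the comparison tilts toward distributed once pilot samples are excluded or clusters are small.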

Centralized vs. Local MMSE: Per-User SINR CDF

Compare the cumulative distribution of per-user SINR under centralized MMSE (Level 4) and local MMSE (Level 2) combining. Adjust the number of APs and antennas per AP to observe how the performance gap changes with network density.

Parameters: 100 access points, 4 antennas per access point, 20 users, user-centric cluster size 20.

Common Mistake: Centralized Processing Is Not Always Worth the Cost

Mistake:

Assuming that centralized MMSE (Level 4) is always the right choice because it maximizes SINR.

Correction:

The SINR gain from centralized processing shrinks as the number of antennas per AP increases. With $N \geq 4$ antennas, local MMSE at each AP already suppresses most intra-cluster interference. The remaining gap to centralized MMSE may not justify the $N\times$ increase in fronthaul load. The system designer must evaluate the performance-fronthaul tradeoff for the specific deployment scenario.

Historical Note: From Cloud-RAN to Cell-Free: The Distributed Processing Journey

2010–2020

The idea of centralizing baseband processing appeared in the Cloud-RAN (C-RAN) architecture proposed by China Mobile Research Institute around 2010. In C-RAN, remote radio heads (RRHs) forward digitized baseband signals to a centralized baseband unit (BBU) pool via high-capacity fronthaul links (typically CPRI over fiber). The cell-free massive MIMO paradigm, introduced by Ngo et al. in 2017, can be viewed as a Cloud-RAN where the RRHs are single-antenna (or few-antenna) APs deployed at very high density. The evolution from Cloud-RAN to cell-free to user-centric cell-free mirrors the engineering community's gradual recognition that full centralization does not scale: the question has always been how much to centralize. The four cooperation levels formalized by Björnson and Sanguinetti in 2020 provide the definitive answer to this question.

Quick Check

In a cell-free network with $M = 50$ APs, $N = 4$ antennas per AP, and $K = 10$ users, what is the per-AP fronthaul dimension for centralized processing (complex samples per channel use)?

$10$ (one per user)

$4$ (one per antenna)

$40$ ($N \times K$)

$200$ (full $MN$ dimension)

Why This Matters: O-RAN Functional Splits and Cooperation Levels

The cooperation levels map directly to O-RAN functional split options. Level 4 (centralized MMSE) corresponds to Split 7.2x, where the O-RU forwards frequency-domain IQ samples, the analog of our $\mathbf{y}_m$. Level 2 (local MMSE) corresponds to a higher split (e.g., Split 6), where the O-RU performs local equalization and forwards soft bits or symbol estimates. The O-RAN Alliance's ongoing work on cell-free RAN explicitly addresses these tradeoffs. In practice, the fronthaul technology (fiber, millimeter-wave wireless, or Ethernet) determines which split is feasible, and therefore which cooperation level is achievable.

Key Takeaway

Centralized vs. distributed processing is not a binary choice. The four cooperation levels (L1–L4) provide a continuum from fully local to fully centralized processing. The optimal operating point depends on the fronthaul capacity, the number of antennas per AP, and the interference environment. As a rule of thumb: invest in centralized processing when APs have few antennas ($N = 1$–$2$) and the network is interference-limited; use distributed processing when APs are well-equipped ($N \geq 4$) and fronthaul is the bottleneck.