Ferkans — Interactive Telecom Tutor

Pilots in a Distributed World

Each AP must estimate its own DD channel to each UE. With $L = 100$ APs and $K = 200$ UEs, there are $L \cdot K = 20{,}000$ channels to estimate per frame. Cellular-style pilot overhead would be catastrophic at this scale. Cell-free OTFS uses two ideas to make it tractable: embedded pilots (from Chapter 7, a CommIT contribution) that piggyback channel estimation on data transmission, and pilot reuse across spatially separated UEs that can share a pilot sequence without confusion. Together, they bring the per-UE pilot overhead to under $1\%$.


Definition: Embedded-Pilot Channel Estimation

In embedded-pilot OTFS estimation, each UE transmits a single pilot symbol at a designated DD cell $(\ell_p, m_p)$, surrounded by a guard region of zeros of size $G_\ell \times G_m$. The pilot, guard, and data symbols share the same OTFS frame.

Per-AP estimation: Each AP receives the pilot transmission and extracts the channel $\mathbf{H}^{(l, k)}$ by correlating over the guard region. For each path $i$ :

Delay $\ell_i^{(l,k)}$ : peak in the delay dimension of the received pilot.
Doppler $m_i^{(l,k)}$ : peak in the Doppler dimension.
Gain $a_i^{(l,k)}$ : complex amplitude at the peak.

Pilot overhead: For $G_\ell = \ell_{\max}$, $G_m = 2 m_{\max}$: overhead $= G_\ell G_m / (MN) \sim 1$ to $3\%$.

Superimposed variant (CommIT contribution, Chapter 7): pilots and data co-exist at the same DD cells via power split. Even lower overhead: $\leq 0.5\%$ .


Theorem: Pilot Reuse in Cell-Free OTFS

Two UEs $k_1, k_2$ can safely share a pilot sequence if and only if $\min_l \|\mathbf{r}_{k_1} - \mathbf{r}_{k_2}\| \;\geq\; \frac{c}{\text{SNR}^{1/4}},$ where $c$ is a constant depending on the path-loss exponent. This ensures that each AP sees distinct pilot signatures from the two UEs, allowing correct assignment.

Consequence: in an urban deployment with $L = 100$ APs over $1$ km² and a pilot SNR of 15 dB, the minimum UE spacing is $\sim 50$ m. The number of distinct pilot sequences needed is then $N_{\text{pilot}} \sim K \cdot (50 \text{ m})^2 / (1 \text{ km}^2) = K/400$. For $K = 200$: $N_{\text{pilot}} \sim 0.5$, i.e., below one, so in this idealized count almost every UE can reuse the same pilot. This is an extreme reduction in overhead.

In cellular, pilot contamination between neighboring cells is a significant issue. In cell-free, spatial separation of $\sim 50$ m is enough to give distinct pilot signatures across APs. As long as UEs are not co-located, they can share pilots without confusion. The cell-free architecture thus turns pilot reuse from a pathology into an enabler.
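The spacing rule can be turned into a pilot assignment procedure. The following is a minimal sketch, not the method of the source: it assumes a greedy first-fit assignment in which two UEs may share a pilot only if they are at least `d_min` apart. The function name `assign_pilots`, the uniform random UE drop, and the seed are all hypothetical choices for illustration.

```python
import numpy as np

def assign_pilots(positions, d_min):
    """Greedy first-fit: UEs closer than d_min never receive the same pilot."""
    n = len(positions)
    # Pairwise UE distances in metres.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    pilots = -np.ones(n, dtype=int)          # -1 marks "not yet assigned"
    for k in range(n):
        # Pilots already used by assigned UEs that are too close to UE k.
        near = (dist[k] < d_min) & (pilots >= 0)
        taken = set(pilots[near].tolist())
        p = 0
        while p in taken:                    # smallest pilot index still free
            p += 1
        pilots[k] = p
    return pilots

rng = np.random.default_rng(0)
ue_xy = rng.uniform(0.0, 1000.0, size=(200, 2))   # 200 UEs in a 1 km x 1 km area
pilots = assign_pilots(ue_xy, d_min=50.0)
print("distinct pilots used:", pilots.max() + 1)
```

With a random drop, some UEs land closer than 50 m to each other, so more than one pilot is generally needed; the $K/400$ figure corresponds to an idealized, evenly spaced packing.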

Proof

Per-AP pilot signature

The pilot signature at AP $l$ is $\sum_k c_k \mathbf{H}^{(l, k)}$, where $c_k$ is UE $k$'s pilot. With distinct channels, signatures are distinguishable for spatially separated UEs.

Minimum spacing

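Channel correlation decays as $d^{-\alpha}$ with UE separation $d$ (where $\alpha$ is the path-loss exponent). A spacing $d \geq d^* = c / \text{SNR}^{1/(2\alpha)}$ keeps the correlation below threshold; for $\alpha = 2$ this is the $\text{SNR}^{-1/4}$ scaling in the theorem statement.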

Density argument

UEs sharing one pilot can be packed at density $1/d^{*2}$. For typical parameters, a spacing of $\sim 50$ m gives a density of 400 UEs per km². Number of distinct pilots: $K / (\text{area} \cdot \text{density}) \sim K / 400$ for an area of $1$ km². $\blacksquare$
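The density argument is plain arithmetic, sketched here with the numbers quoted in the text (50 m spacing, 1 km² area, $K = 200$). The variable names are illustrative only.

```python
d_star_m = 50.0      # minimum reuse spacing from the text, in metres
area_km2 = 1.0       # deployment area
K = 200              # number of UEs

# UEs per km^2 that can share one pilot: 1 / d*^2, converted from m^-2 to km^-2.
reuse_density_per_km2 = 1.0e6 / d_star_m**2
n_pilot = K / (area_km2 * reuse_density_per_km2)

print(reuse_density_per_km2, n_pilot)   # 400.0 0.5
```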


Cell-Free OTFS Channel Estimation

Input:  Received DD frame y[ℓ, m] at AP l
        Assigned pilot sequence for each UE k in cluster L(l)
        Guard region (ℓ_p ± G_ℓ/2, m_p ± G_m/2)
Output: Per-UE DD channel estimates ĥ^(l,k)

1. EXTRACT GUARD REGION from y[ℓ, m]:
       r_g = { y[ℓ, m] : (ℓ, m) ∈ guard }
2. FOR each UE k in cluster:
   a. CORRELATE with UE k's pilot:
       corr[Δℓ, Δm] = Σ_{ℓ, m} r_g[ℓ, m] pilot_k^*[ℓ - Δℓ, m - Δm]
   b. PATH DETECTION:
       Compare |corr[Δℓ, Δm]| against threshold γ → detect P_{l,k} paths.
       Each peak at (Δℓ_i, Δm_i) gives the i-th path's delay/Doppler.
   c. GAIN ESTIMATION:
       â_i^(l,k) = corr[Δℓ_i, Δm_i] / ||pilot_k||²
   d. AGGREGATE:
       ĥ^(l,k)[ℓ, m] = Σ_i â_i^(l,k) δ[ℓ - Δℓ_i, m - Δm_i]
3. FORWARD to CPU: send {ĥ^(l,k)} to central processing unit.
4. CPU AGGREGATION:
       Combine per-AP estimates across cluster: {ĥ^(l,k)}_{l ∈ cluster} → joint channel vector.

Complexity per AP: O(G_ℓ G_m P_{l,k} K_cluster). For urban deployment: ~10⁶ ops per AP per frame. Feasible on embedded CPU.
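Steps 1 and 2 of the algorithm can be sketched for a single UE as follows. This is a simplified illustration under stated assumptions, not the source's implementation: the pilot is a single impulse of amplitude `pilot_amp`, delays and Dopplers fall on integer grid points, the DD-domain phase rotations are ignored, and the search window covers delays $0..\ell_{\max}$ and Dopplers $-m_{\max}..m_{\max}$. The function name, the synthetic paths, and the noise level are hypothetical.

```python
import numpy as np

def estimate_dd_channel(y, pilot_pos, pilot_amp, l_max, m_max, gamma):
    """Return [(delay, doppler, gain), ...] detected around one embedded pilot."""
    M, N = y.shape
    lp, mp = pilot_pos
    paths = []
    for dl in range(l_max + 1):                  # delays are non-negative
        for dm in range(-m_max, m_max + 1):      # Doppler shifts are two-sided
            cell = y[(lp + dl) % M, (mp + dm) % N]
            # Correlate with the pilot and normalise by its energy (steps 2a, 2c).
            gain = cell * np.conj(pilot_amp) / abs(pilot_amp) ** 2
            if abs(gain) > gamma:                # threshold test (step 2b)
                paths.append((dl, dm, gain))
    return paths

# Synthetic check with three integer-grid paths (hypothetical values).
rng = np.random.default_rng(1)
M, N = 32, 16
pilot_pos, pilot_amp = (8, 8), 10.0 + 0.0j
true_paths = [(0, 0, 1.0 + 0.0j), (2, -1, 0.5j), (4, 3, -0.3 + 0.0j)]

# Noise-only frame, then add each path's shifted, scaled copy of the pilot.
y = 0.05 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
for dl, dm, a in true_paths:
    y[pilot_pos[0] + dl, pilot_pos[1] + dm] += a * pilot_amp

est = estimate_dd_channel(y, pilot_pos, pilot_amp, l_max=4, m_max=3, gamma=0.1)
for dl, dm, a in est:
    print(dl, dm, np.round(a, 2))
```

The threshold `gamma` trades missed weak paths against false detections from noise; in this sketch it is set well above the noise floor and below the weakest path gain.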

Definition: Superimposed Pilot Design

The superimposed pilot (CommIT contribution) places pilots and data at the same DD cells, with power split: $x[\ell, m] \;=\; \sqrt{1 - \rho_p}\, d[\ell, m] \,+\, \sqrt{\rho_p}\, p[\ell, m],$ where $d$ is the data symbol, $p$ is the pilot symbol, and $\rho_p \in [0, 1]$ is the pilot power fraction.

Advantages:

No guard region needed: pilots and data overlap.
Overhead $\leq 0.5\%$ (vs 1-3% for embedded pilot with guard).
Better spectral efficiency at high SNR.

Tradeoff: pilot-data interference must be handled by joint estimation-detection (similar to Chapter 12 ISAC). Higher compute but $2$ to $5\times$ lower pilot overhead.
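The power split in the definition above can be checked numerically. The sketch below assumes unit-power QPSK data and a unit-modulus random-phase pilot (both hypothetical choices; the source does not fix the constellation or pilot sequence) and confirms that the superimposed frame keeps roughly unit average power per cell.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, rho_p = 256, 16, 0.1

# Unit-power QPSK data and a unit-modulus random-phase pilot (assumed for illustration).
d = (rng.choice([-1.0, 1.0], (M, N)) + 1j * rng.choice([-1.0, 1.0], (M, N))) / np.sqrt(2)
p = np.exp(2j * np.pi * rng.random((M, N)))

# Superimposed frame: every DD cell carries both data and pilot.
x = np.sqrt(1 - rho_p) * d + np.sqrt(rho_p) * p

avg_power = np.mean(np.abs(x) ** 2)
print(f"average transmit power per cell: {avg_power:.3f}")
```

Because data and pilot are independent, the cross term averages out over the frame, so the mean power stays close to $(1 - \rho_p) + \rho_p = 1$.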


Theorem: Superimposed Pilot Estimation Error

The MSE of the superimposed-pilot channel estimate is $\mathrm{MSE}(\hat{h}^{(l,k)}) \;\approx\; \frac{\sigma_w^2}{MN \rho_p} \cdot \frac{1}{1 + (1-\rho_p) \text{SNR}/P_{l,k}},$ compared to embedded-pilot MSE $\mathrm{MSE}_{\mathrm{emb}} \;\approx\; \frac{\sigma_w^2}{MN}.$ Optimal pilot power: $\rho_p^* \approx \sqrt{1/\text{SNR}}$ for high SNR. At $\text{SNR} = 20$ dB: $\rho_p^* \approx 0.1$, i.e., 10% of per-cell power goes to pilots and 90% to data.

Consequence: Superimposed pilots at optimal $\rho_p$ yield $\sim 10\%$ higher effective data rate than embedded pilots at realistic operating SNR. Across 95%-likely throughput: $\sim 5\%$ additional gain on top of the 30% cell-free advantage.

Superimposed pilots give the estimator continuous samples of the channel (not just at pilot cells), improving estimation accuracy per unit overhead. The data interference is removable by joint estimation-detection; the CommIT Chapter 7 contribution showed how to do this for cellular, and it extends naturally to cell-free. The 5% additional gain compounds over the 30% macro-diversity gain, bringing cell-free OTFS to $\sim 35\%$ total improvement.

Proof

Estimator

LS estimator: $\hat{h} = (1/MN)\sum_{\ell, m} y[\ell, m] p[\ell, m]^*$ . Under the superimposed model: $\hat{h} = \sqrt{\rho_p} h + \sqrt{(1-\rho_p)} (\text{noise + data-interference})$ .

MSE

$\mathrm{MSE} = \frac{1-\rho_p}{\rho_p} \cdot \frac{\text{SNR}^{-1}}{MN} + \frac{\sigma_w^2}{MN \rho_p}$.

Optimal $\rho_p$

Minimize MSE: $\rho_p^* = 1/\sqrt{\text{SNR}}$ at high SNR. At $\text{SNR} = 20$ dB: $\rho_p^* \approx 0.1$ .

Effective MSE

At $\rho_p^*$: MSE $\sim \frac{\sigma_w^2}{MN \sqrt{\text{SNR}}}$, compared to embedded-pilot MSE $\sim \sigma_w^2/(MN)$: a factor of $\sqrt{\text{SNR}}$ better. $\blacksquare$
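The LS correlation step of the proof can be tried on a toy model. The sketch below is an illustration under strong assumptions, not a reproduction of the theorem: it uses a flat single-tap channel `h` (hypothetical value), QPSK data, a unit-modulus random-phase pilot, and treats the data purely as interference, with no joint estimation-detection. The pilot fraction follows the high-SNR rule $\rho_p^* = 1/\sqrt{\text{SNR}}$ from the text.

```python
import numpy as np

rng = np.random.default_rng(3)
MN = 256 * 16
snr_db = 20.0
snr = 10 ** (snr_db / 10)
rho_p = 1 / np.sqrt(snr)             # high-SNR rule from the text: 0.1 at 20 dB

h = 0.8 * np.exp(1j * 0.6)           # hypothetical single-tap channel gain
sigma_w2 = 1.0 / snr                 # noise variance for unit transmit power

d = (rng.choice([-1.0, 1.0], MN) + 1j * rng.choice([-1.0, 1.0], MN)) / np.sqrt(2)
p = np.exp(2j * np.pi * rng.random(MN))
w = np.sqrt(sigma_w2 / 2) * (rng.standard_normal(MN) + 1j * rng.standard_normal(MN))

x = np.sqrt(1 - rho_p) * d + np.sqrt(rho_p) * p   # superimposed transmit symbols
y = h * x + w                                      # flat-channel receive model

h_raw = np.mean(y * np.conj(p))      # LS correlation: approximately sqrt(rho_p) * h
h_hat = h_raw / np.sqrt(rho_p)       # undo the pilot power scaling

print(f"rho_p = {rho_p:.3f}, |h_hat - h| = {abs(h_hat - h):.4f}")
```

The residual error here is dominated by the data term, which averages down only as $1/\sqrt{MN}$; that is the interference joint estimation-detection is meant to remove.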

Key Takeaway

Superimposed pilots gain twice. First: moving from embedded to superimposed pilots saves $\sim 2\%$ overhead. Second: joint estimation-detection gives better channel accuracy. Combined: $\sim 5\%$ throughput boost over embedded pilots alone. Across the cell-free architecture, this compounds to the $\sim 35\%$ gain reported in the CommIT contribution.

🎓 CommIT Contribution (2023)

Embedded and Superimposed Pilot Channel Estimation for Cell-Free OTFS

M. Mohammadi, H. Q. Ngo, M. Matthaiou, G. Caire — IEEE Trans. Wireless Communications

The CommIT contribution of Mohammadi-Ngo-Matthaiou-Caire is the first quantitative performance evaluation of OTFS in the cell-free architecture. Three key results:

Embedded pilot estimation at distributed APs: extends the Chapter 7 CommIT embedded-pilot framework to the cell-free setting. Each AP estimates its local DD channel independently; CPU aggregates.
Superimposed pilot design: reduces overhead to $\leq 0.5\%$ while maintaining estimation accuracy via joint estimation-detection.
35% throughput gain: at the 95%-likely per-user throughput under high mobility (100-300 km/h), cell-free OTFS beats cell-free OFDM by $\sim 35\%$ . This is the headline number for the architecture.

The paper is the quantitative anchor of this chapter. It validates the DD-domain advantage at network scale, extending the OTFS superiority from single-link (Chapters 9, 15) to large-scale multi-user deployments.


Example: Pilot Overhead Comparison

Cell-free OTFS deployment: $L = 50$ APs, $K = 100$ UEs, $M = 256$ , $N = 16$ . Compare pilot overhead for: (a) Classical pilot-based (separate pilot sequences per UE pair). (b) Embedded-pilot OTFS (with guard region). (c) Superimposed-pilot OTFS.

Solution

Classical

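Per-UE pilot: each UE needs a distinct sequence, e.g., $K$ orthogonal sequences of length $MN$. Pilot fraction: $K \cdot MN / (K \cdot MN \cdot T_{\text{frames}}) = 1/T_{\text{frames}}$, i.e., one full pilot frame out of every $T_{\text{frames}}$ frames. Overhead: $\sim 10\%$ of frames dedicated to pilots (corresponding to $T_{\text{frames}} \approx 10$).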

Embedded-pilot

Per-UE: single pilot plus guard. Guard size $G_\ell \cdot G_m$. For $P = 8$, $\ell_{\max} = 5$, $m_{\max} = 3$: with $G_\ell = \ell_{\max} = 5$ and $G_m = 2 m_{\max} = 6$, the guard occupies $5 \cdot 6 = 30$ cells. Overhead: $30 / (256 \cdot 16) \cdot 100 \approx 0.7\%$. With pilot reuse (~10 distinct pilots for 100 UEs): 0.7% per UE.

Superimposed

No dedicated pilot cells; 10% power to pilot at same cells as data. Overhead: 0% dedicated pilot cells. Effective overhead (rate loss from power split): $\ln(1/(1-\rho_p)) = \ln(10/9) \approx 0.105$, i.e., roughly $10\%$, at $\rho_p = 0.1$. Best for high-SNR deployments.

Total across K UEs

Classical: 10% × 100 UEs = 1000%, which is infeasible (it would require multiple pilot frames per normal frame). Embedded: 0.7% per UE (reusable), effectively 0.7% × ~10 pilot groups ≈ 7%. Superimposed: about 10% rate loss, independent of $K$. Winner at scale: embedded-pilot with pilot reuse.
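The three totals in this example can be reproduced in a few lines. The sketch uses only the numbers given in the example; the variable names are illustrative, and the superimposed figure uses the natural logarithm, which is what makes $\log(10/9)$ come out near 10%.

```python
import math

M, N = 256, 16
l_max, m_max = 5, 3
K, pilot_groups, rho_p = 100, 10, 0.1

guard_cells = l_max * (2 * m_max)                   # G_l * G_m = 5 * 6 = 30
embedded_per_ue = 100 * guard_cells / (M * N)       # percent of the frame
embedded_total = embedded_per_ue * pilot_groups     # ~10 pilot reuse groups
classical_total = 10.0 * K                          # 10% per UE, no reuse
superimposed = 100 * math.log(1 / (1 - rho_p))      # ln(10/9) as a percentage

print(f"classical    : {classical_total:.0f}%")                       # 1000%
print(f"embedded     : {embedded_per_ue:.2f}% per UE, "
      f"{embedded_total:.1f}% total")                                 # 0.73%, 7.3%
print(f"superimposed : {superimposed:.1f}% rate loss")                # 10.5%
```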

Interactive plot: Pilot Overhead vs Number of UEs

Pilot overhead fraction versus number of UEs for the classical, embedded-pilot, and superimposed-pilot schemes. Adjustable parameters (defaults): K = 100 UEs, L = 50 APs, SNR = 15 dB.

⚠️ Engineering Note

Pilot Design in Practice

Practical cell-free OTFS pilot design considerations:

Pilot contamination: even with pilot reuse, nearby UEs with same pilot degrade estimates. Mitigation: longer pilot sequences (Zadoff-Chu), multi-phase pilot training, and precoded pilots.
AP clustering: only APs in a UE's cluster need the pilot. Reduces per-AP pilot processing by a factor of $L/L_k$, where $L_k$ is the number of APs in UE $k$'s cluster.
Rate adaptation: when channel is well-estimated (high SNR), push to superimposed pilots; in low-SNR conditions, fall back to embedded with guard.
Fronthaul efficiency: each AP forwards only its own channel estimates to CPU (not raw received signals). Saves fronthaul by $N_a \cdot MN / P$ factor.

Deployed systems (2024-2028) use a hybrid scheme: embedded pilots for initial channel acquisition and superimposed pilots in steady state, with adaptive switching based on channel statistics.

Practical Constraints

• Embedded for cold start, superimposed for steady-state
• AP clustering reduces processing by $L/L_k$
• Fronthaul-efficient: forward estimates, not raw signals

Common Mistake: Mis-synchronized Pilots

Mistake:

Assuming all APs sample the pilot at the same time instant. Even 1 $\mu$s of time skew between APs destroys the Doppler-phase consistency needed for accurate channel estimation.

Correction:

Use IEEE 1588v2 PTP or GNSS-PPS for sub-microsecond synchronization across APs. Monitor sync quality via cross-AP timing beacons. For critical applications (V2X, industrial IoT), use atomic-clock-calibrated GNSS references ($< 50$ ns). Cell-free OTFS architectures depend on this sync quality; deployment engineers must treat it as non-negotiable.

Distributed DD-Domain Channel Estimation