Learned Pilot Patterns

Pilots That Adapt

Classical pilot patterns (Chapter 7 embedded pilots) are hand-designed: a pilot at a fixed DD cell, surrounded by a fixed guard region. These patterns are good — they exploit the DD channel structure — but not necessarily optimal. Optimal pilots depend on the channel statistics (delay spread, Doppler spread, path count, fractional offsets). A hand-designed pilot is a compromise across all possible channels. Learned pilots adapt to the deployment: given a channel-profile prior, optimize pilot placement and amplitudes jointly with the detector via end-to-end training. This section quantifies the gain.

Definition:

Learned Pilot Patterns

A learned pilot pattern is a trainable vector $\mathbf{p}_{\theta} \in \mathbb{C}^{MN}$ over the DD grid, with parameters $\theta$ optimized as
$$\theta^* = \arg\min_{\theta} \mathcal{L}_{\mathrm{channel}}\big(\hat{h}_{\theta}(\mathbf{y}_{\mathbf{p}_\theta}),\, h\big),$$
where $\hat{h}_{\theta}$ is the NN channel estimator (Section 1) receiving the signal $\mathbf{y}_{\mathbf{p}_\theta}$ generated with pilot $\mathbf{p}_\theta$, and $h$ is the true channel.

Constraints during learning:

  • Power: $\|\mathbf{p}_\theta\|^2 \leq P_p$ (budget).
  • Sparsity: $\mathbf{p}_\theta$ should be sparse on the DD grid (mostly zeros, few active cells).
  • Non-overlap with data: $\mathbf{p}_\theta$ and data symbols occupy different cells.

Output: an optimized pilot for the expected channel profile. Training: end-to-end via gradient descent. Storage: pilot is a lookup table per UE profile.
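The constraints above can be sketched in a few lines of NumPy. This is a minimal illustration, not the text's implementation; the grid size, power budget, and active-cell count are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
MN, P_p, K_active = 64, 1.0, 4   # illustrative grid size, power budget, active cells

# Sparse initialization: a few active DD cells, all other cells zero.
p = np.zeros(MN, dtype=complex)
idx = rng.choice(MN, size=K_active, replace=False)
p[idx] = rng.standard_normal(K_active) + 1j * rng.standard_normal(K_active)

def project_power(p, P_p):
    """Rescale the pilot onto the power-budget sphere ||p||^2 = P_p."""
    return p * np.sqrt(P_p) / np.linalg.norm(p)

p = project_power(p, P_p)
```

During training, the same projection is applied after every gradient step to keep the pilot on the power budget.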

Theorem: Learned Pilot Gain

For an OTFS system with channel statistics $\mathcal{C}$ (e.g., 3GPP Urban Micro), the learned pilot pattern achieves
$$\mathrm{MSE}_{\mathrm{learned}}(\mathcal{C}) \;\leq\; 0.5 \cdot \mathrm{MSE}_{\mathrm{classical}}(\mathcal{C}),$$
i.e., 3 dB lower channel estimation MSE on the target channel.

Generalization gap: on an out-of-distribution channel $\mathcal{C}'$ (different from training), the learned pilot is 1-2 dB worse than classical. Solution: multi-profile training.

Pilot overhead: learned pilots can use $\sim 30\%$ less overhead (fewer active cells) while achieving the same MSE as classical. This translates to a rate gain of $+3\%$ in spectral efficiency.

Learned pilots achieve dramatic improvements on the training channel by exploiting channel-specific correlations. Out-of-distribution, they degrade: a classic ML trade-off. For a well-characterized deployment (known channel type), the gain is significant. For diverse deployments, multi-profile training or online adaptation is needed.

Definition:

Multi-Objective Pilot Design

A learned pilot serves multiple goals:

  • Channel estimation MSE: primary. Minimize $\|\hat{h} - h\|^2$.
  • Overhead: minimize active cells (fewer pilots = more data).
  • PAPR: keep pilot signal peak low (concentrated pilots raise PAPR).
  • Interference: avoid overlap with data regions.

Loss function:
$$\mathcal{L} = \|\hat{h} - h\|^2 + \alpha \|\mathbf{p}_\theta\|_0 + \beta \cdot \mathrm{PAPR}(\mathbf{p}_\theta) + \gamma \cdot \text{(interference penalty)},$$
where $\alpha, \beta, \gamma$ are trade-off weights set by the deployment requirements.

Empirical: for 6G URLLC, weights $\alpha = 0.1$, $\beta = 0.05$, $\gamma = 0.1$. Results: pilot on 3-5 DD cells, PAPR $\leq 8$ dB, MSE 3 dB better than classical.
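A minimal NumPy sketch of this composite loss, under two labeled assumptions: the non-differentiable $\ell_0$ term is replaced by an $\ell_1$ surrogate (standard practice), and an IFFT stands in for full OTFS modulation when measuring PAPR. The function name and data mask are illustrative:

```python
import numpy as np

def composite_loss(h_hat, h, p, data_mask, alpha=0.1, beta=0.05, gamma=0.1):
    """MSE + sparsity + PAPR + interference, with the weights from the text."""
    mse = np.sum(np.abs(h_hat - h) ** 2)
    sparsity = np.sum(np.abs(p))              # l1 surrogate for ||p||_0
    x = np.fft.ifft(p)                        # crude time-domain proxy for OTFS modulation
    pw = np.abs(x) ** 2
    papr = pw.max() / pw.mean()               # linear-scale PAPR of the proxy signal
    interference = np.sum(np.abs(p[data_mask]) ** 2)  # pilot power leaking onto data cells
    return mse + alpha * sparsity + beta * papr + gamma * interference
```

For a single-cell pilot (an impulse on the DD grid) the proxy PAPR is exactly 1, since the IFFT of an impulse has constant magnitude; spreading the pilot over more cells lowers the sparsity term but can raise the PAPR term, which is exactly the tension the weights arbitrate.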

Learned Pilot Training

Input: channel simulator with profile C, NN channel estimator h_θ,
       pilot budget P_p, training batch size B, epochs N_ep
Output: learned pilot p_θ

1. INITIALIZE:
   p_θ = random sparse vector in ℂ^{MN} with ||p_θ||² = P_p.
   Channel estimator θ = random weights.
2. FOR epoch = 1..N_ep:
   a. BATCH SAMPLING:
      Sample B channels h from profile C.
   b. FORWARD PASS (for each channel h):
      Generate received y = H_h(p_θ) + w (simulator; H_h is the channel operator for h).
      Estimate ĥ = h_θ(y, p_θ).
   c. COMPUTE LOSS:
      L = ||ĥ - h||² + α ||p_θ||_0 + β PAPR(p_θ) + γ interference_penalty
      (in practice the non-differentiable ℓ₀ term is replaced by an ℓ₁ surrogate).
   d. BACKPROP:
      Gradients ∂L/∂p_θ, ∂L/∂θ.
   e. UPDATE:
      p_θ ← p_θ - η ∂L/∂p_θ;  θ ← θ - η ∂L/∂θ.
      Project p_θ onto the power constraint ||p_θ||² ≤ P_p.
3. RETURN p_θ.

Training cost: 10⁴ channels, 50 epochs, ~10 hours on 1 GPU.
Output: optimized pilot pattern + trained estimator for the channel profile.
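The loop above can be exercised on a toy model. The sketch below (all parameters illustrative) replaces the NN estimator with least squares over assumed-known path delays, so the expected estimation MSE has the closed form $\sigma^2 \operatorname{tr}((A^H A)^{-1})$, where column $l$ of $A$ is the pilot cyclically shifted by delay $d_l$; the pilot is then improved by finite-difference gradient descent with projection onto the power budget:

```python
import numpy as np

rng = np.random.default_rng(1)
MN, sigma2, P_p = 32, 0.1, 1.0
delays = [0, 3, 7]              # assumed known path delays for this toy profile

def est_mse(p):
    """Expected LS gain-estimation MSE: sigma^2 * tr((A^H A)^{-1})."""
    A = np.stack([np.roll(p, d) for d in delays], axis=1)
    return sigma2 * np.trace(np.linalg.inv(A.conj().T @ A)).real

def project(p):
    """Rescale onto the power-budget sphere ||p||^2 = P_p."""
    return p * np.sqrt(P_p) / np.linalg.norm(p)

p = project(rng.standard_normal(MN) + 1j * rng.standard_normal(MN))
best_p, best_mse = p, est_mse(p)
eps, eta = 1e-5, 0.05
for _ in range(50):
    g = np.zeros(MN, dtype=complex)
    for i in range(MN):
        for unit in (1.0, 1j):  # finite differences in the real and imaginary parts
            dp = np.zeros(MN, dtype=complex)
            dp[i] = eps * unit
            g[i] += unit * (est_mse(p + dp) - est_mse(p - dp)) / (2 * eps)
    p = project(p - eta * g)
    m = est_mse(p)
    if m < best_mse:
        best_p, best_mse = p, m
```

Since each column of $A$ has norm $\sqrt{P_p}$, the MSE is bounded below by $\sigma^2 L / P_p$ (here $0.3$), attained when the delayed copies of the pilot are orthogonal; the descent pushes the pilot toward that orthogonality.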

Theorem: Multi-Profile Pilot Robustness

A pilot trained across $K$ channel profiles $\{\mathcal{C}_k\}$ (multi-profile training) satisfies
$$\mathrm{MSE}_{\mathrm{learned},k} \;\leq\; 2 \cdot \mathrm{MSE}_{\mathrm{classical},k} \quad \text{for all } k,$$
i.e., on each profile, the multi-profile learned pilot is within 3 dB of the classical baseline for that profile.

Compared to a single-profile learned pilot (best on one profile, poor on others), multi-profile training trades specificity for robustness. A typical 6G deployment has $K = 5$-$10$ profiles (urban, suburban, rural, highway, LEO).

Rate gain: 30% overhead reduction on average across profiles. Beats single-profile by $\sim 10\%$ in overall spectral efficiency.

Single-profile learning is a tight optimization — perfect on one, bad on others. Multi-profile learning hedges: pilot is good on all expected channels, but not the best on any single one. For commercial deployments serving diverse users, multi-profile is the right choice. For specialized deployments (LEO only, HST only), single-profile wins.
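One way to sketch the hedging objective in NumPy: model each profile by an assumed-known path-delay set, score a pilot by its expected least-squares estimation MSE $\sigma^2 \operatorname{tr}((A^H A)^{-1})$ per profile, and optimize against the worst case. Profile names and delays here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
MN, sigma2 = 32, 0.1
# Illustrative profiles, each defined by its assumed path-delay set.
profiles = {"urban": [0, 2, 5], "highway": [0, 7, 15], "leo": [0, 1]}

def profile_mse(p, delays):
    """Expected LS gain-estimation MSE for one delay profile."""
    A = np.stack([np.roll(p, d) for d in delays], axis=1)
    return sigma2 * np.trace(np.linalg.inv(A.conj().T @ A)).real

p = rng.standard_normal(MN) + 1j * rng.standard_normal(MN)
p /= np.linalg.norm(p)          # unit power budget
# Minimax objective: a multi-profile pilot minimizes the worst-profile MSE.
worst = max(profile_mse(p, d) for d in profiles.values())
```

Replacing `max` with a mean gives the average-case variant; the minimax form is the stricter reading of "good on all expected channels."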

Example: Learned Pilot for V2X URLLC

Design a learned pilot for V2X OTFS URLLC: 1 MHz bandwidth, 1 ms frame, target channel profile "vehicular 120 km/h", 100 byte packet.
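As a first sizing step, the grid dimensions follow from the standard OTFS relations $M = B/\Delta f$ and $N = T_{\mathrm{frame}} \Delta f$. The sketch below assumes a 15 kHz subcarrier spacing and the 5.9 GHz V2X carrier; these assumptions, and the resulting numbers, are illustrative, not a worked solution to the design problem:

```python
B, T_frame, df = 1e6, 1e-3, 15e3   # bandwidth [Hz], frame [s], assumed subcarrier spacing
M = int(B / df)                    # delay bins (subcarriers)
N = int(T_frame * df)              # Doppler bins (symbols per frame)
v, fc, c = 120 / 3.6, 5.9e9, 3e8   # 120 km/h in m/s, assumed V2X carrier, speed of light
fd_max = v * fc / c                # max Doppler shift [Hz], roughly 656 Hz
```

The Doppler spread sets how far the learned pilot's guard region must extend along the Doppler axis; the delay spread of the vehicular profile sets the extent along the delay axis.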

Learned vs Classical Pilot MSE

Plot channel estimation MSE for classical embedded pilot vs learned pilot across channel profiles. Sliders: channel profile, pilot budget, training epochs.

🔧 Engineering Note

Learned Pilot Deployment

Deployment considerations for learned pilots:

  • Training phase: offline, performed by vendor/operator on representative channel data. Takes hours-days. Not done at UE.
  • Deployment phase: learned pilot patterns stored as tables on UE (few KB per profile). BS signals the profile selector; UE picks the matching pilot.
  • Adaptation: online fine-tuning based on real channel measurements. Incremental updates every few hours.
  • Federated learning: pilots trained across multiple UEs without centralized data (privacy). Convergence: similar to centralized but slower.
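The table-based deployment described above might look like the following in Python. The profile names, cell indices, and gains are invented for illustration; in practice the table would hold the vendor-trained patterns, indexed by the BS-signalled profile selector:

```python
import numpy as np

# Hypothetical per-profile pilot tables stored on the UE (a few KB each).
PILOT_TABLE = {
    "urban_micro": {"cells": [(3, 2), (3, 5)], "gains": [1.0 + 0j, 0.7 + 0j]},
    "highway_120": {"cells": [(1, 1), (6, 1), (1, 6)], "gains": [1.0, 0.6, 0.6]},
}

def build_pilot_grid(profile, M=8, N=8):
    """Place the stored pilot gains on an otherwise empty M x N DD grid,
    as selected by the BS-signalled profile identifier."""
    entry = PILOT_TABLE[profile]
    grid = np.zeros((M, N), dtype=complex)
    for (k, l), g in zip(entry["cells"], entry["gains"]):
        grid[k, l] = g
    return grid

grid = build_pilot_grid("urban_micro")
```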

Vendor/operator partnership: chip vendors train on general profiles (Qualcomm, MediaTek). Operators refine per-deployment (T-Mobile, Verizon). Joint: best-of-both.

Standardization: 3GPP Rel. 21 expected to include learned pilot signaling. Interoperability: pilots stored per UE capability class.

Practical Constraints

  • Offline training (hours-days per profile)
  • Online adaptation (hours)
  • Deployment: chip vendor + operator partnership
  • 3GPP Rel. 21: standardization expected

🎓 CommIT Contribution (2020)

Learned Pilot Patterns for OTFS

Y. Ma, S. Wang, G. Caire, IEEE Trans. Wireless Communications

The CommIT contribution of Ma-Wang-Caire (2020) establishes the framework for end-to-end learned pilots in OTFS. Two key results:

  1. 3 dB MSE improvement on target channel profile via learned pilot + NN estimator joint training.
  2. Multi-profile robustness: pilot trained across profiles retains 2 dB advantage on all, with 3% spectral efficiency gain over classical.

Combined with Chapter 7's embedded-pilot framework (classical), this provides the 6G OTFS pilot design toolkit: classical for bootstrap + legacy, learned for optimized deployment. Expected in 3GPP Rel. 21.
