Ferkans — Interactive Telecom Tutor

One Waveform, Two Estimates

The previous section treated positioning as a problem layered on top of communication: estimate the channel, then estimate the TOA, then estimate the position. This ordering leaves information on the table. The data symbol sequence $\mathbf{s}$ is itself modulated by the channel $\alpha_l$ and delayed by $\tau_l$ , so the received samples carry information about $\mathbf{s}$ and $\mathbf{p}$ simultaneously. Running two independent decoders (one for data, one for position) is the statistical equivalent of measuring the same thing twice and throwing away one measurement.

Joint detection and positioning exploits the coupling: a position estimate pins down the phase/amplitude structure of the composite channel, which tightens data detection; meanwhile, soft data decisions refine the timing estimate, which tightens positioning. The CommIT group has shown that in cell-free deployments this joint processing closes a 3-5 dB gap to the information-theoretic limit at moderate SNR.

Definition:
Joint Likelihood of Data and Position

Consider a user at position $\mathbf{p}$ transmitting a block of data symbols $\mathbf{s} = (s_1, \ldots, s_M)$ via a known pulse $g(t)$ :

$x(t) = \sum_{m=1}^{M} s_m g(t - m T_s)$

The baseband observation at AP $l$ is

$\mathbf{y}_l(t) = \alpha_l \hat{\mathbf{a}}(\phi_l) \sum_m s_m g(t - m T_s - \tau_l) + \mathbf{w}_{l}(t)$

Stacking the sampled observations across APs into $\mathbf{y} = [\mathbf{y}_1^T, \ldots, \mathbf{y}_L^T]^T$ , the joint log-likelihood is

$\mathcal{L}(\mathbf{s}, \mathbf{p}) = -\frac{1}{\sigma^2} \sum_{l=1}^{L}\!\int\!\left\|\mathbf{y}_l(t) - \alpha_l(\mathbf{p}) \hat{\mathbf{a}}(\phi_l(\mathbf{p})) \sum_m s_m g(t - m T_s - \tau_l(\mathbf{p}))\right\|^2 \! dt + \text{const}$

where $\tau_l(\mathbf{p}) = \|\mathbf{p} - \mathbf{q}_l\|/c$ , $\phi_l(\mathbf{p})$ is the geometry-induced AOA, and $\alpha_l(\mathbf{p})$ absorbs the path-loss.

The appearance of $\mathbf{p}$ inside $\tau_l$ , $\phi_l$ , and $\alpha_l$ simultaneously is what makes joint detection and positioning a nontrivial optimization: the likelihood is nonconvex in $\mathbf{p}$ (the delay parameter enters via a highly oscillatory sinc function) and bilinear in $\mathbf{s}$ and the effective channel.
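The three geometric maps that carry $\mathbf{p}$ into the likelihood can be sketched as follows. This is a minimal illustration under assumptions not stated above: a 2D geometry, an AOA measured from the x-axis at the AP, and a free-space $1/d$ amplitude law for $\alpha_l$. The function name `geometry` and the `ref_amp` parameter are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def geometry(p, q, ref_amp=1.0):
    """Map a user position p and an AP position q (2D tuples, meters)
    to (delay, angle of arrival, path-loss amplitude).

    ASSUMPTIONS: the angle is measured from the x-axis at the AP, and the
    amplitude follows a free-space 1/d law scaled by ref_amp.
    """
    dx, dy = p[0] - q[0], p[1] - q[1]
    d = math.hypot(dx, dy)      # distance ||p - q||
    tau = d / C                 # tau_l(p) = ||p - q_l|| / c
    phi = math.atan2(dy, dx)    # geometry-induced AOA
    alpha = ref_amp / d         # amplitude absorbing the path loss
    return tau, phi, alpha

tau, phi, alpha = geometry((3.0, 4.0), (0.0, 0.0))
print(f"delay = {tau * 1e9:.2f} ns, AOA = {math.degrees(phi):.1f} deg, alpha = {alpha:.2f}")
```

All three outputs move together when $\mathbf{p}$ changes, which is the coupling the likelihood has to resolve.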

Definition:
Decoupled (Iterative) Detection-Positioning

A decoupled scheme alternates two independent subproblems:

Channel/data estimation step. Fix a current position estimate $\hat{\mathbf{p}}^{(k)}$ . Compute the predicted delays, angles, and path losses. Use the standard cell-free MMSE detector to extract the data symbols $\hat{\mathbf{s}}^{(k+1)}$ and channel amplitudes $\hat{\alpha}_l^{(k+1)}$ .
Position estimation step. Fix the data $\hat{\mathbf{s}}^{(k+1)}$ . Treat the detected symbols as a known pilot and estimate the delays $\hat{\tau}_l^{(k+1)}$ by matched filtering. Run multilateration on $\{\hat{\tau}_l^{(k+1)}\}$ to get $\hat{\mathbf{p}}^{(k+1)}$ .

Iterate until convergence. This scheme is simple, modular (both subproblems are standard cell-free operations), and inherits the convergence guarantees of block coordinate descent under mild conditions. Its weakness is that it does not exchange soft information between the two subproblems — hard data decisions propagate errors into the TOA estimator at low SNR.
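The matched-filter delay estimate in the position estimation step can be sketched in a few lines. This is a simplified illustration, assuming one sample per symbol, an integer-sample delay, and a single AP; the function name `matched_filter_delay` and all numeric values are hypothetical.

```python
import random

def matched_filter_delay(y, ref, max_delay):
    """Return the integer delay d that maximizes |sum_n y[n + d] * conj(ref[n])|."""
    best_d, best_val = 0, -1.0
    for d in range(max_delay + 1):
        acc = sum(y[n + d] * ref[n].conjugate() for n in range(len(ref)))
        if abs(acc) > best_val:
            best_d, best_val = d, abs(acc)
    return best_d

random.seed(0)
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
s_hat = [random.choice(qpsk) for _ in range(64)]  # detected symbols reused as a pilot
true_delay, max_delay = 7, 20

# Received samples: the delayed symbol block plus complex Gaussian noise.
y = [0j] * true_delay + s_hat + [0j] * (max_delay - true_delay)
y = [v + complex(random.gauss(0, 0.5), random.gauss(0, 0.5)) for v in y]

d_hat = matched_filter_delay(y, s_hat, max_delay)
print("estimated delay in samples:", d_hat)
```

Multiplying the sample delay by the sample period and by $c$ gives a range; sub-sample accuracy would need interpolation around the correlation peak. If `s_hat` contained symbol errors, the correlation peak would shrink, which is the error-propagation weakness described above.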

Definition:
Joint (EM) Detection-Positioning

A joint scheme based on expectation-maximization (EM) updates $\mathbf{p}$ while marginalizing over the data:

E-step. Compute soft posterior probabilities of each data symbol $q_m^{(k)}(s) = P(s_m = s \mid \mathbf{y}, \hat{\mathbf{p}}^{(k)})$ using a cell-free MMSE detector with channel parameters evaluated at $\hat{\mathbf{p}}^{(k)}$ .

M-step. Maximize the expected log-likelihood

$\hat{\mathbf{p}}^{(k+1)} = \arg\max_{\mathbf{p}}\, \mathbb{E}_{q^{(k)}}\!\left[\mathcal{L}(\mathbf{s}, \mathbf{p})\right]$

which, under Gaussian symbols or soft QAM, reduces to a weighted nonlinear least-squares fit of the predicted timings against the soft-demodulated signal.

Unlike the decoupled scheme, EM propagates soft data posteriors into the position estimator, which dramatically improves behavior near and below the communication decoding threshold.

EM increases the marginal likelihood $p(\mathbf{y}|\mathbf{p})$ monotonically and, under standard regularity conditions, converges to a stationary point of it (typically a local maximum, not necessarily the global one). For cell-free systems the M-step is nonconvex in $\mathbf{p}$, so multiple restarts or a coarse grid initialization (from the decoupled scheme's output) are standard.
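The E-step can be sketched for a single QPSK symbol as below. This is a minimal illustration, not the detector used in the text: it assumes a uniform symbol prior, flat per-AP channel coefficients with no inter-symbol interference, and a plain sum of squared residuals over APs. The names `symbol_posteriors`, `h`, and `sigma2` are hypothetical.

```python
import cmath
import math

# Unit-energy QPSK constellation.
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]

def symbol_posteriors(y, h, sigma2, constellation=QPSK):
    """Soft posteriors q(s) proportional to exp(-sum_l |y_l - h_l s|^2 / sigma2).

    y and h hold one complex observation and one effective channel
    coefficient per AP, with h evaluated at the current position estimate.
    """
    log_w = [-sum(abs(yl - hl * s) ** 2 for yl, hl in zip(y, h)) / sigma2
             for s in constellation]
    m = max(log_w)                        # subtract the max for numerical stability
    w = [math.exp(v - m) for v in log_w]
    z = sum(w)
    return [v / z for v in w]

h = [0.8 * cmath.exp(0.3j), 0.5 * cmath.exp(-1.1j)]  # two APs, hypothetical channels
s_true = QPSK[2]
y = [hl * s_true for hl in h]                         # noiseless for a deterministic demo
q = symbol_posteriors(y, h, sigma2=0.5)
print([round(v, 4) for v in q])
```

Even in this noiseless case the posterior keeps some mass on the neighboring symbols; that residual uncertainty is what the M-step receives instead of a hard decision.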


Theorem: Joint Scheme Outperforms Decoupled Scheme

Let $\text{MSE}_{\text{joint}}(\mathbf{p})$ and $\text{MSE}_{\text{dec}}(\mathbf{p})$ denote the position mean-squared error achieved by the joint (EM) scheme and the decoupled iterative scheme, respectively. Then for any SNR and any deployment,

$\text{MSE}_{\text{joint}}(\mathbf{p}) \leq \text{MSE}_{\text{dec}}(\mathbf{p})$

with strict inequality whenever the symbol-error probability at the communication SNR operating point is nonzero. In particular, as $\text{SNR} \to \infty$ both schemes converge to the joint CRB, but at practical SNRs (0-15 dB) the joint scheme can be 3-6 dB tighter.

The decoupled scheme commits to hard symbol decisions before using them as a pilot. Any symbol error feeds the TOA estimator with a wrong "pilot," producing a biased position estimate. The joint scheme defers symbol commitment and instead weights all possible symbol hypotheses by their likelihood, a fundamentally information-preserving operation that never makes things worse.

Proof

Data processing inequality for EM

The joint log-likelihood $\mathcal{L}(\mathbf{s}, \mathbf{p})$ has a marginalized version $\log p(\mathbf{y}|\mathbf{p}) = \log \sum_{\mathbf{s}} p(\mathbf{y},\mathbf{s}|\mathbf{p})$. EM produces a sequence of position estimates along which this marginal likelihood never decreases. The decoupled scheme instead maximizes a plug-in likelihood with $\mathbf{s}$ replaced by its hard estimate.

Jensen's inequality

For any distribution $q(\mathbf{s})$ , $\log \mathbb{E}_q[p(\mathbf{y}|\mathbf{s},\mathbf{p})] \geq \mathbb{E}_q[\log p(\mathbf{y}|\mathbf{s},\mathbf{p})]$ . Setting $q$ to a delta at the decoupled symbol estimate shows that the joint marginal likelihood is at least as large as the decoupled plug-in likelihood.

Asymptotic CRB match

At high SNR, symbol error goes to zero and both schemes coincide with the CRB. Both schemes are therefore asymptotically efficient; the joint scheme is strictly better at finite SNR. $\blacksquare$
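The Jensen step can be made explicit with the standard EM lower bound. The identity below is a textbook result restated in this section's notation, offered as a sketch rather than as part of the original proof. For any distribution $q(\mathbf{s})$ over symbol sequences,

$\log p(\mathbf{y}|\mathbf{p}) = \log \sum_{\mathbf{s}} q(\mathbf{s}) \frac{p(\mathbf{y},\mathbf{s}|\mathbf{p})}{q(\mathbf{s})} \geq \sum_{\mathbf{s}} q(\mathbf{s}) \log \frac{p(\mathbf{y},\mathbf{s}|\mathbf{p})}{q(\mathbf{s})}$

with equality when $q(\mathbf{s}) = p(\mathbf{s}|\mathbf{y},\mathbf{p})$. The E-step makes the bound tight at $\hat{\mathbf{p}}^{(k)}$ by choosing $q$ as the symbol posterior, and the M-step maximizes the right-hand side over $\mathbf{p}$, so the marginal log-likelihood cannot decrease from one iteration to the next.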


EM for Joint Detection and Positioning

Complexity: $\mathcal{O}(K_{\max} \cdot L \cdot M \cdot |\mathcal{S}|)$ per user

Input: Fronthaul observations $\mathbf{y}_1, \ldots, \mathbf{y}_L$ at CPU, AP positions $\{\mathbf{q}_l\}$, max iterations $K_{\max}$

Output: Position estimate $\hat{\mathbf{p}}$ and data $\hat{\mathbf{s}}$

1. Initialize $\hat{\mathbf{p}}^{(0)}$ by coarse TDOA multilateration on uplink pilot
2. for $k = 0, 1, \ldots, K_{\max} - 1$ do
3.     Compute delays $\tau_l^{(k)} = \|\hat{\mathbf{p}}^{(k)} - \mathbf{q}_l\|/c$
4.     Form effective channel $\hat{\alpha}_l^{(k)}$ from LS estimate
5.     E-step: Compute soft symbol posteriors $q_m^{(k)}(s) \propto \exp(-\|\mathbf{y}_l - \hat{h}^{(k)} s\|^2/\sigma^2)$
6.     M-step: Solve $\hat{\mathbf{p}}^{(k+1)} = \arg\max_{\mathbf{p}} \mathbb{E}_{q^{(k)}}[\mathcal{L}(\mathbf{s},\mathbf{p})]$ via nonlinear LS
7.     if $\|\hat{\mathbf{p}}^{(k+1)} - \hat{\mathbf{p}}^{(k)}\| < \epsilon$: break
8. end for
9. $\hat{\mathbf{s}} \leftarrow \arg\max_s q_m^{(K)}(s)$ for each $m$
10. return $\hat{\mathbf{p}}^{(K)}, \hat{\mathbf{s}}$

The M-step is the most expensive substep: it is a 2D nonlinear least-squares problem that can be solved by Gauss-Newton starting from $\hat{\mathbf{p}}^{(k)}$ . Warm-starting drives convergence in 2-3 Gauss-Newton iterations.
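A warm-started Gauss-Newton solve for a 2D position can be sketched as follows. This is a simplified stand-in for the M-step, not the full expected-likelihood maximization: it fits predicted ranges to given delay estimates, and the optional `weights` argument is a hypothetical placeholder for the soft-posterior weighting. The AP layout and all values are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gauss_newton_position(aps, delays, p0, iters=10, weights=None, tol=1e-9):
    """Minimize sum_l w_l * (c * tau_l - ||p - q_l||)^2 over a 2D position p."""
    x, y = p0
    w = weights or [1.0] * len(aps)
    for _ in range(iters):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (qx, qy), tau, wl in zip(aps, delays, w):
            dx, dy = x - qx, y - qy
            d = math.hypot(dx, dy)
            r = C * tau - d            # range residual
            jx, jy = dx / d, dy / d    # gradient of the predicted range w.r.t. p
            a11 += wl * jx * jx
            a12 += wl * jx * jy
            a22 += wl * jy * jy
            b1 += wl * jx * r
            b2 += wl * jy * r
        det = a11 * a22 - a12 * a12    # 2x2 normal equations solved in closed form
        sx = (a22 * b1 - a12 * b2) / det
        sy = (a11 * b2 - a12 * b1) / det
        x, y = x + sx, y + sy
        if math.hypot(sx, sy) < tol:
            break
    return x, y

aps = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
p_true = (30.0, 60.0)
delays = [math.hypot(p_true[0] - qx, p_true[1] - qy) / C for qx, qy in aps]
p_hat = gauss_newton_position(aps, delays, p0=(35.0, 55.0))  # warm start near the truth
print(f"estimate: ({p_hat[0]:.4f}, {p_hat[1]:.4f})")
```

Because the problem has only two unknowns, the normal equations are a 2x2 system and each iteration costs one pass over the APs.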

Joint vs. Decoupled: Position RMSE vs SNR

Compare the position root-mean-square error of the joint (EM-based) and decoupled (alternating) schemes as a function of receive SNR. The decoupled scheme shows a high-error plateau at low SNR, caused by symbol errors feeding back into the timing estimator. The joint scheme degrades smoothly and tracks the CRB over a wider SNR range. At high SNR both converge to the bound.

Parameters

Number of APs: 12

Modulation

Bandwidth (MHz): 100

Example: Two-AP Toy Example: When Does Joint Help?

Consider two APs and a single-symbol QPSK transmission. At SNR 0 dB (taken as $E_b/N_0$) the symbol error probability is roughly $0.15$. If the decoupled scheme uses the hard-detected symbol for ranging, the TOA estimator sees a 15% chance of a completely wrong reference. Quantify how much this inflates the position MSE over the joint scheme.

Solution

Decoupled MSE

Conditioned on a correct symbol (prob $0.85$ ), the range MSE equals the CRB. Conditioned on a wrong symbol (prob $0.15$ ), the range estimate acquires an ambiguity on the order of the symbol period $T_s$ . Over QPSK, a wrong symbol is often 90 degrees off, and the resulting range bias is on the order of $c \cdot T_s$ : about 3 m for $T_s = 10$ ns (100 MHz bandwidth), so easily 3-5 meters.

Joint MSE

The joint scheme weights each of the 4 QPSK hypotheses by its likelihood and sums. Because the true symbol usually carries the largest posterior weight, its contribution dominates, and the effective MSE is close to the correct-symbol MSE. Specifically, $\text{MSE}_{\text{joint}} \approx (1 - P_e^2) \cdot \text{CRB}$ , essentially the CRB.

Gain

At $P_e = 0.15$ , the decoupled MSE is roughly $0.85 \cdot \text{CRB} + 0.15 \cdot (3-5\,\text{m})^2$ , dominated by the bias term. The joint gain can be 10-20 dB in position MSE at the same communication SNR.
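The arithmetic of this example can be checked numerically. The sketch below rests on assumptions that are not in the example itself: SNR is read as $E_b/N_0$, QPSK is Gray-coded over AWGN, the joint MSE is taken equal to the CRB, and the CRB corresponds to a 0.3 m range standard deviation (a hypothetical value chosen only for illustration).

```python
import math

def qpsk_symbol_error(ebn0_db):
    """Gray-coded QPSK over AWGN: bit error Q(sqrt(2 Eb/N0)) on each of two independent bits."""
    ebn0 = 10 ** (ebn0_db / 10)
    p_bit = 0.5 * math.erfc(math.sqrt(ebn0))  # Q(sqrt(2x)) = 0.5 * erfc(sqrt(x))
    return 1 - (1 - p_bit) ** 2

def decoupled_mse(p_e, crb, bias):
    """Mixture MSE: CRB when the symbol is right, squared bias when it is wrong."""
    return (1 - p_e) * crb + p_e * bias ** 2

p_e = qpsk_symbol_error(0.0)
crb = 0.3 ** 2  # ASSUMPTION: 0.3 m range standard deviation, in m^2

gains_db = {}
for bias in (3.0, 5.0):  # the 3-5 m bias range from the example
    mse_dec = decoupled_mse(p_e, crb, bias)
    gains_db[bias] = 10 * math.log10(mse_dec / crb)  # joint MSE approximated by the CRB
    print(f"bias {bias} m: decoupled MSE = {mse_dec:.2f} m^2, gain = {gains_db[bias]:.1f} dB")
```

The gain depends strongly on the assumed CRB: a tighter bound makes the bias term dominate even more and pushes the gain higher.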

Why This Matters: From Joint Processing to ISAC as a System Paradigm

Joint detection and positioning is a microcosm of the broader Integrated Sensing and Communication (ISAC) philosophy: the same waveform, hardware, and spectrum must deliver both data and measurement simultaneously. For individual users, joint detection-positioning closes a 3-5 dB gap. For the system, ISAC reallocates the degrees of freedom between sensing and communication based on traffic demand — an adaptive tradeoff explored in §CRB and the Rate-PEB Tradeoff. For the network operator, it means a single cell-free infrastructure serves connectivity, positioning, and environment sensing without separate deployments.

Common Mistake: EM Can Converge to the Wrong Position

Mistake:

EM is advertised as a monotonic likelihood climber, so one assumes a single run is enough to recover the user position.

Correction:

The joint likelihood is nonconvex in $\mathbf{p}$ : the delay-phase oscillations from the $\sin(2\pi f_c \tau_l)$ factor create many local maxima spaced at multiples of the carrier period. EM converges to the nearest local maximum, which may be tens of centimeters or several meters from the true position depending on geometry and the initial guess. Always initialize EM with a coarse multilateration solution (from the decoupled scheme), and check the final log-likelihood against a small number of alternate random starts.
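The effect of initialization can be illustrated on a toy 1D objective. This is not the joint likelihood: it is a hypothetical stand-in (a smooth envelope times a fast oscillation) with a simple grid hill-climber in place of EM, meant only to show why comparing several starts is worthwhile.

```python
import math

def toy_objective(x, x_true=2.0, width=1.0, period=0.5):
    """Toy 1D stand-in for the likelihood: Gaussian envelope times a cosine."""
    envelope = math.exp(-((x - x_true) ** 2) / (2 * width ** 2))
    return envelope * math.cos(2 * math.pi * (x - x_true) / period)

def hill_climb(f, x, step=1e-3):
    """Greedy local ascent on a grid; stops at the nearest local maximum."""
    while True:
        if f(x + step) > f(x):
            x += step
        elif f(x - step) > f(x):
            x -= step
        else:
            return x

def multi_start(f, starts):
    """Run the local climber from every start and keep the end point with the best objective."""
    return max((hill_climb(f, x0) for x0 in starts), key=f)

single = hill_climb(toy_objective, 3.1)
best = multi_start(toy_objective, [0.6, 1.3, 2.1, 3.1, 3.6])
print(f"single start: {single:.3f}, multi-start: {best:.3f}")
```

The single run from 3.1 stalls on a side lobe about two oscillation periods from the true value, while the multi-start comparison selects the main peak because its objective value is higher.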

Historical Note: Joint Estimation: From Radar to Cellular

1968-2023

Joint parameter estimation has deep roots in radar: the matched filter of Van Trees (1968) and Kay (1993) simultaneously estimates target range and velocity from a single return pulse by maximizing a 2D ambiguity function. The same principle applied to bistatic and passive radar yielded joint position-velocity estimation in the 1980s.

The cellular community was slower to adopt joint processing — partly because early cellular systems had neither the bandwidth nor the array gain to make positioning meaningful. The turning point came in the 2010s with mmWave and massive MIMO: suddenly there was enough bandwidth for sub-meter ranging and enough spatial resolution for sub-degree AOA. By the mid-2020s, the joint estimation framework from radar had been fully imported into the cell-free massive MIMO literature (Liu and Caire 2023), and ISAC was declared a pillar of 6G.


Quick Check

Why is EM initialized with a coarse decoupled multilateration solution rather than with a random position?

Because the EM iteration has no likelihood-climbing guarantee.

Because the joint likelihood has many local maxima, and EM converges to the nearest one.

Because it halves the computational cost per EM iteration.

Because 3GPP specifications require a coarse estimate first.