Joint Detection and Positioning
One Waveform, Two Estimates
The previous section treated positioning as a problem layered on top of communication: estimate the channel, then estimate the TOA, then estimate the position. This ordering leaves information on the table. The data symbol sequence is itself modulated by the channel and delayed by the position-dependent time of arrival, so the received samples carry information about the data and the position simultaneously. Running two independent decoders (one for data, one for position) is the statistical equivalent of measuring the same thing twice and throwing away one measurement.
Joint detection and positioning exploits the coupling: a position estimate pins down the phase/amplitude structure of the composite channel, which tightens data detection; meanwhile, soft data decisions refine the timing estimate, which tightens positioning. The CommIT group has shown that in cell-free deployments this joint processing closes a 3-5 dB gap to the information-theoretic limit at moderate SNR.
Definition: Joint Likelihood of Data and Position
Consider a user at position transmitting a block of data symbols via a known pulse :
The baseband observation at AP is
Stacking the sampled observations across APs into , the joint log-likelihood is
where , is the geometry-induced AOA, and absorbs the path-loss.
The appearance of the position inside the delay, the AOA, and the path-loss term simultaneously is what makes joint detection and positioning a nontrivial optimization: the likelihood is nonconvex in the position (the delay parameter enters via a highly oscillatory sinc function) and bilinear in the data symbols and the effective channel.
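The nonconvexity in the position can be seen numerically. The following is a minimal sketch under simplifying assumptions that are not part of the definition above: a one-dimensional geometry, a single QPSK symbol, and a phase-only narrowband model in which each AP observes the symbol rotated by the carrier phase of its delay. The names `FC`, `AP_POS`, `predicted`, and `joint_loglik` are hypothetical.

```python
import cmath
import math

C = 3e8                          # speed of light [m/s]
FC = 3e9                         # assumed carrier frequency [Hz], wavelength 0.1 m
AP_POS = [0.0, 50.0, 120.0]      # hypothetical AP coordinates on a line [m]
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]

def predicted(x, d):
    """Noise-free sample at each AP for position x and symbol d (phase-only toy model)."""
    return [d * cmath.exp(-2j * math.pi * FC * abs(x - a) / C) for a in AP_POS]

def joint_loglik(y, x, d, noise_var=1.0):
    """Gaussian joint log-likelihood of (x, d), up to an additive constant."""
    return -sum(abs(yl - pl) ** 2 for yl, pl in zip(y, predicted(x, d))) / noise_var

def count_local_maxima(values):
    """Number of interior samples that exceed both neighbours."""
    return sum(1 for i in range(1, len(values) - 1)
               if values[i] > values[i - 1] and values[i] > values[i + 1])

x_true, d_true = 20.0, QPSK[0]
y = predicted(x_true, d_true)                     # noise-free observation, for clarity
grid = [19.5 + 0.001 * i for i in range(1001)]    # 1 m window in 1 mm steps
profile = [joint_loglik(y, x, d_true) for x in grid]
n_peaks = count_local_maxima(profile)
print("local maxima in a 1 m window:", n_peaks)
```

Even with the correct symbol and no noise, the likelihood profile over a one-meter window has several peaks in this toy model, which is why a local optimizer needs a good starting point.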
Definition: Decoupled (Iterative) Detection-Positioning
A decoupled scheme alternates two independent subproblems:
- Channel/data estimation step. Fix the current position estimate. Compute the predicted delays, angles, and path losses. Use the standard cell-free MMSE detector to extract the data symbols and channel amplitudes.
- Position estimation step. Fix the detected data. Treat the detected symbols as a known pilot and estimate the delays by matched filtering. Run multilateration on the delay estimates to get an updated position estimate.
Iterate until convergence. This scheme is simple, modular (both subproblems are standard cell-free operations), and inherits the convergence guarantees of block coordinate descent under mild conditions. Its weakness is that it does not exchange soft information between the two subproblems: hard data decisions propagate errors into the TOA estimator at low SNR.
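The alternation described above can be sketched on a toy problem. The sketch assumes a one-dimensional geometry in normalized units (propagation speed 1), a single QPSK symbol, a real Gaussian pulse, and a grid search standing in for multilateration; a plain matched filter replaces the cell-free MMSE detector. All names (`AP_POS`, `template`, `detect_symbol`, `estimate_position`, `decoupled`) are hypothetical.

```python
import cmath
import math
import random

AP_POS = [0.0, 10.0, 25.0]                        # hypothetical AP coordinates (normalized units)
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
T_GRID = [-5.0 + 0.25 * n for n in range(160)]    # sample instants; delay equals distance (c = 1)

def pulse(t):
    """Assumed known real pulse: unit-width Gaussian."""
    return math.exp(-0.5 * t * t)

def template(x):
    """Stacked noise-free waveform at all APs for a unit symbol sent from position x."""
    return [pulse(t - abs(x - a)) for a in AP_POS for t in T_GRID]

def correlate(s, y):
    return sum(si * yi for si, yi in zip(s, y))   # s is real, so no conjugate is needed

def detect_symbol(y, x):
    """Hard decision: matched-filter at the assumed position, pick the closest QPSK point."""
    z = correlate(template(x), y)
    return max(QPSK, key=lambda d: (d.conjugate() * z).real)

def estimate_position(y, d_hat, grid):
    """Treat d_hat as a pilot and pick the grid position with the largest matched-filter output."""
    return max(grid, key=lambda x: (d_hat.conjugate() * correlate(template(x), y)).real)

def decoupled(y, x0, grid, iters=5):
    x = x0
    for _ in range(iters):
        d_hat = detect_symbol(y, x)               # step 1: data given position
        x_new = estimate_position(y, d_hat, grid) # step 2: position given hard data
        if x_new == x:
            break
        x = x_new
    return x, d_hat

rng = random.Random(1)
x_true, d_true, noise_std = 7.0, QPSK[2], 0.05
y = [d_true * s + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
     for s in template(x_true)]
grid = [0.1 * i for i in range(201)]
x_hat, d_hat = decoupled(y, x0=9.0, grid=grid)
print(x_hat, d_hat == d_true)
```

Note that `estimate_position` receives only the hard decision `d_hat`; if that decision is wrong, nothing in the position step can detect it, which is the weakness noted above.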
Definition: Joint (EM) Detection-Positioning
A joint scheme based on expectation-maximization (EM) updates the position estimate while marginalizing over the data:
E-step. Compute soft posterior probabilities of each data symbol using a cell-free MMSE detector with channel parameters evaluated at the current position estimate.
M-step. Maximize the expected log-likelihood
which, under Gaussian symbols or soft QAM, reduces to a weighted nonlinear least-squares fit of the predicted timings against the soft-demodulated signal.
Unlike the decoupled scheme, EM propagates soft data posteriors into the position estimator, which dramatically improves behavior near and below the communication decoding threshold.
EM increases the likelihood monotonically and, under standard regularity conditions, converges to a stationary point (typically a local maximum) of the likelihood. For cell-free systems the M-step is nonconvex in the position, so multiple restarts or a coarse grid initialization (from the decoupled scheme's output) are standard.
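The E-step and M-step can be illustrated on a toy problem. This is a sketch under stated assumptions rather than the scheme defined above: one-dimensional geometry in normalized units, a single QPSK symbol with a uniform prior, a real Gaussian pulse, exact enumeration of the four hypotheses in place of a cell-free MMSE detector, and a grid search in place of a Gauss-Newton M-step. All function and variable names are hypothetical.

```python
import cmath
import math
import random

AP_POS = [0.0, 10.0, 25.0]                        # hypothetical AP coordinates (normalized units)
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
T_GRID = [-5.0 + 0.25 * n for n in range(160)]    # sample instants; delay equals distance (c = 1)

def template(x):
    """Stacked noise-free waveform at all APs (assumed unit-width Gaussian pulse)."""
    return [math.exp(-0.5 * (t - abs(x - a)) ** 2) for a in AP_POS for t in T_GRID]

def correlate(s, y):
    return sum(si * yi for si, yi in zip(s, y))   # s is real, so no conjugate is needed

def e_step(y, x, noise_var):
    """Posterior probability of each QPSK hypothesis given the current position."""
    s = template(x)
    log_w = [-sum(abs(yi - d * si) ** 2 for yi, si in zip(y, s)) / noise_var for d in QPSK]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]          # subtract the max to avoid underflow
    total = sum(w)
    return [v / total for v in w]

def m_step(y, post, grid):
    """Maximize the expected log-likelihood over the position grid."""
    d_soft = sum(p * d for p, d in zip(post, QPSK))   # posterior-mean (soft) symbol
    def expected_ll(x):
        s = template(x)
        # E_q[-|y - d s|^2] up to a constant, using |d| = 1 for QPSK
        return 2 * (d_soft.conjugate() * correlate(s, y)).real - correlate(s, s)
    return max(grid, key=expected_ll)

def em_position(y, x0, grid, noise_var, iters=10):
    x = x0
    for _ in range(iters):
        post = e_step(y, x, noise_var)
        x_new = m_step(y, post, grid)
        if x_new == x:
            break
        x = x_new
    return x, post

rng = random.Random(1)
x_true, true_idx, noise_std = 7.0, 2, 0.05
y = [QPSK[true_idx] * s + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
     for s in template(x_true)]
grid = [0.1 * i for i in range(201)]
x_hat, post = em_position(y, x0=9.0, grid=grid, noise_var=2 * noise_std ** 2)
print(x_hat, post)
```

The position update only ever sees the soft symbol `d_soft`, so an uncertain symbol shrinks its influence on the fit instead of injecting a confidently wrong pilot.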
Theorem: Joint Scheme Outperforms Decoupled Scheme
Let $\mathrm{MSE}_{\text{joint}}$ and $\mathrm{MSE}_{\text{dec}}$ denote the position mean-squared error achieved by the joint (EM) scheme and the decoupled iterative scheme, respectively. Then for any SNR and any deployment,
$$\mathrm{MSE}_{\text{joint}} \le \mathrm{MSE}_{\text{dec}},$$
with strict inequality whenever the symbol-error probability at the communication SNR operating point is nonzero. In particular, as $\mathrm{SNR} \to \infty$ both schemes converge to the joint CRB, but at practical SNRs (0-15 dB) the joint scheme can be 3-6 dB tighter.
The decoupled scheme commits to hard symbol decisions before using them as a pilot. Any symbol error feeds the TOA estimator with a wrong "pilot," producing a biased position estimate. The joint scheme defers symbol commitment and instead weights all possible symbol hypotheses by their likelihood; this is a fundamental information-preserving operation that never makes things worse.
Data processing inequality for EM
The joint log-likelihood has a marginalized version . EM produces a sequence whose likelihood monotonically increases toward the marginal likelihood. The decoupled scheme maximizes a plug-in likelihood with replaced by its hard estimate.
Jensen's inequality
For any distribution , . Setting to a delta at the decoupled symbol estimate shows that the joint marginal likelihood is at least as large as the decoupled plug-in likelihood.
Asymptotic CRB match
At high SNR, symbol error goes to zero and both schemes coincide with the CRB. Both schemes are therefore asymptotically efficient; the joint scheme is strictly better at finite SNR.
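The Jensen step of the argument can be checked numerically. In the sketch below, `joint` holds hypothetical values of the joint likelihood for four symbol hypotheses at a fixed position (the numbers are made up for illustration), and `lower_bound` evaluates the standard variational bound for a chosen distribution `q` over hypotheses.

```python
import math

def log_marginal(joint):
    """log of the sum over hypotheses of p(y, d | position): the marginal log-likelihood."""
    return math.log(sum(joint))

def lower_bound(joint, q):
    """Jensen lower bound sum_d q(d) * log(p(y, d) / q(d)); terms with q(d) = 0 contribute 0."""
    return sum(qd * math.log(pd / qd) for pd, qd in zip(joint, q) if qd > 0)

joint = [0.020, 0.006, 0.003, 0.001]     # hypothetical p(y, d_k | position), 4 QPSK hypotheses
total = sum(joint)
posterior = [pd / total for pd in joint]

hard = lower_bound(joint, [1.0, 0.0, 0.0, 0.0])   # delta at a hard decision: plug-in likelihood
uniform = lower_bound(joint, [0.25] * 4)          # an arbitrary other choice of q
soft = lower_bound(joint, posterior)              # q equal to the exact posterior
print(hard, uniform, soft, log_marginal(joint))
```

With a delta distribution the bound reduces to the plug-in log-likelihood, which sits below the marginal log-likelihood; with the exact posterior the bound is tight, which is the choice the E-step makes.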
EM for Joint Detection and Positioning
Complexity: per user
The M-step is the most expensive substep: it is a 2D nonlinear least-squares problem that can be solved by Gauss-Newton starting from the previous position estimate. Warm-starting drives convergence in 2-3 Gauss-Newton iterations.
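A warm-started Gauss-Newton iteration for a 2D weighted nonlinear least-squares fit might look like the sketch below. As a simplifying assumption, the residuals here are range measurements to each AP rather than the soft-demodulated signal used in the actual M-step, and the names `gauss_newton_position`, `aps`, `ranges`, and `weights` are hypothetical.

```python
import math

def gauss_newton_position(aps, ranges, p0, weights=None, iters=20, tol=1e-9):
    """Minimize sum_l w_l * (r_l - ||p - a_l||)^2 over the 2D position p, starting from p0."""
    x, y = p0
    w = weights or [1.0] * len(aps)
    for _ in range(iters):
        # Accumulate the 2x2 normal equations (J^T W J) step = J^T W residual.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (ax, ay), r_meas, wl in zip(aps, ranges, w):
            dx, dy = x - ax, y - ay
            dist = math.hypot(dx, dy)
            if dist == 0.0:
                continue                      # gradient undefined exactly at an AP
            jx, jy = dx / dist, dy / dist     # gradient of the predicted range
            res = r_meas - dist
            a11 += wl * jx * jx
            a12 += wl * jx * jy
            a22 += wl * jy * jy
            b1 += wl * jx * res
            b2 += wl * jy * res
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            break                             # degenerate geometry: stop updating
        sx = (a22 * b1 - a12 * b2) / det
        sy = (a11 * b2 - a12 * b1) / det
        x, y = x + sx, y + sy
        if math.hypot(sx, sy) < tol:
            break
    return x, y

aps = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]   # hypothetical AP layout [m]
p_true = (30.0, 40.0)
ranges = [math.hypot(p_true[0] - ax, p_true[1] - ay) for ax, ay in aps]
p_hat = gauss_newton_position(aps, ranges, p0=(35.0, 45.0))      # warm start near the truth
print(p_hat)
```

The `weights` argument is where soft information could enter: a less reliable measurement gets a smaller weight in the normal equations.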
Joint vs. Decoupled: Position RMSE vs SNR
Compare the position root-mean-square error of the joint (EM-based) and decoupled (alternating) schemes as a function of receive SNR. The decoupled scheme shows a high-error plateau at low SNR, caused by symbol errors feeding back into the timing estimator. The joint scheme degrades smoothly and tracks the CRB over a wider SNR range. At high SNR both converge to the bound.
Example: Two-AP Toy Example: When Does Joint Help?
Consider two APs and a single-symbol QPSK transmission. At SNR 0 dB the symbol error probability is roughly 0.15. If the decoupled scheme uses the hard-detected symbol for ranging, the TOA estimator sees a 15% chance of a completely wrong reference. Quantify how much this inflates the position MSE over the joint scheme.
Decoupled MSE
Conditioned on a correct symbol (prob ), the range MSE equals the CRB. Conditioned on a wrong symbol (prob ), the range estimate acquires a phase ambiguity equivalent to the symbol period . Over QPSK, a wrong symbol is often 90 degrees off, producing a bias roughly equal to at typical cellular frequencies, easily 3-5 meters.
Joint MSE
The joint scheme weights each of the 4 QPSK hypotheses by its likelihood and sums. Because the true symbol has the highest likelihood by construction, its contribution dominates, and the effective MSE is close to the correct-symbol MSE. Specifically, , essentially the CRB.
Gain
At , the decoupled MSE is roughly , dominated by the bias term. The joint gain can be 10-20 dB in position MSE at the same communication SNR.
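The arithmetic behind this comparison can be sketched as a two-component mixture. The symbol-error probability of 0.15 and a bias within the 3-5 meter range come from the example; the CRB-level standard deviation of 0.3 m is an assumed illustrative value, and the joint MSE is taken to equal the CRB as an approximation.

```python
import math

def decoupled_mse(crb_var, p_err, bias_m):
    """Mixture MSE: CRB when the symbol is right, CRB plus squared bias when it is wrong."""
    return (1 - p_err) * crb_var + p_err * (crb_var + bias_m ** 2)

def gain_db(mse_dec, mse_joint):
    return 10 * math.log10(mse_dec / mse_joint)

p_err = 0.15            # symbol-error probability from the example
bias_m = 4.0            # bias on a wrong symbol [m], inside the 3-5 m range quoted above
crb_var = 0.3 ** 2      # ASSUMED CRB-level variance [m^2], illustrative only

mse_dec = decoupled_mse(crb_var, p_err, bias_m)
mse_joint = crb_var     # approximation: joint scheme operates essentially at the CRB
print(mse_dec, gain_db(mse_dec, mse_joint))
```

With these assumed numbers the bias term dominates the decoupled MSE and the gain is about 14 dB, inside the 10-20 dB range stated above; a different assumed CRB shifts the figure accordingly.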
Why This Matters: From Joint Processing to ISAC as a System Paradigm
Joint detection and positioning is a microcosm of the broader Integrated Sensing and Communication (ISAC) philosophy: the same waveform, hardware, and spectrum must deliver both data and measurement simultaneously. For individual users, joint detection-positioning closes a 3-5 dB gap. For the system, ISAC reallocates the degrees of freedom between sensing and communication based on traffic demand, an adaptive tradeoff explored in §CRB and the Rate-PEB Tradeoff. For the network operator, it means a single cell-free infrastructure serves connectivity, positioning, and environment sensing without separate deployments.
Common Mistake: EM Can Converge to the Wrong Position
Mistake:
EM is advertised as a monotonic likelihood climber, so one assumes a single run is enough to recover the user position.
Correction:
The joint likelihood is nonconvex in the position: the delay-phase oscillations from the carrier-induced phase factor create many local maxima spaced at multiples of the carrier period. EM converges to the nearest local maximum, which may be tens of centimeters or several meters from the true position depending on geometry and the initial guess. Always initialize EM with a coarse multilateration solution (from the decoupled scheme), and check the final log-likelihood against a small number of alternate random starts.
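The recommended check can be sketched with a stand-in objective. The function `loglik` below is not the joint likelihood of this section; it is a hypothetical one-dimensional surrogate with an oscillatory term and a broad envelope, and `hill_climb` is a simple pattern search standing in for a full EM run. The alternate starts are fixed here for reproducibility; in practice they would be drawn at random.

```python
import math

X_TRUE = 12.0   # hypothetical true coordinate (normalized units, peak spacing 1)

def loglik(x):
    """Stand-in objective: oscillatory carrier-like term plus a broad envelope."""
    delta = x - X_TRUE
    return math.cos(2 * math.pi * delta) - 0.05 * delta ** 2

def hill_climb(f, x, step=0.1, min_step=1e-6):
    """Local ascent: move while a neighbour improves f, otherwise halve the step."""
    fx = f(x)
    while step > min_step:
        for cand in (x + step, x - step):
            fc = f(cand)
            if fc > fx:
                x, fx = cand, fc
                break
        else:
            step /= 2          # neither neighbour improves: refine the search
    return x, fx

def multi_start(f, starts):
    """Run the local climber from every start and keep the highest objective value."""
    return max((hill_climb(f, x0) for x0 in starts), key=lambda r: r[1])

coarse_init = 12.3                    # e.g. from a coarse multilateration solution
alternates = [9.2, 15.4, 13.1]        # alternate starts (random in practice)

x_single, f_single = hill_climb(loglik, 15.4)          # a poor start alone
x_best, f_best = multi_start(loglik, [coarse_init] + alternates)
print(x_single, f_single)
print(x_best, f_best)
```

A single run from a poor start settles on a nearby peak with a lower objective value; comparing final values across starts is what exposes it.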
Historical Note: Joint Estimation: From Radar to Cellular
1968--2023
Joint parameter estimation has deep roots in radar: the matched filter of Van Trees (1968) and Kay (1993) simultaneously estimates target range and velocity from a single return pulse by maximizing a 2D ambiguity function. The same principle applied to bistatic and passive radar yielded joint position-velocity estimation in the 1980s.
The cellular community was slower to adopt joint processing, partly because early cellular systems had neither the bandwidth nor the array gain to make positioning meaningful. The turning point came in the 2010s with mmWave and massive MIMO: suddenly there was enough bandwidth for sub-meter ranging and enough spatial resolution for sub-degree AOA. By the mid-2020s, the joint estimation framework from radar had been fully imported into the cell-free massive MIMO literature (Liu and Caire 2023), and ISAC was declared a pillar of 6G.
Quick Check
Why is EM initialized with a coarse decoupled multilateration solution rather than with a random position?
Because the EM iteration has no likelihood-climbing guarantee.
Because the joint likelihood has many local maxima, and EM converges to the nearest one.
Because it halves the computational cost per EM iteration.
Because 3GPP specifications require a coarse estimate first.
Correct. The oscillatory carrier-induced term creates a landscape of local maxima spaced by the carrier period. A coarse, low-variance multilateration initialization places EM in the basin of attraction of the global maximum.