Joint VR and Channel Estimation
Why Estimate Both at Once
Sections 18.1–18.4 have given us three separate components: a VR mask with a 2D Markov prior (18.2), a subarray processing pipeline (18.3), and a near-field sparse channel representation (18.4). Running them sequentially (first detect the VR, then estimate the channel) is tempting but wastes information. A sequential scheme uses the raw pilot correlator for VR detection and throws away the channel coefficients; a proper scheme feeds the channel estimate back into the VR detector, because a large coherent signal on an antenna is stronger evidence of visibility than a raw energy test. The principled machinery for this feedback is the EM algorithm, with the 2D Markov prior playing the role of a structured latent variable and the near-field dictionary supplying the observation model.
Definition: Joint Posterior of Mask and Channel
Stack all pilot observations into $\mathbf{Y}$. For user $k$, let $\mathbf{m}_k \in \{0,1\}^N$ denote the latent binary mask and $\mathbf{h}_k$ the near-field channel in polar coordinates. Under the Gaussian observation model and with the 2D Markov prior on $\mathbf{m}_k$ (Definition: 2D Markov Random Field Prior on the VR Mask), the joint posterior factors as
$$p(\mathbf{m}_k, \mathbf{h}_k \mid \mathbf{Y}) \propto p(\mathbf{Y} \mid \mathbf{m}_k, \mathbf{h}_k)\, p(\mathbf{h}_k)\, p(\mathbf{m}_k).$$
The Gaussian likelihood is
$$p(\mathbf{Y} \mid \mathbf{m}_k, \mathbf{h}_k) \propto \exp\!\Big(-\tfrac{1}{\sigma^2}\,\big\|\mathbf{Y} - (\mathbf{m}_k \odot \mathbf{h}_k)\,\mathbf{s}_k^{\mathsf{H}}\big\|_F^2\Big),$$
where $\mathbf{s}_k$ is the pilot sequence and $\odot$ applies the mask element-wise.
The log-joint is a sum of a discrete-MRF term in $\mathbf{m}_k$, a continuous quadratic term in $\mathbf{h}_k$, and a bilinear coupling between the two (quadratic in the pair). Maximizing jointly over $(\mathbf{m}_k, \mathbf{h}_k)$ is NP-hard in general, but alternating maximization (EM) converges to a good local optimum in a few iterations.
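The bilinear structure is easy to see numerically: for a fixed mask, the problem reduces to an ordinary least-squares fit in the channel, which is the workhorse inside the M-step. A minimal sketch with toy sizes; `Phi` is a stand-in for the near-field polar dictionary, and all dimensions and parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, G = 16, 8                       # antennas, dictionary atoms (toy sizes)
Phi = rng.standard_normal((N, G))  # stand-in for the near-field polar dictionary
h = rng.standard_normal(G)
m = (rng.random(N) > 0.5).astype(float)   # a random binary mask
sigma2 = 0.1
y = m * (Phi @ h) + np.sqrt(sigma2) * rng.standard_normal(N)

def log_likelihood(y, m, h):
    """Gaussian log-likelihood up to constants: mask applied element-wise."""
    r = y - m * (Phi @ h)
    return -0.5 * np.dot(r, r) / sigma2

# For a FIXED mask the problem is quadratic in h: minimizing
# ||y - (m[:,None]*Phi) @ h|| maximizes the likelihood above.
A = m[:, None] * Phi
h_ls = np.linalg.lstsq(A, y, rcond=None)[0]
```

Because `h_ls` is the exact maximizer for the given mask, its likelihood can never fall below that of any other channel vector, including the true one; this is the quadratic half of the alternation.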
Theorem: Monotone Ascent of EM for the Joint Problem
Let $q^{(t)}$ denote the variational distribution over the mask $\mathbf{m}$ at EM iteration $t$, and let $\hat{\mathbf{h}}^{(t)}$ denote the channel estimate. Define the evidence lower bound
$$\mathcal{L}(q, \mathbf{h}) = \mathbb{E}_q\big[\log p(\mathbf{Y}, \mathbf{m}, \mathbf{h})\big] + H(q).$$
The EM updates, namely the E-step $q^{(t+1)} = \arg\max_{q} \mathcal{L}(q, \hat{\mathbf{h}}^{(t)})$ subject to the mean-field factorization $q(\mathbf{m}) = \prod_n q_n(m_n)$, and the M-step $\hat{\mathbf{h}}^{(t+1)} = \arg\max_{\mathbf{h}} \mathcal{L}(q^{(t+1)}, \mathbf{h})$, produce a sequence $\mathcal{L}\big(q^{(t)}, \hat{\mathbf{h}}^{(t)}\big)$ that is monotonically non-decreasing.
Each step maximizes the ELBO with respect to a different argument, so the ELBO cannot decrease. Because the ELBO is bounded above by the maximized log-evidence $\max_{\mathbf{h}} \log p(\mathbf{Y}, \mathbf{h})$, the sequence converges. The limit is a stationary point, not necessarily the global maximum, but in practice the 2D Markov prior regularizes the landscape well and good initializations reach high-quality solutions within 3–5 iterations.
E-step non-decrease
By definition $q^{(t+1)} = \arg\max_{q} \mathcal{L}(q, \hat{\mathbf{h}}^{(t)})$, so $\mathcal{L}\big(q^{(t+1)}, \hat{\mathbf{h}}^{(t)}\big) \ge \mathcal{L}\big(q^{(t)}, \hat{\mathbf{h}}^{(t)}\big)$.
M-step non-decrease
Similarly $\hat{\mathbf{h}}^{(t+1)} = \arg\max_{\mathbf{h}} \mathcal{L}(q^{(t+1)}, \mathbf{h})$, so $\mathcal{L}\big(q^{(t+1)}, \hat{\mathbf{h}}^{(t+1)}\big) \ge \mathcal{L}\big(q^{(t+1)}, \hat{\mathbf{h}}^{(t)}\big)$.
Chain and bound
Chaining the two steps: $\mathcal{L}\big(q^{(t+1)}, \hat{\mathbf{h}}^{(t+1)}\big) \ge \mathcal{L}\big(q^{(t+1)}, \hat{\mathbf{h}}^{(t)}\big) \ge \mathcal{L}\big(q^{(t)}, \hat{\mathbf{h}}^{(t)}\big)$. Because $\mathcal{L}(q, \mathbf{h}) \le \max_{\mathbf{h}'} \log p(\mathbf{Y}, \mathbf{h}')$, the monotone sequence converges.
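The monotone-ascent guarantee can be checked numerically on a stripped-down instance of the model: a scalar channel, an i.i.d. Bernoulli mask with a flat prior (no MRF coupling, so the mean-field E-step is exact), and constants dropped from the ELBO. All sizes and parameters below are illustrative, not from the source:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50
m_true = (rng.random(N) < 0.5).astype(float)
h_true = 2.0
sigma2 = 1.0
y = m_true * h_true + np.sqrt(sigma2) * rng.standard_normal(N)

def elbo(q, h):
    """E_q[log p(y, m | h)] + H(q); flat Bernoulli prior, constants dropped."""
    ll = q * (-0.5 * (y - h) ** 2) + (1 - q) * (-0.5 * y ** 2)
    qc = np.clip(q, 1e-12, 1 - 1e-12)
    ent = -(qc * np.log(qc) + (1 - qc) * np.log(1 - qc))
    return np.sum(ll / sigma2 + ent)

q = np.full(N, 0.5)     # flat initial mask belief
h = 0.1                 # deliberately poor initial channel estimate
elbos = []
for _ in range(10):
    # E-step: exact per-element posterior (model factorizes given h).
    logit = (y * h - 0.5 * h ** 2) / sigma2
    q = 1.0 / (1.0 + np.exp(-logit))
    # M-step: weighted average maximizes the expected log-likelihood in h.
    h = np.sum(q * y) / np.sum(q)
    elbos.append(elbo(q, h))
```

Plotting or asserting on `elbos` shows the sequence never decreases, exactly as the theorem states.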
Joint VR + Channel Estimation via EM
Complexity: for $T$ EM outer iterations and $S$ BP sweeps per iteration, the per-user cost is $T$ BP-based E-steps plus $T$ sparse M-steps over the polar dictionary, well within the subarray complexity budget of Section 18.3.

Step 5 is the core CommIT contribution: loopy BP on the 2D Ising graph enforces spatial smoothness of the VR and acts as a structured regularizer on the mask. Without it, the mean-field update would amount to independent sigmoid thresholding per antenna, the same as sequential VR detection, and strictly worse than the joint scheme (Theorem: Monotone Ascent of EM for the Joint Problem still holds, but the operating point is worse).
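Step 5's spatial smoothing can be sketched with a mean-field surrogate for loopy BP on the 2D Ising grid; full BP passes messages rather than marginals, but the checkerboard update pattern and the role of the coupling are the same. Grid size, coupling strength `beta`, and signal levels are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
H, W = 8, 8            # toy antenna grid
beta = 0.8             # Ising coupling strength (assumed smoothness)
sigma2 = 0.5
signal = 1.5           # coherent per-antenna amplitude when visible

# Ground-truth mask: a contiguous block, a crude visibility region.
m_true = np.zeros((H, W))
m_true[2:6, 1:5] = 1.0
y = m_true * signal + np.sqrt(sigma2) * rng.standard_normal((H, W))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Per-antenna log-likelihood ratio: evidence of visibility from the signal.
llr = (y * signal - 0.5 * signal**2) / sigma2
q = sigmoid(llr)       # iteration-0 marginals: independent thresholding

parity_grid = np.add.outer(np.arange(H), np.arange(W)) % 2
for _ in range(10):    # checkerboard sweeps: half the grid updates at once
    for parity in (0, 1):
        s = 2.0 * q - 1.0                  # E[m] in the +/-1 spin convention
        nb = np.zeros_like(q)
        nb[1:, :] += s[:-1, :]
        nb[:-1, :] += s[1:, :]
        nb[:, 1:] += s[:, :-1]
        nb[:, :-1] += s[:, 1:]
        upd = parity_grid == parity
        q[upd] = sigmoid(llr[upd] + 2.0 * beta * nb[upd])

accuracy = ((q > 0.5) == m_true.astype(bool)).mean()
```

Isolated per-antenna errors from the raw `llr` get voted away by their neighbors, which is exactly the structured-regularizer role the text assigns to BP; without the coupling term, the loop reduces to the independent sigmoid thresholding described above.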
Joint EM vs Sequential Estimation: NMSE vs VR Mismatch
Compare four estimators across the operating SNR: (i) genie (true VR + MMSE), (ii) sequential (hard-threshold VR detector + MMSE on the detected support), (iii) Xu–Caire joint EM (this section), (iv) LS on the full aperture. At moderate SNR the joint EM lies within 1 dB of the genie; the sequential detector suffers a 4–6 dB penalty near the boundary of VR ambiguity.
Example: How Many EM Iterations Do We Need?
For a fixed panel size and operating SNR, how does the NMSE of the joint EM estimator evolve with the EM iteration index $t$, and what is a reasonable stopping rule?
Iteration 0 (polar-OMP init)
At $t = 0$ the mask is essentially uniform, and the M-step is ordinary polar-OMP with a very loose prior. The typical NMSE is poor, much worse than LS restricted to the true VR.
Iteration 1 (first BP pass)
The BP sweeps clean the mask using the channel estimate from iteration 0. NMSE drops sharply; this first pass delivers the bulk of the improvement.
Iterations 2β3
The mask and channel now mutually reinforce; NMSE falls essentially to the genie floor for this operating point.
Iterations 4β5
Diminishing returns; further NMSE improvement is negligible. The paper recommends a convergence-based stopping rule (a small threshold on the iteration-to-iteration change), which usually triggers within the 3–5 iterations quoted above.
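The stopping logic is generic and can be isolated from the estimator itself. A minimal driver, where `em_step` is a hypothetical callback standing in for one full E-step plus M-step, and the relative-change criterion is one reasonable choice since the text's exact rule is not reproduced here:

```python
def run_em(em_step, x0, tol=1e-2, max_iter=5):
    """Generic EM driver: stop when the relative change in the estimate
    drops below tol (an assumed criterion), or after max_iter iterations."""
    x = x0
    for t in range(1, max_iter + 1):
        x_new = em_step(x)
        done = abs(x) > 0 and abs(x_new - x) <= tol * abs(x)
        x = x_new
        if done:
            break
    return x, t

# Toy step whose error shrinks 10x per iteration, mimicking the sharp
# iteration-1 gain followed by diminishing returns.
target = 10.0
toy_step = lambda x: x + 0.9 * (target - x)
x_hat, iters = run_em(toy_step, 1.0)
```

On this toy trajectory the rule fires after three iterations, matching the qualitative "most of the gain is in the first pass" behavior of the example.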
Pilot Overhead in the XL-MIMO Regime
A subtle payoff of the joint estimator is that it shrinks the pilot overhead needed to achieve a target NMSE. With the Markov prior exploited, pilot lengths well below those suggested by orthogonal pilot allocation suffice at moderate SNR. The remaining non-orthogonality is absorbed by the MMSE / sparse step, which the 2D prior makes robust. In short: the 2D Markov prior partially substitutes for pilot orthogonality, a surprisingly aggressive form of pilot decontamination specific to XL-MIMO.
Why This Matters: Connection to Classical Pilot Contamination (Ch. 3)
Pilot contamination (Chapter 3) arises when two users in different cells share the same pilot; the estimator cannot separate them because their covariance subspaces overlap. In XL-MIMO, users in the same cell share the pilot resource too, but the VR structure hands the estimator an extra separator: two users with disjoint VRs can share the same pilot sequence, because the spatial evidence on different antennas distinguishes them. The joint EM of this section exploits that spatial separation automatically; the result is a form of spatial pilot decontamination that only works when visibility regions are sufficiently non-overlapping. The CommIT group's work on spatially correlated pilot decontamination from Chapter 3 is the sibling story in the stationary regime: same principle, different structural prior.
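The disjoint-VR separation can be illustrated in a few lines: two users transmit the same pilot simultaneously, and the masks alone pull the superimposed estimate apart. All names and sizes are illustrative, and the mask-aware correlation step below is a simple stand-in for the full joint EM:

```python
import numpy as np

rng = np.random.default_rng(2)
N, L = 32, 4                        # antennas, pilot length (toy sizes)
s = np.ones(L)                      # the SAME pilot sequence for both users
m1 = np.zeros(N); m1[:12] = 1.0     # user 1 visible on the first 12 antennas
m2 = np.zeros(N); m2[20:] = 1.0     # user 2 visible on the last 12: disjoint
h1 = rng.standard_normal(N)
h2 = rng.standard_normal(N)
sigma2 = 0.01

# Both users transmit the shared pilot at the same time.
Y = np.outer(m1 * h1 + m2 * h2, s) \
    + np.sqrt(sigma2) * rng.standard_normal((N, L))

# Pilot correlation yields the superposition of both channels...
g = Y @ s / (s @ s)
# ...and the disjoint masks separate the two users spatially.
h1_hat = m1 * g
h2_hat = m2 * g
nmse1 = np.sum((h1_hat - m1 * h1) ** 2) / np.sum((m1 * h1) ** 2)
```

Because no antenna sees both users, the cross term vanishes exactly; with overlapping VRs the masked estimates would instead inherit a contamination floor on the shared antennas.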
Common Mistake: Do Not Freeze the Mask Too Early
Mistake:
After the first BP pass, the marginals look well-separated, so round them to a hard 0/1 mask and finish with an ordinary LS on the hard support.
Correction:
Hard-thresholding between EM iterations destroys the soft evidence that the M-step needs to refine the channel. The penalty is largest at the boundary of the VR, where the marginals $q_n(m_n = 1)$ sit around 0.3–0.7, precisely the antennas where the joint information flow matters most. Keep the marginals soft throughout EM and hard-threshold only at the very end, and only if your downstream combiner requires a binary mask. The NMSE comparison above shows the 2–4 dB penalty of premature thresholding at moderate SNR.
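The boundary effect has a concrete failure mode that a short sketch makes visible: rounding ambiguous marginals to 0/1 can leave the M-step with fewer effective observations than unknowns, while soft weighting keeps those boundary rows in play. Everything here is a toy instance; the sizes, the marginal profile `q`, and the stand-in dictionary are assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
N, G = 20, 6                       # antennas, dictionary atoms (toy sizes)
Phi = rng.standard_normal((N, G))  # stand-in near-field dictionary
h = rng.standard_normal(G)
m = np.zeros(N); m[:10] = 1.0      # true VR: first 10 antennas
sigma2 = 0.01
y = m * (Phi @ h) + np.sqrt(sigma2) * rng.standard_normal(N)

# Soft marginals: confident interior, ambiguous VR boundary (q near 0.5).
q = np.full(N, 0.05)
q[:5] = 0.95
q[5:10] = 0.45

def m_step(w):
    """M-step maximizing E_q[log-likelihood]: weighted LS with sqrt(w)
    weights, since E_q[(y - m*Phi@h)^2] = q*(y - Phi@h)^2 + (1-q)*y^2."""
    A = np.sqrt(w)[:, None] * Phi
    return np.linalg.lstsq(A, np.sqrt(w) * y, rcond=None)[0], A

h_soft, A_soft = m_step(q)                        # keep marginals soft
h_hard, A_hard = m_step((q > 0.5).astype(float))  # premature 0/1 rounding
```

Rounding at $q = 0.45$ discards half the VR: the hard support keeps only 5 informative rows for 6 unknowns, so the M-step becomes rank-deficient, while the soft weights retain all boundary evidence at reduced weight.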
Deploying the Joint Estimator in Real Time
A production XL-MIMO deployment running the joint EM estimator of Algorithm: Joint VR + Channel Estimation via EM must balance three constraints: coherence block length (measured in symbols), per-user baseband budget on the embedded DSP, and fronthaul capacity. A practical blueprint:
- Outer loop. Run 3–5 EM iterations; more rarely improve NMSE measurably.
- Inner BP sweeps. Use a checkerboard schedule so the sweeps parallelize across half the grid at a time.
- Polar dictionary cached. The dictionary depends only on array geometry and wavelength, not on channel or user state, so it can be precomputed once and reused for months.
- Subarray fallback. If the per-user budget is tight, run the joint EM on the active subarrays only (Section 18.3), which cuts the M-step cost substantially without hurting NMSE noticeably.
- Graceful degradation. When SNR drops below the fallback threshold, fall back to polar-OMP with a flat mask prior; the MRF benefit vanishes under very noisy evidence, and the BP sweeps waste cycles.
Key knobs at a glance:
- Outer EM iterations
- Inner BP sweeps
- Per-user runtime budget on the embedded DSP
- Fallback SNR threshold: below it, revert to polar-OMP with a uniform mask
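The blueprint can be captured as a small configuration object with the graceful-degradation switch. The field names and default values below are placeholders, since the text's specific thresholds are not reproduced here:

```python
from dataclasses import dataclass

@dataclass
class JointEMConfig:
    """Deployment knobs from the blueprint (illustrative defaults only)."""
    max_outer_iters: int = 4            # outer EM loop, 3-5 in practice
    bp_sweeps: int = 8                  # checkerboard BP sweeps per E-step
    use_active_subarrays: bool = True   # Section 18.3 cost fallback
    snr_fallback_db: float = -5.0       # placeholder threshold (assumption)

def choose_estimator(cfg: JointEMConfig, snr_db: float) -> str:
    """Graceful degradation: the MRF benefit vanishes under noisy evidence."""
    if snr_db < cfg.snr_fallback_db:
        return "polar-omp-flat-prior"
    return "joint-em"
```

A scheduler would call `choose_estimator` once per coherence block, so the fallback costs nothing when SNR is healthy.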