Simultaneous Localisation and Mapping
Beyond Positioning: Mapping the Radio Environment
Classical positioning assumes a set of known anchors (base stations with known coordinates) and estimates the UE position relative to them. In many practical scenarios, however, anchor positions are uncertain, the number of available anchors is insufficient, or the environment itself is unknown. Moreover, rich multipath channels --- traditionally viewed as a nuisance for positioning --- actually carry information about the geometry of the propagation environment: each specular multipath component (MPC) can be associated with a reflecting or scattering surface.
Simultaneous Localisation and Mapping (SLAM) addresses both limitations simultaneously: it jointly estimates the UE trajectory and the positions of environmental features (reflectors, scatterers) that generate the observed multipath. Originally developed in robotics for visual/lidar SLAM, the radio-SLAM framework adapts these ideas to wireless channels, exploiting the geometric structure of multipath propagation.
The key conceptual shift is profound: multipath is not a bug but a feature. Each resolvable MPC acts as a "virtual anchor" whose position can be inferred and subsequently used for positioning, effectively turning the environment into a distributed antenna array.
Definition: Radio-SLAM (Simultaneous Localisation and Mapping)
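Radio-SLAM is the problem of jointly estimating, from radio channel measurements over time, the following quantities:
- User state at each time step $t$: $\mathbf{x}_t$, comprising position $\mathbf{p}_t$, speed $v_t$, heading $\theta_t$, and clock bias $b_t$.
- Map of the environment: a set of $K$ virtual anchors (VAs) or physical features at positions $\mathbf{m}_k$, $k = 1, \dots, K$. Each VA corresponds to the mirror image of a physical BS with respect to a reflecting surface.
The joint posterior distribution is:
$$p(\mathbf{x}_{0:T}, \mathbf{m}_{1:K} \mid \mathbf{z}_{1:T}) \propto p(\mathbf{x}_0) \prod_{k=1}^{K} p(\mathbf{m}_k) \prod_{t=1}^{T} \left[ p(\mathbf{x}_t \mid \mathbf{x}_{t-1}) \prod_{k=1}^{K} p(\mathbf{z}_{t,k} \mid \mathbf{x}_t, \mathbf{m}_k) \right]$$
where $\mathbf{z}_{t,k}$ is the measurement vector (delay, angle, Doppler) extracted from the $k$-th MPC at time $t$.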
The factored structure of this posterior naturally maps to a factor graph, enabling efficient approximate inference via belief propagation (BP) or particle-based methods.
The term "virtual anchor" arises from the mirror-image geometry of specular reflections. A signal from a BS at position $\mathbf{p}_{\mathrm{BS}}$ that reflects off a flat wall appears to originate from the mirror image of $\mathbf{p}_{\mathrm{BS}}$ with respect to the wall. The distance from the UE to the VA equals the total reflected path length (the measured delay multiplied by the speed of light), so the VA acts exactly like a real anchor for ranging purposes.
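The mirror-image construction can be checked numerically. The following is a minimal sketch, assuming a 2D scene with one infinite flat wall; the function names and the example coordinates are hypothetical and chosen only for illustration.

```python
import math

def mirror_across_wall(point, wall_point, wall_normal):
    """Reflect a 2D point across the line through wall_point with the given normal."""
    nx, ny = wall_normal
    norm = math.hypot(nx, ny)
    nx, ny = nx / norm, ny / norm
    # Signed distance from the point to the wall along the unit normal.
    d = (point[0] - wall_point[0]) * nx + (point[1] - wall_point[1]) * ny
    return (point[0] - 2.0 * d * nx, point[1] - 2.0 * d * ny)

def reflected_path_length(bs, ue, wall_point, wall_normal):
    """Length of the specular path BS -> wall -> UE, computed via the reflection point."""
    va = mirror_across_wall(bs, wall_point, wall_normal)
    nx, ny = wall_normal
    # The reflection point is where the straight line VA -> UE crosses the wall.
    num = (wall_point[0] - va[0]) * nx + (wall_point[1] - va[1]) * ny
    den = (ue[0] - va[0]) * nx + (ue[1] - va[1]) * ny
    s = num / den
    hit = (va[0] + s * (ue[0] - va[0]), va[1] + s * (ue[1] - va[1]))
    return math.dist(bs, hit) + math.dist(hit, ue)

# Hypothetical scene: wall along x = 10, BS and UE on the same side of it.
bs = (2.0, 3.0)
ue = (6.0, 8.0)
wall_point, wall_normal = (10.0, 0.0), (1.0, 0.0)

va = mirror_across_wall(bs, wall_point, wall_normal)
path = reflected_path_length(bs, ue, wall_point, wall_normal)
print("VA position:", va)
print("reflected path length:", path, "distance UE-VA:", math.dist(va, ue))
```

The two printed lengths agree, which is why a range measurement on the reflected path can be treated as a range to the VA.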
Factor Graph Formulation of Radio-SLAM
The joint posterior of radio-SLAM has a natural factor graph representation that reveals the conditional independence structure and enables scalable inference.
The factor graph contains:
- Variable nodes for each user state $\mathbf{x}_t$ and each VA position $\mathbf{m}_k$.
- Factor nodes connecting variables:
  - Transition factors $p(\mathbf{x}_t \mid \mathbf{x}_{t-1})$ encode the motion model (e.g., constant-velocity with process noise).
  - Measurement factors $p(\mathbf{z}_{t,k} \mid \mathbf{x}_t, \mathbf{m}_k)$ encode the likelihood of observing MPC $k$ at time $t$ given the user state and VA position.
  - Data association factors handle the unknown correspondence between detected MPCs and VAs (a combinatorial challenge analogous to multi-target tracking).
Belief propagation on this graph iteratively passes messages between variable and factor nodes. For the user state, the messages from all associated VAs are combined to form a position belief. For each VA, messages from all time steps refine its position estimate. The algorithm naturally handles:
- Appearing/disappearing features (VAs that become visible or occluded as the UE moves)
- Unknown number of features (via a Poisson or negative-binomial prior on the number of VAs $K$)
- Measurement origin uncertainty (which detected MPC corresponds to which VA)
The computational cost per time step grows with the number of particles used to represent the beliefs and with the number of detected MPCs.
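To illustrate how messages from several VAs combine into a position belief, the sketch below performs one measurement update on a particle representation of the UE position. It is a deliberately simplified illustration, not the full BP algorithm: it assumes range-only measurements, known anchor positions, known data association, no clock bias, and a fixed grid of particles. All names and numbers are hypothetical.

```python
import math

def range_likelihood(measured, predicted, sigma):
    """Unnormalised Gaussian measurement factor for one range observation."""
    return math.exp(-0.5 * ((measured - predicted) / sigma) ** 2)

def update_position_belief(particles, anchors, ranges, sigma):
    """Weight each position particle by the product of all anchor likelihoods."""
    weights = []
    for p in particles:
        w = 1.0
        for anchor, rho in zip(anchors, ranges):
            # Multiplying factors corresponds to combining the incoming messages.
            w *= range_likelihood(rho, math.dist(p, anchor), sigma)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def weighted_mean(particles, weights):
    """Point estimate: mean of the particle positions under the given weights."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return (x, y)

# Hypothetical scenario: three anchors (physical or virtual) with known positions.
anchors = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
true_ue = (4.0, 6.0)
ranges = [math.dist(true_ue, a) for a in anchors]  # noise-free for clarity

# Particles on a regular 0.25 m grid over a 10 m x 10 m area.
particles = [(0.25 * i, 0.25 * j) for i in range(41) for j in range(41)]
weights = update_position_belief(particles, anchors, ranges, sigma=0.3)
estimate = weighted_mean(particles, weights)
print("estimated UE position:", estimate)
```

The inner loop runs once per particle and per measurement, which is where the dependence of the per-step cost on both quantities comes from.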
Channel-SLAM: Every Multipath Component is a Virtual Anchor
Channel-SLAM (Gentner et al., 2016) takes the radio-SLAM idea to its logical extreme: it operates with a single physical base station and treats each resolvable MPC as an independent virtual anchor. The measurements from each MPC (delay and, where available, angle and Doppler shift) are used to estimate both the UE trajectory and the VA positions.
The key insight is that as the UE moves, each MPC traces a characteristic trajectory in the delay-angle-Doppler space that depends on the geometry of the reflection. By tracking these trajectories over time, Channel-SLAM can:
- Resolve the VA positions from the temporal evolution of the MPC parameters (similar to how a moving observer can triangulate a static target)
- Use the resolved VAs as anchors for subsequent position estimation, progressively refining both the map and the trajectory
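The first step, resolving a VA from the temporal evolution of its delay, can be sketched as a small least-squares problem. The example below is a simplification rather than the Channel-SLAM algorithm itself: it assumes the UE positions are already known, the clock bias is zero, and the delays are noise-free, whereas Channel-SLAM estimates trajectory and VAs jointly. The function name, trajectory, and initial guess are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def estimate_va(ue_positions, delays, init, iters=20):
    """Gauss-Newton fit of a static 2D VA position to delays seen along a known path."""
    mx, my = init
    for _ in range(iters):
        # Accumulate the 2x2 normal equations (J^T J) delta = J^T r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (px, py), tau in zip(ue_positions, delays):
            dx, dy = mx - px, my - py
            d = math.hypot(dx, dy)
            jx, jy = dx / d, dy / d   # gradient of the predicted range w.r.t. the VA
            res = C * tau - d         # measured range minus predicted range
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * res
            b2 += jy * res
        det = a11 * a22 - a12 * a12
        mx += (a22 * b1 - a12 * b2) / det
        my += (a11 * b2 - a12 * b1) / det
    return (mx, my)

# Hypothetical L-shaped UE trajectory; the turn avoids a collinear aperture.
ue_positions = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0), (6.0, 3.0), (6.0, 6.0)]
true_va = (12.0, 5.0)
delays = [math.dist(p, true_va) / C for p in ue_positions]

va_hat = estimate_va(ue_positions, delays, init=(8.0, 2.0))
print("estimated VA:", va_hat)
```

A straight-line trajectory would leave the VA ambiguous between the two sides of the line, which is why the example path includes a turn.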
This approach is particularly powerful at mmWave and sub-THz frequencies, where the large bandwidth resolves individual MPCs in delay, and the large antenna arrays resolve them in angle. In rich indoor environments with 5--20 resolvable MPCs, Channel-SLAM can achieve sub-metre accuracy with a single BS --- a capability impossible with conventional single-BS methods.
Paradigm Shift: Multipath as Feature, Not Foe
The radio-SLAM perspective fundamentally inverts the traditional view of multipath in positioning:
| Traditional View | Radio-SLAM View |
|---|---|
| Multipath causes NLOS bias | Multipath provides virtual anchors |
| More MPCs → worse positioning | More MPCs → richer map → better positioning |
| Mitigation: detect and remove NLOS | Exploitation: estimate and use MPCs |
| Requires many physical BSs | Can work with a single BS |
| Static map assumed known | Map is estimated jointly |
This paradigm shift is enabled by three technological trends: (1) wide bandwidth at mmWave/sub-THz resolves individual MPCs in delay; (2) massive arrays resolve MPCs in angle; (3) computational power makes real-time factor-graph inference feasible.
The theoretical foundation was laid by Witrisal et al. (2016) and Leitinger et al. (2019), who showed that the equivalent Fisher information from multipath can exceed that from direct paths alone --- multipath literally adds positioning information when properly exploited.
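The information gain can be illustrated with a simplified calculation. The sketch below is not the equivalent Fisher information derivation from the cited works: it assumes range-only measurements with independent Gaussian noise, no clock bias, and perfectly known VA positions, so it gives an optimistic bound. Function names and coordinates are hypothetical.

```python
import math

def range_fim(ue, anchors, sigma):
    """2x2 Fisher information of the UE position from independent range measurements."""
    j11 = j12 = j22 = 0.0
    for ax, ay in anchors:
        dx, dy = ue[0] - ax, ue[1] - ay
        d = math.hypot(dx, dy)
        ux, uy = dx / d, dy / d  # unit vector from the anchor to the UE
        j11 += ux * ux / sigma ** 2
        j12 += ux * uy / sigma ** 2
        j22 += uy * uy / sigma ** 2
    return (j11, j12, j22)

def position_error_bound(fim):
    """PEB = sqrt(trace(J^-1)); infinite when J is (numerically) singular."""
    j11, j12, j22 = fim
    det = j11 * j22 - j12 * j12
    if det <= 1e-9 * (j11 + j22) ** 2:
        return math.inf
    return math.sqrt((j11 + j22) / det)

# Hypothetical geometry: one BS plus three VAs (mirror images across three walls).
ue = (6.0, 8.0)
bs = (2.0, 3.0)
vas = [(18.0, 3.0), (2.0, -3.0), (-2.0, 3.0)]
sigma = 0.1  # assumed range standard deviation in metres

peb_bs_only = position_error_bound(range_fim(ue, [bs], sigma))
peb_one_va = position_error_bound(range_fim(ue, [bs] + vas[:1], sigma))
peb_all = position_error_bound(range_fim(ue, [bs] + vas, sigma))
print(peb_bs_only, peb_one_va, peb_all)
```

With the direct path alone, a single range constrains only one direction and the bound is unbounded; each additional VA adds a rank-one term to the information matrix and tightens the bound.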
RSS Fingerprinting: Data-Driven Positioning
An alternative to model-based positioning is fingerprinting, which learns a mapping from radio measurements to positions using a database of labelled training samples. The approach has two phases:
Offline phase (training): A survey is conducted in which RSS values from all hearable BSs are measured at $M$ known positions on a grid spanning the area of interest. The collection $\{(\mathbf{p}_i, \mathbf{r}_i)\}_{i=1}^{M}$ forms the fingerprint database (or radio map), where $\mathbf{r}_i$ is the RSS vector at position $\mathbf{p}_i$.
Online phase (positioning): The UE measures its current RSS vector $\mathbf{r}$ and finds the closest match(es) in the database:
$$\hat{\mathbf{p}} = \frac{1}{k} \sum_{i \in \mathcal{N}_k(\mathbf{r})} \mathbf{p}_i$$
where $\mathcal{N}_k(\mathbf{r})$ is the set of $k$ nearest neighbours in RSS space.
Modern approaches replace the explicit database with a neural network (DNN, CNN, or transformer) that learns the mapping from training data, achieving 1--3 m accuracy in indoor WiFi environments. Channel state information (CSI) fingerprinting further improves accuracy by exploiting amplitude and phase across subcarriers, achieving sub-metre accuracy in controlled environments.
RSS Fingerprinting Positioning
Simulate RSS fingerprinting in a 2D area. The radio map is built by computing the RSS from the BSs at each grid point using the log-distance path loss model with shadow fading. A test UE position is randomly chosen, and its position is estimated using $k$-nearest-neighbour matching in RSS space. Observe how finer grid spacing improves accuracy (at the cost of survey effort), while larger shadow fading standard deviation degrades performance by making the fingerprints less location-specific. Adding more BSs increases the dimensionality of the RSS vector, improving the discriminability of fingerprints.
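A minimal offline version of this simulation might look as follows. It is a sketch under stated assumptions: the reference power at 1 m (`p0_dbm`), the path loss exponent, the 50 m square area, the four corner BSs, and the shadow fading value are all hypothetical choices, and shadow fading is drawn independently for the survey and for the online measurement.

```python
import math
import random

def rss_dbm(pos, bs, p0_dbm=-30.0, exponent=3.0):
    """Log-distance path loss: mean RSS in dBm at pos (assumed p0_dbm at 1 m)."""
    d = max(math.dist(pos, bs), 1.0)  # clamp to the 1 m reference distance
    return p0_dbm - 10.0 * exponent * math.log10(d)

def measure(pos, bss, sigma_db, rng):
    """RSS vector at pos with independent Gaussian shadow fading per BS (in dB)."""
    return [rss_dbm(pos, b) + rng.gauss(0.0, sigma_db) for b in bss]

def build_radio_map(bss, size, spacing, sigma_db, rng):
    """Offline phase: survey a square grid and store (position, RSS vector) pairs."""
    steps = int(round(size / spacing)) + 1
    grid = [(i * spacing, j * spacing) for i in range(steps) for j in range(steps)]
    return [(p, measure(p, bss, sigma_db, rng)) for p in grid]

def knn_locate(radio_map, rss, k):
    """Online phase: average the positions of the k nearest fingerprints in RSS space."""
    nearest = sorted(radio_map, key=lambda entry: math.dist(entry[1], rss))[:k]
    return (sum(p[0] for p, _ in nearest) / k, sum(p[1] for p, _ in nearest) / k)

def mean_error(bss, size, spacing, sigma_db, k, trials, seed=0):
    """Average positioning error over random test positions."""
    rng = random.Random(seed)
    radio_map = build_radio_map(bss, size, spacing, sigma_db, rng)
    total = 0.0
    for _ in range(trials):
        ue = (rng.uniform(0.0, size), rng.uniform(0.0, size))
        est = knn_locate(radio_map, measure(ue, bss, sigma_db, rng), k)
        total += math.dist(ue, est)
    return total / trials

bss = [(0.0, 0.0), (50.0, 0.0), (0.0, 50.0), (50.0, 50.0)]
for spacing in (10.0, 5.0, 2.5):
    err = mean_error(bss, 50.0, spacing, sigma_db=4.0, k=3, trials=200)
    print(f"grid spacing {spacing:4.1f} m -> mean error {err:.2f} m")
```

Varying `spacing`, `sigma_db`, `k`, and the number of entries in `bss` reproduces the trends described above; the exact numbers depend on the random seed and the assumed propagation parameters.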
Applications of Radio-SLAM
Radio-SLAM and multipath-exploiting positioning find applications across several domains:
Indoor mapping and navigation: Building floor plans can be reconstructed from the estimated VA positions (which correspond to walls, pillars, and other reflecting surfaces). This enables crowd-sourced mapping: as users move through a building with their 5G devices, the network progressively builds and refines a radio map without explicit surveying.
Autonomous vehicles: Vehicular radio-SLAM fuses 5G positioning with radar and lidar to maintain lane-level localisation in urban canyons where GNSS is degraded. The radio map provides a persistent prior that bridges GNSS outages (e.g., in tunnels).
Industrial IoT: Factory environments with dense metallic reflectors create rich multipath that Channel-SLAM can exploit for sub-metre tracking of AGVs, robots, and assets using a single gNB with a massive array.
Emergency services: First responders entering unknown buildings can be localised and their environment mapped simultaneously, providing situational awareness to incident commanders.
The common theme is that radio-SLAM transforms the propagation environment from a passive, unknown entity into an active participant in the positioning solution.
Channel-SLAM: Virtual Anchors from Multipath
Key Takeaway
Multipath is a feature, not a foe, for positioning. Classical methods detect-and-exclude NLOS paths; radio-SLAM estimates-and-exploits them as virtual anchors. Each resolvable MPC adds positioning information (Fisher information), and the total information from multipath can exceed that from direct paths alone. This paradigm shift, enabled by wideband signals and massive arrays at mmWave/sub-THz, allows Channel-SLAM to achieve sub-metre accuracy with a single base station.
Why This Matters: Connections to RF Imaging and Statistical Inference
Radio-SLAM shares deep connections with topics in two specialised books:
- RF Imaging book (Chs. 5--10): The virtual-anchor concept in radio-SLAM is the positioning analogue of reflectivity estimation in RF imaging. Both exploit multipath geometry to reconstruct spatial information. The sensing matrix in imaging maps directly to the measurement Jacobian in positioning. RF imaging extends to 2D/3D scene reconstruction; SLAM focuses on the 1D (range) or 2D (range+angle) structure.
- FSI book (Chs. 4--8): The belief-propagation inference engine in radio-SLAM is a direct application of factor graph methods from statistical inference. The SLAM factor graph has the same structure as a hidden Markov model (HMM) with unknown parameters, and the BP algorithm generalises Kalman filtering to non-Gaussian, multi-modal posteriors.
See full treatment in RF Imaging and Sparse Recovery
Simultaneous Localisation and Mapping (SLAM)
The problem of jointly estimating a mobile agent's trajectory and the map of its environment from sequential measurements. Radio-SLAM adapts this robotics concept to wireless channels, using multipath components as environmental features (virtual anchors) to be mapped.
Related: Non-Line-of-Sight (NLOS), Position Error Bound (PEB)
Positioning Reference Signal (PRS)
A downlink reference signal in 5G NR specifically designed for high-accuracy timing measurements. PRS uses a comb-staggered frequency-domain structure to achieve full-bandwidth ranging while allowing frequency-domain multiplexing between gNBs.
Related: Time-of-Arrival (TOA), Time-Difference-of-Arrival (TDOA)
Quick Check
In Channel-SLAM, a UE communicates with a single BS in an indoor environment with four resolvable specular multipath components. How many virtual anchors are available for positioning, and what minimum number of time steps is needed for the SLAM filter to begin resolving the VA positions (assuming a 2D scenario with delay-only measurements)?
4 VAs; 1 time step suffices since there are 4 range circles
4 VAs; at least 2 time steps are needed so the UE displacement creates a baseline for triangulation
1 VA (only the LOS path counts); 4 time steps needed
4 VAs; at least 4 time steps are needed (one per VA)
Each MPC provides one virtual anchor, giving 4 VAs in addition to the physical BS (5 anchors in total, although the VA positions are initially unknown). At a single time step, each VA contributes one range measurement but has 2 unknown coordinates, so the system is under-determined. After at least 2 time steps with UE motion, each VA has been observed from 2 different UE positions, providing 2 range equations for 2 VA unknowns. Two range circles generally intersect in two points, so a mirror ambiguity about the direction of motion remains until further, non-collinear UE positions are observed. The motion of the UE creates a synthetic aperture that enables triangulation of the VAs. In practice, 3--5 time steps are needed for reliable convergence due to noise and data association uncertainty.
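The counting argument can be checked with a small geometric example. The sketch below intersects the range circles from two UE positions and uses a third position to pick between the two candidates. It assumes known UE positions, noise-free ranges, and no clock bias, which is more than a real SLAM filter has available; the coordinates are hypothetical.

```python
import math

def circle_intersections(c0, r0, c1, r1):
    """Intersection points of two circles in 2D (empty list if they do not intersect)."""
    d = math.dist(c0, c1)
    if d == 0.0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    a = (r0 ** 2 - r1 ** 2 + d ** 2) / (2.0 * d)  # distance from c0 to the chord
    h = math.sqrt(max(r0 ** 2 - a ** 2, 0.0))     # half-length of the chord
    ex, ey = (c1[0] - c0[0]) / d, (c1[1] - c0[1]) / d
    mx, my = c0[0] + a * ex, c0[1] + a * ey
    return [(mx - h * ey, my + h * ex), (mx + h * ey, my - h * ex)]

# Hypothetical VA and three UE positions (the third is off the line of the first two).
true_va = (12.0, 5.0)
ue_steps = [(0.0, 0.0), (6.0, 0.0), (6.0, 6.0)]
ranges = [math.dist(p, true_va) for p in ue_steps]

# Two time steps: two candidate VA positions, mirrored about the direction of motion.
candidates = circle_intersections(ue_steps[0], ranges[0], ue_steps[1], ranges[1])
print("candidates after 2 steps:", candidates)

# A third, non-collinear step selects the candidate consistent with its range.
resolved = min(candidates, key=lambda c: abs(math.dist(c, ue_steps[2]) - ranges[2]))
print("resolved after 3 steps:", resolved)
```

With noisy delays and unknown association the selection is no longer this clean, which is consistent with the 3--5 steps quoted for reliable convergence.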