RFCanvas: Vision-RF Fusion with Tensorial Fields
Bridging Vision and RF
A fundamental limitation of RF-3DGS (Section 26.2) is that it requires dense RF measurements --- hundreds of spatially distributed power samples. In many scenarios, we have abundant visual data (camera images, LiDAR point clouds) but only a handful of RF measurements. RFCanvas (Chen et al., 2024) exploits this asymmetry: it initialises a 3D Gaussian scene from visual data and then adapts it to predict RF propagation using as few as 10--20 RF measurements.
The key insight is that visual geometry strongly constrains RF propagation: walls block signals, reflective surfaces create multipath, and open spaces allow line-of-sight. RFCanvas encodes this prior knowledge through multi-modal initialisation and tensorial RF fields.
Definition: Tensorial RF Field
Tensorial RF Field
A tensorial RF field represents the RF attributes of each Gaussian as a low-rank tensor decomposition. Instead of storing a single scalar power $\rho$, each Gaussian carries a compact tensor $\mathcal{T}$ that encodes the directional and frequency-dependent RF response:
$$\mathcal{T} = \sum_{r=1}^{R} \mathbf{a}_r \otimes \mathbf{f}_r \otimes \mathbf{p}_r,$$
where $\mathbf{a}_r$ encodes angular dependence (discretised azimuth/elevation), $\mathbf{f}_r$ encodes frequency dependence, and $\mathbf{p}_r$ encodes polarisation. The rank $R$ controls the trade-off between expressiveness and compactness.
The RF power at location $\mathbf{x}$ from direction $(\theta, \phi)$ at frequency $f$ is obtained by evaluating the tensor at the appropriate indices and passing the result through the splatting equation.
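A minimal numpy sketch of how such a rank-$R$ (CP) factorisation can be stored and evaluated per Gaussian; the grid sizes and rank here are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 16x8 angular grid (azimuth x elevation), 4 frequency
# bins, 2 polarisations, CP rank R = 3.
n_az, n_el, n_freq, n_pol, rank = 16, 8, 4, 2, 3

# One set of CP factors per Gaussian.
A = rng.standard_normal((rank, n_az * n_el))  # angular factor
F = rng.standard_normal((rank, n_freq))       # frequency factor
P = rng.standard_normal((rank, n_pol))        # polarisation factor

def rf_response(az_idx, el_idx, f_idx, p_idx):
    """Evaluate the low-rank tensor at the given angular, frequency, and
    polarisation indices: a sum over rank of the factor products."""
    a = A[:, az_idx * n_el + el_idx]
    return float(np.sum(a * F[:, f_idx] * P[:, p_idx]))

# Storage cost: R*(n_az*n_el + n_freq + n_pol) scalars instead of the
# full n_az*n_el*n_freq*n_pol tensor.
compact = rank * (n_az * n_el + n_freq + n_pol)
full = n_az * n_el * n_freq * n_pol
print(compact, full)  # 402 1024
```

The compact form stores 402 scalars versus 1024 for the dense tensor in this toy configuration; the gap widens rapidly with finer angular grids.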
Definition: Spherical Harmonic Directional Power Model
Spherical Harmonic Directional Power Model
RFCanvas models the directional power pattern of each Gaussian using spherical harmonics (SH):
$$\rho(\theta, \phi) = \sum_{\ell=0}^{L} \sum_{m=-\ell}^{\ell} c_{\ell m}\, Y_{\ell m}(\theta, \phi),$$
where $Y_{\ell m}$ are the real spherical harmonics of degree $\ell$ and order $m$, and $c_{\ell m}$ are learnable coefficients. The maximum degree $L$ determines the angular resolution:
- $L = 0$: isotropic (omnidirectional scatterer),
- $L = 2$: captures the main lobe and one sidelobe,
- $L = 4$: captures specular reflection patterns.
This is the same SH representation used for view-dependent colour in optical 3DGS, adapted to model the angular distribution of scattered RF power.
In practice, $L = 2$ or $L = 3$ suffices for most RF scenarios because RF scattering is less angularly complex than optical reflectance. Higher-order SH are useful only for highly directional scatterers such as metallic plates.
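Low-degree real SH can be evaluated in closed form. The sketch below implements degree $L = 1$ only (the constants are the standard real-SH ones, the same as in optical 3DGS view-dependent colour); the coefficient values are illustrative:

```python
# Real spherical harmonics up to degree L = 1.
C0 = 0.28209479177387814   # Y_00 = 1 / (2 * sqrt(pi))
C1 = 0.4886025119029199    # prefactor of the three degree-1 harmonics

def directional_power(coeffs, d):
    """Scattered power in unit direction d = (x, y, z), given SH
    coefficients coeffs = [c00, c1m-1, c10, c11] (L = 1, 4 terms)."""
    x, y, z = d
    return (coeffs[0] * C0
            + coeffs[1] * C1 * y
            + coeffs[2] * C1 * z
            + coeffs[3] * C1 * x)

# Isotropic scatterer: only c00 non-zero -> same power in every direction.
iso = [1.0 / C0, 0.0, 0.0, 0.0]
print(round(directional_power(iso, (0, 0, 1)), 6))  # 1.0
print(round(directional_power(iso, (1, 0, 0)), 6))  # 1.0

# A z-oriented degree-1 term skews power toward +z (a crude main lobe).
lobed = [1.0 / C0, 0.0, 1.0, 0.0]
```

Higher degrees add $(L+1)^2 - L^2 = 2L + 1$ basis functions each, which is where the sharper specular patterns come from.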
Theorem: Sample Complexity Reduction from Visual Priors
Let $\mathcal{G}_{\mathrm{vis}}$ be a Gaussian scene initialised from visual data (LiDAR + camera), and let $\mathcal{G}_{\mathrm{rand}}$ be a randomly initialised scene. For a target power prediction accuracy $\epsilon$, the numbers of RF measurements required satisfy:
$$N_{\mathrm{vis}}(\epsilon) \;\lesssim\; \frac{d_{\mathrm{RF}}}{d_{\mathrm{total}}}\, N_{\mathrm{rand}}(\epsilon),$$
where $d_{\mathrm{RF}}$ is the effective dimensionality of the RF-specific parameters (power, phase, SH coefficients) and $d_{\mathrm{total}}$ is the total parameter dimensionality including geometry.
In typical scenarios where geometry accounts for the large majority of the effective parameters, this implies $N_{\mathrm{vis}} \ll N_{\mathrm{rand}}$ --- an order-of-magnitude reduction in required RF measurements.
Visual data pins down the geometry (Gaussian positions and shapes). Only the RF-specific attributes (how much power each surface scatters) need to be learned from RF measurements. Since geometry accounts for most of the scene complexity, the RF measurements need only determine a much smaller parameter set.
Parameter decomposition
Decompose the full parameter set as $\Theta = \Theta_{\mathrm{geo}} \cup \Theta_{\mathrm{RF}}$, where $\Theta_{\mathrm{geo}}$ collects the geometric parameters (positions, rotations, scales) and $\Theta_{\mathrm{RF}}$ the RF-specific ones (opacity, power, SH coefficients). With visual initialisation, $\Theta_{\mathrm{geo}}$ is fixed (or fine-tuned with a small learning rate).
Effective dimensionality
The RF-specific parameters per Gaussian are: 1 opacity + 1 power + $(L+1)^2$ SH coefficients $\approx$ 10--20 scalars (for $L = 2$--$3$). The geometric parameters are: 3 position + 4 quaternion + 3 scale = 10 scalars. So $d_{\mathrm{RF}}/d_{\mathrm{total}} \approx 0.5$--$0.65$ per Gaussian by raw count.
Counting constraint
But the geometric parameters have much higher effective dimensionality, because small positional errors propagate through the rendering equation quadratically. Accounting for this sensitivity, the visual prior reduces the sample complexity by a factor proportional to $d_{\mathrm{total}}/d_{\mathrm{RF}}$.
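The raw per-Gaussian counts above can be checked in a few lines; the $(L+1)^2$ formula for the number of SH coefficients is standard, and the opacity/power/geometry counts follow the decomposition in the text:

```python
# Per-Gaussian parameter counts.
geo = 3 + 4 + 3                    # position + quaternion + scale = 10

def rf_params(L):
    """Opacity + power + (L+1)^2 SH coefficients."""
    return 1 + 1 + (L + 1) ** 2

for L in (2, 3):
    d_rf = rf_params(L)
    d_total = d_rf + geo
    print(L, d_rf, round(d_rf / d_total, 2))
# L=2: 11 RF scalars, ratio 0.52;  L=3: 18 RF scalars, ratio 0.64
```

By raw count the RF share is only about half to two thirds; the order-of-magnitude measurement saving comes from the higher *effective* dimensionality of the geometric parameters noted above.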
Example: RFCanvas Pipeline for Indoor RF Mapping
Describe the full RFCanvas pipeline for reconstructing an indoor RF power map using LiDAR, camera, and 20 RF measurements.
Step 1 --- Visual scene reconstruction
Run COLMAP on the camera images to obtain camera poses and a sparse 3D point cloud. Fuse with LiDAR depth maps to densify the point cloud. Train an initial optical 3DGS from the images (standard pipeline).
Step 2 --- RF attribute initialisation
For each optimised visual Gaussian, initialise RF attributes:
- Power $\rho$: estimate from material classification (LiDAR intensity + camera texture). Concrete walls get high reflectivity, glass moderate, open air near-zero.
- SH coefficients: $c_{\ell m} = 0$ for $\ell > 0$ (start isotropic).
- Opacity $\alpha$: inherit from the visual opacity.
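A sketch of this initialisation step. The material-to-reflectivity table is hypothetical, chosen only to match the qualitative ordering in the text (concrete high, glass moderate, air near-zero):

```python
# Hypothetical material -> initial RF reflectivity table (illustrative
# values, not from the paper).
MATERIAL_REFLECTIVITY = {
    "concrete": 0.8,
    "metal": 0.95,
    "glass": 0.4,
    "wood": 0.3,
    "air": 0.0,
}

def init_rf_attributes(material, visual_opacity, sh_dim=9):
    """Initialise the RF attributes of one visual Gaussian (Step 2)."""
    power = MATERIAL_REFLECTIVITY.get(material, 0.5)  # default for unknown
    sh = [0.0] * sh_dim          # zero higher-order SH -> isotropic start
    alpha = visual_opacity       # inherit opacity from the visual model
    return {"power": power, "sh": sh, "alpha": alpha}

g = init_rf_attributes("concrete", visual_opacity=0.9)
print(g["power"], g["alpha"])  # 0.8 0.9
```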
Step 3 --- RF fine-tuning
Freeze the geometric parameters $\Theta_{\mathrm{geo}}$. Optimise only $\Theta_{\mathrm{RF}}$ to minimise the dB-scale power prediction loss on the 20 RF measurements:
$$\mathcal{L}_{\mathrm{RF}} = \frac{1}{20} \sum_{i=1}^{20} \bigl| P_{\mathrm{pred}}(\mathbf{x}_i) - P_{\mathrm{meas}}(\mathbf{x}_i) \bigr|,$$
with powers expressed in dB. After fine-tuning, the model predicts power at novel locations with a small dB-scale MAE.
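A toy sketch of this fine-tuning step, under the simplifying assumption that with geometry frozen the splatting operator reduces to a fixed linear map from per-Gaussian powers to received powers (real splatting is nonlinear). Sizes, seed, and learning rate are all illustrative:

```python
import numpy as np

def db(p):
    """Linear power -> dB, clamped to avoid log(0)."""
    return 10.0 * np.log10(np.maximum(p, 1e-12))

def rf_loss(pred, meas):
    """dB-scale mean absolute error between predicted and measured power."""
    return float(np.mean(np.abs(db(pred) - db(meas))))

# Toy forward model: a fixed matrix S maps per-Gaussian RF powers rho
# to received powers at the measurement locations.
rng = np.random.default_rng(1)
S = rng.uniform(0.0, 1.0, size=(20, 50))   # 20 measurements, 50 Gaussians
rho_true = rng.uniform(0.1, 1.0, size=50)
meas = S @ rho_true

rho = np.full(50, 0.1)                     # deliberately poor initialisation
init_loss = rf_loss(S @ rho, meas)
lr = 0.05
for _ in range(500):
    pred = S @ rho
    # Subgradient of the dB MAE via the chain rule:
    # d|db(pred) - db(meas)| / d pred = sign(...) * 10 / (ln 10 * pred)
    g = np.sign(db(pred) - db(meas)) * (10.0 / np.log(10.0)) / pred
    rho -= lr * (S.T @ g) / len(meas)
    rho = np.clip(rho, 1e-3, None)         # keep powers physical
print(init_loss > rf_loss(S @ rho, meas))  # True: fine-tuning reduced the loss
```

Only the 50 power scalars are updated here, mirroring the frozen-geometry regime of Step 3.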
Step 4 --- Optional geometric refinement
Unfreeze $\Theta_{\mathrm{geo}}$ with a smaller learning rate and continue training. This fine-tunes the geometry to account for RF-invisible structures (e.g., glass walls visible to the camera but transparent to RF) or RF-opaque structures invisible to the camera (e.g., metallic ducts behind plasterboard).
Few-Shot RF Prediction Quality
The interactive demo plots how the power prediction error decreases as the number of RF measurements increases, comparing random initialisation with visual-prior initialisation (the RFCanvas approach).
Common Mistake: The Vision-to-RF Gap
Mistake:
Assuming that visual geometry directly translates to RF propagation without adaptation --- e.g., that optically reflective surfaces are also RF reflective.
Correction:
The correspondence between visual and RF properties is imperfect:
- Glass is optically transparent but highly reflective at mmWave frequencies.
- Plasterboard walls are opaque to cameras but partially transparent to sub-6 GHz signals.
- Vegetation appears as dense visual structure but is nearly transparent to low-frequency RF.
- Metallic surfaces are highly reflective in both domains (good correspondence).
RFCanvas handles this gap through the RF fine-tuning step: Gaussians at glass surfaces learn high RF opacity despite low visual opacity, and vice versa. Without this adaptation step, the visual prior alone can produce large dB-scale prediction errors.
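The gap can be made concrete with a small table of hypothetical optical versus mmWave RF opacities; the numbers are illustrative only, chosen to encode the qualitative behaviour listed above:

```python
# Illustrative (hypothetical) opacities: optical vs RF at mmWave.
MATERIALS = {
    #               optical  RF
    "glass":        (0.05,   0.9),   # transparent to light, reflective at mmWave
    "plasterboard": (1.0,    0.4),   # opaque to cameras, partly RF-transparent
    "vegetation":   (0.9,    0.1),   # dense visually, nearly RF-transparent
    "metal":        (1.0,    0.98),  # good cross-domain correspondence
}

def rf_opacity_error(material):
    """Error made by copying the visual opacity directly into the RF model."""
    optical, rf = MATERIALS[material]
    return abs(optical - rf)

worst = max(MATERIALS, key=rf_opacity_error)
print(worst, round(rf_opacity_error(worst), 2))  # glass 0.85
```

Glass is the worst offender in this toy table, which is exactly why the fine-tuning step must be allowed to override the visual prior.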
Multi-Modal Sensor Requirements
RFCanvas requires co-registered multi-modal data:
- Camera images (enough views for COLMAP): standard RGB.
- LiDAR scans (optional but recommended): densify the point cloud and provide metric depth. A single 360-degree scan often suffices.
- RF measurements (10--20): received power with known Tx and Rx positions; position accuracy at the centimetre level (indoor) or metre level (outdoor).
The sensors need NOT be synchronised in time. The visual data can be collected once, and RF measurements can be added incrementally as they become available.
- Camera and LiDAR must be calibrated (extrinsic transformation known).
- RF measurement positions must be in the same coordinate frame as the visual reconstruction.
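Bringing RF measurement positions into the visual frame is a rigid-body transform once the extrinsics are known. A minimal sketch with a hypothetical calibration (a 90-degree yaw plus a translation):

```python
import numpy as np

# Hypothetical extrinsic: rotation + translation taking the RF measurement
# frame into the visual-reconstruction frame (assumed known from calibration).
theta = np.deg2rad(90.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.0, 0.5])

def to_visual_frame(p_rf):
    """Map a Tx/Rx position from the RF frame into the visual frame."""
    return R @ np.asarray(p_rf) + t

p = to_visual_frame([1.0, 0.0, 0.0])
print(np.round(p, 6))  # -> approximately [1, 1, 0.5]
```

Because the sensors need not be time-synchronised, this spatial registration is the only coupling between the RF and visual data streams.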
Historical Note: Multi-Modal RF Environment Reconstruction
2007--2024. The idea of using visual data to aid RF propagation prediction has a long history. Early work by Degli-Esposti et al. (2007) used building geometry from geographic databases to initialise ray tracers. The METIS 5G project (2015) used LiDAR-derived 3D city models for sub-6 GHz propagation. With the advent of neural scene representations, the fusion became tighter: DeepRay (He et al., 2022) used NeRF-style representations initialised from images. RFCanvas (2024) represents the state of the art in multi-modal fusion, combining the geometric fidelity of 3DGS with the efficiency of few-shot RF adaptation.
Quick Check
In RFCanvas, why are spherical harmonics used for the directional power pattern of each Gaussian?
- To compress the representation and reduce memory usage
- To model the angular dependence of scattered RF power from each surface element
- To enforce rotational invariance of the scene model
- To enable frequency-domain processing of the RF signal
RF scattering from surfaces is directionally dependent: a smooth wall reflects specularly while a rough surface scatters more diffusely. SH provide a smooth, differentiable basis for representing these angular patterns, with the order controlling the angular resolution. This is the same reason optical 3DGS uses SH for view-dependent colour.
Tensorial RF Field
A compact representation of the directional and frequency-dependent RF response of a Gaussian scatterer using low-rank tensor decomposition. The tensor factors encode angular, frequency, and polarisation dimensions separately, enabling efficient storage and rendering.
Related: Radio Radiance Field, Splatting
Key Takeaway
RFCanvas demonstrates that visual data (camera + LiDAR) provides a powerful geometric prior for RF scene reconstruction, reducing the required RF measurements by an order of magnitude or more compared to RF-only methods. The key innovations are: (1) multi-modal initialisation from visual 3DGS, (2) tensorial RF fields with spherical harmonics for directional scattering, and (3) a two-stage training that freezes geometry during RF adaptation. The vision-to-RF gap remains a fundamental challenge: material properties that differ between optical and RF domains require careful handling.