RFCanvas: Vision-RF Fusion with Tensorial Fields

Bridging Vision and RF

A fundamental limitation of RF-3DGS (Section 26.2) is that it requires dense RF measurements --- hundreds of spatially distributed power samples. In many scenarios, we have abundant visual data (camera images, LiDAR point clouds) but only a handful of RF measurements. RFCanvas (Chen et al., 2024) exploits this asymmetry: it initialises a 3D Gaussian scene from visual data and then adapts it to predict RF propagation using as few as 10--20 RF measurements.

The key insight is that visual geometry strongly constrains RF propagation: walls block signals, reflective surfaces create multipath, and open spaces allow line-of-sight. RFCanvas encodes this prior knowledge through multi-modal initialisation and tensorial RF fields.

Definition:

Tensorial RF Field

A tensorial RF field represents the RF attributes of each Gaussian as a low-rank tensor decomposition. Instead of storing a single scalar power $p_k$, each Gaussian carries a compact tensor that encodes the directional and frequency-dependent RF response:

$$
\mathbf{f}_k^{\text{RF}} = \sum_{r=1}^{R} \mathbf{v}_{k,r}^{(1)} \otimes \mathbf{v}_{k,r}^{(2)} \otimes \mathbf{v}_{k,r}^{(3)},
$$

where $\mathbf{v}_{k,r}^{(1)} \in \mathbb{R}^{D_\theta}$ encodes angular dependence (discretised azimuth/elevation), $\mathbf{v}_{k,r}^{(2)} \in \mathbb{R}^{D_f}$ encodes frequency dependence, and $\mathbf{v}_{k,r}^{(3)} \in \mathbb{R}^{D_p}$ encodes polarisation. The rank $R$ controls the trade-off between expressiveness and compactness.

The RF power at the receiver location from direction $\hat{\mathbf{d}}$ and frequency $f_0$ is obtained by evaluating the tensor at the appropriate indices and passing the result through the splatting equation.
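
The evaluation can be made concrete with a short sketch. The snippet below is a minimal illustration of the rank-$R$ decomposition above, assuming discretised angular, frequency, and polarisation bins; the dimension sizes and variable names are placeholders, not RFCanvas's actual data structures.

```python
# Minimal sketch of a tensorial RF field for one Gaussian, assuming the
# rank-R CP decomposition in the definition above. Sizes are illustrative.
import numpy as np

R, D_theta, D_f, D_p = 4, 64, 16, 2          # rank and factor dimensions
rng = np.random.default_rng(0)

# Learnable factor matrices: one row per rank component.
v1 = rng.normal(size=(R, D_theta))           # angular (azimuth/elevation bins)
v2 = rng.normal(size=(R, D_f))               # frequency bins
v3 = rng.normal(size=(R, D_p))               # polarisation (e.g. H/V)

def rf_response(i_theta: int, i_f: int, i_p: int) -> float:
    """Evaluate f_k^RF at given angular/frequency/polarisation indices
    without materialising the full D_theta x D_f x D_p tensor."""
    return float(np.sum(v1[:, i_theta] * v2[:, i_f] * v3[:, i_p]))

print(rf_response(i_theta=10, i_f=3, i_p=0))
# Storage: R*(D_theta + D_f + D_p) numbers instead of D_theta*D_f*D_p.
print("compact:", R * (D_theta + D_f + D_p), "vs full:", D_theta * D_f * D_p)
```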

Definition:

Spherical Harmonic Directional Power Model

RFCanvas models the directional power pattern of each Gaussian using spherical harmonics (SH):

$$
\hat{P}_k(\hat{\mathbf{d}}) = \sum_{\ell=0}^{L} \sum_{m=-\ell}^{\ell} a_{k,\ell,m} \, Y_\ell^m(\hat{\mathbf{d}}),
$$

where $Y_\ell^m$ are the real spherical harmonics of degree $\ell$ and order $m$, and $a_{k,\ell,m}$ are learnable coefficients. The maximum degree $L$ determines the angular resolution:

  • $L = 0$: isotropic (omnidirectional scatterer),
  • $L = 2$: captures main lobe and one sidelobe,
  • $L = 4$: captures specular reflection patterns.

This is the same SH representation used for view-dependent colour in optical 3DGS, adapted to model the angular distribution of scattered RF power.

In practice, $L = 2$ or $L = 3$ suffices for most RF scenarios because RF scattering is less angularly complex than optical reflectance. Higher-order SH are useful only for highly directional scatterers like metallic plates.
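
A minimal sketch of the SH power model, assuming the standard real-spherical-harmonic basis up to $L = 2$ (the same nine-term basis used for view-dependent colour in 3DGS); the coefficient values below are placeholders.

```python
# Evaluate the SH directional power model P_hat_k(d) for L = 2.
import numpy as np

def real_sh_basis_l2(d: np.ndarray) -> np.ndarray:
    """Real SH basis values Y_l^m(d) for l = 0..2 (9 terms), d a unit vector."""
    x, y, z = d
    return np.array([
        0.2820948,                      # Y_0^0
        0.4886025 * y,                  # Y_1^-1
        0.4886025 * z,                  # Y_1^0
        0.4886025 * x,                  # Y_1^1
        1.0925484 * x * y,              # Y_2^-2
        1.0925484 * y * z,              # Y_2^-1
        0.3153916 * (3 * z**2 - 1),     # Y_2^0
        1.0925484 * x * z,              # Y_2^1
        0.5462742 * (x**2 - y**2),      # Y_2^2
    ])

def directional_power(a_k: np.ndarray, d: np.ndarray) -> float:
    """P_hat_k(d) = sum over (l, m) of a_{k,l,m} * Y_l^m(d)."""
    return float(a_k @ real_sh_basis_l2(d))

a_k = np.zeros(9)
a_k[0] = 1.0        # isotropic baseline
a_k[2] = 0.5        # bias scattered power toward +z (illustrative)

print(directional_power(a_k, np.array([0.0, 0.0, 1.0])))
```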


Theorem: Sample Complexity Reduction from Visual Priors

Let $\hat{\mathcal{G}}_{\text{vis}}$ be a Gaussian scene initialised from visual data (LiDAR + camera), and let $\hat{\mathcal{G}}_{\text{rand}}$ be a randomly initialised scene. For a target power prediction accuracy $\mathbb{E}[|\hat{P}_{\text{dB}} - P_{\text{dB}}|] \leq \epsilon$, the number of RF measurements required satisfies:

$$
M_{\text{vis}}(\epsilon) \leq \frac{d_{\text{RF}}}{d_{\text{total}}} \cdot M_{\text{rand}}(\epsilon),
$$

where $d_{\text{RF}}$ is the effective dimensionality of the RF-specific parameters (power, phase, SH coefficients) and $d_{\text{total}}$ is the total parameter dimensionality including geometry.

In typical scenarios where geometry accounts for $> 80\%$ of the parameters, this implies $M_{\text{vis}} \leq 0.2 \cdot M_{\text{rand}}$ --- a $5\times$ reduction in required RF measurements.

Visual data pins down the geometry (Gaussian positions and shapes). Only the RF-specific attributes (how much power each surface scatters) need to be learned from RF measurements. Since geometry accounts for most of the scene complexity, the RF measurements need only determine a much smaller parameter set.
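
A quick numeric illustration of the bound, using assumed parameter counts (two RF-specific values per Gaussian out of twelve total, so geometry is over 80% of the parameters); these counts are for illustration only, not values from RFCanvas.

```python
# Sanity-check the sample-complexity bound under illustrative parameter counts.
def required_measurements_visual(m_rand: int, d_rf: int, d_total: int) -> float:
    """Upper bound: M_vis(eps) <= (d_RF / d_total) * M_rand(eps)."""
    return (d_rf / d_total) * m_rand

# Assume 2 RF-specific parameters per Gaussian out of 12 total, and that
# random initialisation would need 100 measurements for the target accuracy.
print(required_measurements_visual(m_rand=100, d_rf=2, d_total=12))  # ~16.7
```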

Example: RFCanvas Pipeline for Indoor RF Mapping

Describe the full RFCanvas pipeline for reconstructing an indoor RF power map using LiDAR, camera, and 20 RF measurements.

Few-Shot RF Prediction Quality

Observe how the power prediction error decreases as the number of RF measurements increases, comparing random initialisation versus visual-prior initialisation (RFCanvas approach).

Common Mistake: The Vision-to-RF Gap

Mistake:

Assuming that visual geometry directly translates to RF propagation without adaptation --- e.g., that optically reflective surfaces are also RF reflective.

Correction:

The correspondence between visual and RF properties is imperfect:

  • Glass is optically transparent but highly reflective at mmWave frequencies.
  • Plasterboard walls are opaque to cameras but partially transparent to sub-6 GHz signals.
  • Vegetation appears as dense visual structure but is nearly transparent to low-frequency RF.
  • Metallic surfaces are highly reflective in both domains (good correspondence).

RFCanvas handles this gap through the RF fine-tuning step: Gaussians at glass surfaces learn high RF opacity despite low visual opacity, and vice versa. Without this adaptation step, the visual prior alone can produce $> 15$ dB prediction errors.
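
A toy sketch of this fine-tuning effect (not RFCanvas's actual optimiser): the visual opacity of a Gaussian on a glass pane stays frozen at its low value, while a separate, learnable RF opacity is pulled high by a single mmWave measurement. The one-line propagation model is a deliberately crude stand-in for the splatting equation.

```python
# Toy illustration: visual opacity frozen, RF opacity adapted from RF data.
import numpy as np

visual_opacity = 0.05        # frozen: glass is nearly transparent to cameras
rf_opacity = 0.05            # initialised from the visual prior, but learnable

def predicted_power_db(opacity: float, incident_db: float = -40.0) -> float:
    """Crude 1-D stand-in for the splatting equation: more opacity, more blockage."""
    return incident_db - 30.0 * opacity

measured_db = -65.0          # measurement behind the glass shows strong blockage
lr = 1e-5
for _ in range(500):
    err = predicted_power_db(rf_opacity) - measured_db   # dB-scale error
    grad = 2.0 * err * (-30.0)                           # d(err^2) / d(rf_opacity)
    rf_opacity = float(np.clip(rf_opacity - lr * grad, 0.0, 1.0))

print(f"visual opacity: {visual_opacity:.2f}, adapted RF opacity: {rf_opacity:.2f}")
```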

🔧 Engineering Note

Multi-Modal Sensor Requirements

RFCanvas requires co-registered multi-modal data:

  1. Camera images ($\geq 50$ views for COLMAP): standard RGB, resolution $\geq 1280 \times 720$.
  2. LiDAR scans (optional but recommended): densifies the point cloud and provides metric depth. A single 360-degree scan with $\sim 10^5$ points often suffices.
  3. RF measurements ($\geq 10$--$20$): received power with known Tx and Rx positions. Position accuracy $\leq 10$ cm (indoor) or $\leq 1$ m (outdoor).

The sensors need NOT be synchronised in time. The visual data can be collected once, and RF measurements can be added incrementally as they become available.

Practical Constraints
  • Camera and LiDAR must be calibrated (extrinsic transformation known).
  • RF measurement positions must be in the same coordinate frame as the visual reconstruction (see the sketch below).
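
A minimal sketch of that second constraint, assuming the rigid transform from the RF survey frame into the reconstruction frame is already known (e.g. from surveyed markers); obtaining that transform is outside the snippet, and the numbers below are placeholders.

```python
# Map RF measurement positions from their survey frame into the visual
# reconstruction frame via an assumed-known 4x4 rigid transform.
import numpy as np

T_recon_from_survey = np.array([
    [0.0, -1.0, 0.0, 2.5],
    [1.0,  0.0, 0.0, 0.3],
    [0.0,  0.0, 1.0, 0.0],
    [0.0,  0.0, 0.0, 1.0],
])

def to_recon_frame(p_survey: np.ndarray) -> np.ndarray:
    """Map a 3-D point from the RF survey frame into the reconstruction frame."""
    p_h = np.append(p_survey, 1.0)            # homogeneous coordinates
    return (T_recon_from_survey @ p_h)[:3]

rx_pos_survey = np.array([1.2, 4.0, 1.5])     # measured Rx position (survey frame)
print(to_recon_frame(rx_pos_survey))
```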

Historical Note: Multi-Modal RF Environment Reconstruction

2007--2024

The idea of using visual data to aid RF propagation prediction has a long history. Early work by Degli-Esposti et al. (2007) used building geometry from geographic databases to initialise ray tracers. The METIS 5G project (2015) used LiDAR-derived 3D city models for sub-6 GHz propagation. With the advent of neural scene representations, the fusion became tighter: DeepRay (He et al., 2022) used NeRF-style representations initialised from images. RFCanvas (2024) represents the state of the art in multi-modal fusion, combining the geometric fidelity of 3DGS with the efficiency of few-shot RF adaptation.

Quick Check

In RFCanvas, why are spherical harmonics used for the directional power pattern of each Gaussian?

  • To compress the representation and reduce memory usage
  • To model the angular dependence of scattered RF power from each surface element
  • To enforce rotational invariance of the scene model
  • To enable frequency-domain processing of the RF signal

Tensorial RF Field

A compact representation of the directional and frequency-dependent RF response of a Gaussian scatterer using low-rank tensor decomposition. The tensor factors encode angular, frequency, and polarisation dimensions separately, enabling efficient storage and rendering.

Related: Radio Radiance Field, Splatting

Key Takeaway

RFCanvas demonstrates that visual data (camera + LiDAR) provides a powerful geometric prior for RF scene reconstruction, reducing the required RF measurements by $5\times$ or more compared to RF-only methods. The key innovations are: (1) multi-modal initialisation from visual 3DGS, (2) tensorial RF fields with spherical harmonics for directional scattering, and (3) a two-stage training that freezes geometry during RF adaptation. The vision-to-RF gap remains a fundamental challenge: material properties that differ between optical and RF domains require careful handling.