Comparison and Open Questions

Choosing a Neural Scene Representation for RF

Chapters 24--26 have presented three families of neural scene representations: signed distance functions (SDF, Chapter 24), neural radiance fields (NeRF, Chapter 25), and 3D Gaussian Splatting (this chapter). Each has been adapted for RF applications. The question a practitioner faces is: which representation should I use for my RF imaging problem? The answer depends on the specific requirements --- speed, accuracy, data availability, interpretability, and scalability. This section provides a systematic comparison.

NeRF vs 3DGS vs SDF for RF Scene Reconstruction

| Property | NeRF / RF-NeRF | 3DGS / RF-3DGS | SDF / GeRaF |
| --- | --- | --- | --- |
| Representation | Implicit (MLP weights) | Explicit ($N$ Gaussians) | Implicit (MLP $\to$ distance) |
| Rendering | Ray marching ($\sim 200$ samples/ray) | Rasterisation (tile-based) | Sphere tracing + volumetric |
| Training time | 5 min--24 h | 15--30 min | 30 min--2 h |
| Rendering speed | 0.03--10 FPS | $> 100$ FPS | 1--10 FPS |
| PSNR (optical) | $\sim 32$ dB | $\sim 33$ dB | $\sim 30$ dB |
| RF power MAE | $\sim 4$ dB | $\sim 4$ dB | $\sim 5$ dB |
| Memory | Low (MLP weights only) | Medium ($\sim 200$ B/Gaussian) | Low (MLP weights only) |
| Thin structures | Poor (volume rendering blur) | Good (oriented Gaussians) | Good (surface-based) |
| Smooth media | Excellent (continuous density) | Adequate (many small Gaussians) | Poor (surfaces only) |
| Interpretability | Low (black-box MLP) | High (each Gaussian is a scatterer) | Medium (surface geometry) |
| Editability | Difficult | Easy (move/add/remove Gaussians) | Moderate (deform surface) |
| Dynamic scenes | Requires D-NeRF extensions | Natural (per-Gaussian velocity) | Difficult |

Definition:

Taxonomy of RF Scene Representations

We classify RF scene representations along two axes:

Axis 1 --- Implicit vs Explicit:

  • Implicit: The scene is encoded in neural network weights. Querying a point requires a forward pass through the network. Examples: NeRF, SDF networks, DeepSDF.
  • Explicit: The scene is represented by a set of primitives with directly interpretable parameters. Querying a point requires evaluating the primitives. Examples: 3DGS, voxel grids, point clouds.

Axis 2 --- Volumetric vs Surface-based:

  • Volumetric: The representation assigns attributes (density, colour, RF power) to every point in 3D space. Examples: NeRF, voxel grids, 3DGS.
  • Surface-based: The representation defines a 2D surface embedded in 3D space. Examples: SDF, meshes, surfels.

For RF imaging, the choice depends on the scattering model: specular-dominant scenarios (metallic environments, mmWave) favour surface-based representations, while diffuse-dominant scenarios (sub-6 GHz indoor, vegetation) favour volumetric representations.
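This rule of thumb can be condensed into a small decision helper. The function name, thresholds, and inputs below are illustrative assumptions for a sketch, not part of any cited system:

```python
def recommend_representation(carrier_ghz: float, scattering: str,
                             realtime: bool) -> str:
    """Hypothetical rule of thumb distilled from the two-axis taxonomy.

    scattering: 'specular' (metallic environments) or 'diffuse'
                (sub-6 GHz indoor, vegetation).
    """
    if scattering == "specular" or carrier_ghz >= 24:
        # Specular-dominant or mmWave scenes favour surface-based models.
        return "SDF"
    if realtime:
        # Explicit Gaussians rasterise fast enough for live digital twins.
        return "3DGS"
    # Volumetric implicit fields capture smooth diffuse media best.
    return "NeRF"

print(recommend_representation(28.0, "specular", realtime=False))  # SDF
```

The branch order encodes the section's priority: the scattering regime constrains physics first, and the speed requirement breaks ties among volumetric options.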


Example: Choosing the Right Representation

For each of the following RF imaging scenarios, recommend the most appropriate neural scene representation and justify the choice.

(a) Real-time digital twin of a factory floor with moving robots.
(b) High-resolution imaging of a metallic aircraft fuselage.
(c) Indoor coverage prediction with 50 power measurements and 1000 camera images.
(d) Large-scale outdoor urban propagation modelling.


Theorem: Representation Complexity for RF Scenes

Consider an RF scene in a bounded target region $\Omega \subset \mathbb{R}^3$ with $S$ distinct scatterers, each with angular complexity bounded by spherical harmonic order $L$. The minimum number of parameters required for an $\epsilon$-accurate power prediction is:

| Representation | Parameters |
| --- | --- |
| Voxel grid | $O((\text{diam}(\Omega)/\epsilon_{\text{spatial}})^3)$ |
| NeRF (MLP) | $O(S \cdot (L+1)^2)$ |
| 3DGS | $O(S \cdot (10 + (L+1)^2))$ |
| SDF (MLP) | $O(S)$ for geometry $+\; O(S \cdot (L+1)^2)$ for reflectance |

The key insight is that 3DGS scales with the number of scatterers $S$, not the volume resolution. For sparse scenes (few scatterers in a large volume), 3DGS is far more efficient than voxel grids. For dense scenes (many small scatterers), the voxel grid or NeRF may be more compact.

3DGS allocates parameters where the scene has structure (at the scatterers) and uses zero parameters in empty space. Voxel grids allocate parameters uniformly in space. NeRFs encode the scene compactly in network weights but require expensive per-point evaluation.
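The scaling argument can be made concrete with a back-of-the-envelope parameter counter. The function and its example numbers are illustrative (constants from the $O(\cdot)$ bounds are dropped):

```python
def param_counts(S, L, diam_m, eps_spatial_m):
    """Order-of-magnitude parameter counts from the complexity table.

    S: number of scatterers; L: spherical-harmonic order;
    diam_m / eps_spatial_m: scene diameter and voxel size in metres.
    """
    ang = (L + 1) ** 2                       # angular (SH) coefficients
    n_vox = int(round(diam_m / eps_spatial_m))  # voxels per dimension
    return {
        "voxel": n_vox ** 3,                 # uniform allocation in space
        "nerf":  S * ang,                    # compact MLP encoding
        "3dgs":  S * (10 + ang),             # ~10 geometric params/Gaussian
        "sdf":   S + S * ang,                # geometry + reflectance
    }

# Sparse scene: 100 scatterers, order-3 harmonics, 20 m room at 5 cm voxels
counts = param_counts(S=100, L=3, diam_m=20.0, eps_spatial_m=0.05)
```

For this example the voxel grid needs $400^3 = 64$ million parameters while 3DGS needs a few thousand, which is the sparsity advantage the theorem formalises.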


NeRF vs 3DGS vs SDF Performance Comparison

Compare the three representations across speed, accuracy, and compactness as a function of scene complexity (number of scatterers).


Common Mistake: Overfitting with Adaptive Density Control

Mistake:

Allowing unrestricted Gaussian densification when RF measurements are sparse ($M < 100$), leading to overfitting where Gaussians cluster around measurement locations.

Correction:

With sparse RF data, adaptive density control must be constrained:

  • Cap the total number of Gaussians: $N_{\max} \leq c \cdot M$ where $c \sim 5$--$10$. This prevents the model from allocating more degrees of freedom than the data can constrain.
  • Increase the pruning threshold: use $\epsilon_\alpha = 0.05$ instead of $0.005$.
  • Add regularisation: Total-variation or Laplacian smoothness penalty on the power map to prevent spatially discontinuous predictions.
  • Use visual priors (RFCanvas approach) to constrain geometry when available.
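The first two constraints can be sketched as a selection step over the current Gaussian set. All names and thresholds here are illustrative, not taken from the RF-3DGS papers:

```python
import numpy as np

def constrained_densify(opacities, grads, M, c=5, eps_alpha=0.05,
                        grad_thresh=2e-4):
    """Sketch of sparse-data density control.

    Returns boolean masks over the Gaussians: which to prune (low opacity)
    and which to split (high view-space gradient), while honouring the
    cap N_max <= c * M on the total Gaussian count.
    """
    n = len(opacities)
    prune = opacities < eps_alpha            # raised threshold for sparse data
    n_max = c * M                            # degrees-of-freedom cap
    budget = max(0, n_max - int((~prune).sum()))
    candidates = np.where(~prune & (grads > grad_thresh))[0]
    # Split only the highest-gradient Gaussians that fit in the budget.
    split = np.zeros(n, dtype=bool)
    split[candidates[np.argsort(grads[candidates])[::-1][:budget]]] = True
    return prune, split
```

Pruning is applied before budgeting so that freed capacity can be reused by splits; a total-variation penalty on the rendered power map would be added separately in the training loss.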

Open Research Questions

Several fundamental questions remain open for Gaussian Splatting in RF imaging:

  1. Coherent vs incoherent rendering. Current RF-3DGS methods primarily predict power (incoherent). Extending to full complex channel prediction requires tracking phase through the splatting pipeline, which is complicated by the non-differentiability of phase wrapping.

  2. Dynamic environments. RF environments change as people move, doors open, and vehicles pass. Dynamic 3DGS (D-3DGS) exists for optics but has not been adapted for RF. The challenge is that RF changes are caused by objects at wavelength scale, not just geometric deformations.

  3. Scalability. Current methods work for single rooms or short road segments. Scaling to building-level or city-level requires hierarchical or tiled representations with efficient inter-tile communication.

  4. Theoretical guarantees. Unlike classical estimators (LASSO, OAMP) for which recovery guarantees exist (Chapters 14, 17), no analogous theory exists for Gaussian splatting reconstruction. When does the optimisation converge? How many measurements are sufficient?

  5. Integration with the forward model. Can the Gaussian representation be integrated directly into the unified forward model $\mathbf{y} = \mathbf{A}\mathbf{c} + \mathbf{w}$ of Chapter 7, replacing the voxel grid? What are the computational and statistical implications?
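For open question 1, a workaround used in other phase-regression settings (a sketch, not an established RF-3DGS technique) is to supervise the unit phasor $e^{j\phi}$ rather than the wrapped angle, which removes the $\pm\pi$ discontinuity from the loss:

```python
import numpy as np

def wrapped_phase_loss(phi_pred, phi_true):
    """Naive loss on wrapped angles: discontinuous wherever the error
    crosses +/- pi, which breaks gradient-based optimisation."""
    err = np.mod(phi_pred - phi_true + np.pi, 2 * np.pi) - np.pi
    return np.mean(err ** 2)

def complex_phase_loss(phi_pred, phi_true):
    """Smooth surrogate: compare unit phasors e^{j phi} instead of angles.
    |e^{ja} - e^{jb}|^2 = 2 - 2 cos(a - b) is differentiable everywhere
    and invariant to 2*pi wrapping."""
    diff = np.exp(1j * phi_pred) - np.exp(1j * phi_true)
    return np.mean(np.abs(diff) ** 2)
```

In an actual splatting pipeline this loss would be written in an autodiff framework; NumPy is used here only to illustrate that the phasor loss is wrap-invariant.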

Historical Note: Timeline of Neural Scene Representations for RF

2020--2024

The application of neural scene representations to RF is a very recent development:

  • 2020: Mildenhall et al. introduce NeRF for novel view synthesis in computer vision.
  • 2022: Zhao et al. adapt NeRF for RF propagation prediction (NeRF2, later RF-NeRF), demonstrating that implicit neural representations can model radio environments.
  • 2023: Kerbl et al. introduce 3D Gaussian Splatting at SIGGRAPH, achieving real-time rendering quality comparable to NeRF.
  • 2024: Zhang et al. (RF-3DGS), Chen et al. (RFCanvas), and Niedermayr et al. (RadarSplat) simultaneously adapt 3DGS for RF power prediction, multi-modal RF mapping, and automotive radar.
  • 2024: Dong et al. (GSpaRC) add physics-based propagation to Gaussian splatting for automotive radar.

The field has moved from concept to multiple practical systems in under two years, driven by the demand for fast, accurate RF environment models for 5G/6G network planning and autonomous driving.


Quick Check

For an indoor mmWave (28 GHz) environment with mostly metallic furniture and glass partitions, which representation would you expect to give the best reconstruction quality?

  • NeRF (RF-NeRF), because it handles complex volumetric scattering
  • 3DGS, because it offers real-time rendering
  • SDF (GeRaF), because specular scattering at mmWave is surface-dominated
  • Voxel grid, because it requires no training

⚠️ Engineering Note

Deployment Considerations for Neural RF Representations

Deploying neural scene representations for RF in production systems requires addressing several practical concerns:

  1. Model update frequency. Indoor environments change on the scale of minutes (people, furniture). Outdoor urban environments change on the scale of hours (traffic, weather). The model must be retrained or fine-tuned at these timescales. 3DGS is advantageous here: adding/removing a few Gaussians is faster than retraining an entire NeRF.

  2. Computational resources. 3DGS rendering requires a GPU. Edge deployment on base stations or vehicles may need inference on embedded GPUs (Jetson, mobile SoCs). Reducing the number of Gaussians (via GSpaRC sparsity) is critical for edge deployment.

  3. Standardisation. There is no standard format for neural RF scene representations. Interoperability between tools (Sionna, WinProp, RF-3DGS) requires either standardised export formats or adapter layers. The Open3D and PLY formats used by optical 3DGS are a starting point.
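As a starting point for interchange, an RF Gaussian set can be serialised to an ASCII PLY file. The attribute names below are an ad-hoc convention invented for this sketch, since no RF standard exists:

```python
import numpy as np

def export_gaussians_ply(path, means, scales, opacities, rf_gain_db):
    """Write RF Gaussians to an ASCII PLY file.

    means, scales: (N, 3) arrays; opacities, rf_gain_db: (N,) arrays.
    The 'rf_gain_db' property is a hypothetical extension of the
    PLY vertex attributes used by optical 3DGS tools.
    """
    n = len(means)
    props = ["x", "y", "z", "scale_x", "scale_y", "scale_z",
             "opacity", "rf_gain_db"]
    header = ["ply", "format ascii 1.0", f"element vertex {n}"]
    header += [f"property float {p}" for p in props]
    header.append("end_header")
    rows = np.hstack([means, scales,
                      opacities[:, None], rf_gain_db[:, None]])
    with open(path, "w") as f:
        f.write("\n".join(header) + "\n")
        for row in rows:
            f.write(" ".join(f"{v:.6f}" for v in row) + "\n")
```

A binary PLY or a compressed tiled format would be preferable at city scale, but ASCII keeps the example inspectable with any text editor.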

Practical Constraints
  • GPU required for real-time rendering (NVIDIA RTX or equivalent)

  • Model retraining needed when environment changes significantly

  • No standardised format for RF scene interchange


Explicit Scene Representation

A scene representation where the scene parameters (positions, shapes, attributes) are stored directly and can be read, modified, and rendered without evaluating a neural network. 3D Gaussian Splatting is an explicit representation; NeRF and SDF networks are implicit representations.

Related: Splatting, Analysis Through Synthesis

Key Takeaway

The choice between NeRF, 3DGS, and SDF for RF imaging depends on the application: 3DGS excels in speed and interpretability, making it ideal for real-time digital twins and dynamic scenes; NeRF handles volumetric diffuse scattering well; SDF is best for specular-dominated mmWave environments. Fundamental open questions remain around coherent rendering, dynamic environments, scalability to large scenes, and theoretical reconstruction guarantees. The field is evolving rapidly, and hybrid approaches that combine strengths of multiple representations are a promising direction.