GeRaF: Geometry from mmWave Radar
From NeRF to GeRaF: The Lensless Imaging Challenge
NeRF (Chapter 24) reconstructs 3D scenes from posed camera images by differentiating through volumetric rendering. The key insight is that each pixel in a camera image corresponds to a single ray through the scene --- the camera lens provides a one-to-one mapping from directions to pixels.
Radar has no lens. Each radar measurement integrates contributions from all scatterers within the antenna beam, weighted by the round-trip propagation delay and beamforming gain. This lensless imaging model means that NeRF's ray-based sampling is not directly applicable. GeRaF (Lu et al., 2024) resolves this by working with matched-filter (MF) power images instead of raw measurements: the MF concentrates energy at the correct spatial locations, producing a pseudo-image that can be rendered differentiably through the neural SDF.
Definition: Matched-Filter Power Image
Given radar measurements $\mathbf{y}$ and the sensing matrix $\mathbf{A}$, the matched-filter (MF) image is (cf. Chapter 13):
$$\hat{x}_{\mathrm{MF}}(\mathbf{x}) = \big[\tilde{\mathbf{A}}^{\mathsf{H}} \mathbf{y}\big]_{\mathbf{x}},$$
where $\tilde{\mathbf{A}}$ normalises the columns of $\mathbf{A}$ to unit norm. The MF power image is the squared magnitude:
$$P(\mathbf{x}) = \big|\hat{x}_{\mathrm{MF}}(\mathbf{x})\big|^{2}.$$
The MF power image concentrates energy at scatterer locations but has limited resolution (governed by the point spread function) and contains sidelobes. GeRaF treats $P(\mathbf{x})$ as the "observation" that the neural SDF must explain.
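The MF image and its power can be sketched in a few lines of NumPy. The measurement dimensions, random sensing matrix, and scatterer location below are illustrative assumptions, not values from the text:

```python
import numpy as np

# Toy sensing setup: M measurements, K voxels (illustrative sizes).
rng = np.random.default_rng(0)
M, K = 64, 100
A = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))

# Column-normalised sensing matrix (each column has unit l2 norm).
A_tilde = A / np.linalg.norm(A, axis=0, keepdims=True)

# Ground truth: a single scatterer at voxel 42 with unit reflectivity.
x_true = np.zeros(K, dtype=complex)
x_true[42] = 1.0
y = A @ x_true

# Matched filter x_MF = A~^H y; the MF power image is its squared magnitude.
x_mf = A_tilde.conj().T @ y
P = np.abs(x_mf) ** 2

# Energy concentrates at the true scatterer location, with sidelobes elsewhere.
print(int(np.argmax(P)))  # → 42
```

The peak sits at the true voxel while the remaining entries form the sidelobe floor, which is exactly the limited-resolution "observation" the neural SDF must explain.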
Definition: GeRaF Scene Model
GeRaF represents the scene with two neural networks:
- Geometry network (neural SDF): $f_\theta : \mathbb{R}^3 \to \mathbb{R}$, with the surface at $f_\theta(\mathbf{x}) = 0$.
- Reflectivity network: $\rho_\theta : \mathbb{R}^3 \to \mathbb{R}_{\ge 0}$, predicting the surface reflectivity at each point.
Both networks share a positional encoding and are trained jointly. The SDF determines where scattering occurs (the surface); the reflectivity network determines how strongly the surface scatters.
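A minimal sketch of the two-network decomposition, substituting an analytic sphere SDF and a constant reflectivity field for the positionally-encoded MLPs of the actual method (the radius and reflectivity values are illustrative assumptions):

```python
import numpy as np

def f_sdf(x, radius=0.5):
    """Stand-in geometry network: signed distance to a sphere of the
    given radius centred at the origin (negative inside, zero on surface,
    positive outside)."""
    return np.linalg.norm(x, axis=-1) - radius

def rho(x):
    """Stand-in reflectivity network: a uniform reflectivity field."""
    return np.full(x.shape[:-1], 0.8)

# Evaluate at an interior, on-surface, and exterior point.
pts = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(f_sdf(pts))  # [-0.5, 0.0, 0.5]: inside, on the surface, outside
```

The division of labour is the point: the sign structure of `f_sdf` localises the surface, while `rho` only modulates how strongly that surface scatters.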
Definition: RF Volumetric Rendering for Lensless Imaging
GeRaF renders the MF power image at voxel $\mathbf{x}$ by integrating along the line $\ell_{t,r}(\mathbf{x})$ from each Tx--Rx pair $(t,r)$ through $\mathbf{x}$. The predicted MF power is:
$$\hat{P}_\theta(\mathbf{x}) = \sum_{(t,r)} \int_{\ell_{t,r}(\mathbf{x})} g_{t,r}(\mathbf{x}')\, \rho_\theta(\mathbf{x}')\, \delta_\epsilon\!\big(f_\theta(\mathbf{x}')\big)\, \mathrm{d}\mathbf{x}',$$
where:
- $g_{t,r}(\mathbf{x})$ encodes the beamforming gain and propagation loss for the Tx--Rx pair $(t,r)$ at voxel $\mathbf{x}$,
- $\delta_\epsilon(\cdot)$ is a smooth approximation to the Dirac delta, concentrating the scattering contribution near the surface $f_\theta(\mathbf{x}) = 0$.
The parameter $\epsilon$ controls the "surface thickness": as $\epsilon \to 0$, scattering concentrates on the zero level set.
The key difference from NeRF: in NeRF, each camera pixel corresponds to one ray, and density is integrated along that ray. In GeRaF, each "pixel" in the MF power image already aggregates contributions from all Tx--Rx pairs, and the rendering maps the neural SDF to this aggregated quantity.
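The role of the smooth delta can be illustrated with a simplified render that evaluates the integrand at the voxel itself rather than integrating along each Tx--Rx line; the Laplace-kernel delta, gain values, and reflectivity below are illustrative assumptions:

```python
import numpy as np

def delta_eps(s, eps=0.05):
    """Smooth approximation to the Dirac delta (Laplace kernel, unit area):
    peaks at s = 0 and decays rapidly away from the surface."""
    return np.exp(-np.abs(s) / eps) / (2 * eps)

def f_sdf(x):
    """Analytic sphere SDF (radius 0.5) standing in for the geometry MLP."""
    return np.linalg.norm(x, axis=-1) - 0.5

# Gains g_{t,r}(x) for four hypothetical Tx-Rx pairs at one voxel, and a
# uniform reflectivity; the rendered power sums their surface contributions.
gains = np.array([1.0, 0.9, 0.8, 0.7])
rho = 0.8

x_on = np.array([0.5, 0.0, 0.0])   # on the surface: f_sdf = 0
x_off = np.array([1.0, 0.0, 0.0])  # far from the surface: f_sdf = 0.5

P_on = np.sum(gains * rho * delta_eps(f_sdf(x_on)))
P_off = np.sum(gains * rho * delta_eps(f_sdf(x_off)))
print(P_on > 100 * P_off)  # → True: surface voxels dominate rendered power
```

This makes the aggregation difference from NeRF concrete: every Tx--Rx pair contributes to the same voxel's predicted power, rather than one ray mapping to one pixel.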
Theorem: GeRaF Training Objective
The GeRaF training loss is:
$$\mathcal{L}(\theta) = \sum_{\mathbf{x}} \big( P(\mathbf{x}) - \hat{P}_\theta(\mathbf{x}) \big)^2 + \lambda\, \mathcal{L}_{\mathrm{eik}}(\theta),$$
where the Eikonal regulariser is
$$\mathcal{L}_{\mathrm{eik}}(\theta) = \frac{1}{|\mathcal{S}|} \sum_{\mathbf{x} \in \mathcal{S}} \big( \|\nabla_{\mathbf{x}} f_\theta(\mathbf{x})\|_2 - 1 \big)^2,$$
with $\mathcal{S}$ a set of random 3D sample points.
The first term enforces data fidelity: the rendered power must match the observed MF power. The second term enforces geometric validity: the neural network must output a valid SDF.
Without the Eikonal regulariser, the network could satisfy the data term by producing an arbitrary scalar field that happens to match the power image but has no geometric meaning. The Eikonal loss constrains the output to be a distance function, ensuring that the zero level set defines a meaningful surface.
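The distinction between a valid SDF and an arbitrary scalar field with the same zero level set can be checked numerically. The sketch below uses central finite differences in place of automatic differentiation, and compares a true sphere SDF against its cube, which has an identical surface but violates the unit-gradient property:

```python
import numpy as np

def grad_fd(f, x, h=1e-5):
    """Central finite-difference gradient of a scalar field f at points x
    (stands in for automatic differentiation of the SDF network)."""
    g = np.zeros_like(x)
    for i in range(x.shape[-1]):
        e = np.zeros(x.shape[-1]); e[i] = h
        g[..., i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def eikonal_loss(f, samples):
    """Mean squared deviation of the gradient norm from 1."""
    norms = np.linalg.norm(grad_fd(f, samples), axis=-1)
    return np.mean((norms - 1.0) ** 2)

sphere_sdf = lambda x: np.linalg.norm(x, axis=-1) - 0.5      # valid SDF
not_sdf = lambda x: (np.linalg.norm(x, axis=-1) - 0.5) ** 3  # same zero set

rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(1000, 3))
print(eikonal_loss(sphere_sdf, pts))      # ≈ 0: unit gradient everywhere
print(eikonal_loss(not_sdf, pts) > 0.1)   # → True: Eikonal violated
```

Both fields describe the same sphere, so a data term alone cannot distinguish them; only the Eikonal penalty rules out the second.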
Data fidelity term
The L2 loss between observed and rendered MF power is the standard regression loss. Since both $P(\mathbf{x})$ and $\hat{P}_\theta(\mathbf{x})$ are non-negative, this term is well-defined and differentiable with respect to $\theta$ via the chain rule through the rendering equation and the neural networks.
Eikonal regulariser
The Eikonal term penalises deviations of $\|\nabla_{\mathbf{x}} f_\theta(\mathbf{x})\|_2$ from $1$. The gradient is computed by automatic differentiation, making the regulariser fully differentiable. The sample points $\mathcal{S}$ are drawn uniformly in the scene volume and, additionally, concentrated near the current zero level set (importance sampling).
End-to-end training
Both terms are differentiable with respect to $\theta$. Training proceeds by gradient descent (Adam), either alternating between updating the geometry network (SDF) and the reflectivity network, or updating both simultaneously.
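A drastically reduced end-to-end sketch: fit a single geometry parameter (a sphere radius) by gradient descent on an L2 data-fidelity loss. The Gaussian surface kernel, 1D voxel grid, and plain finite-difference gradient descent are illustrative simplifications standing in for the full MLP and Adam:

```python
import numpy as np

def delta_eps(s, eps=0.3):
    """Gaussian surface kernel standing in for the smooth delta."""
    return np.exp(-(s / eps) ** 2)

pts = np.linspace(0.0, 1.5, 64)      # voxel radii along one axis
P_obs = delta_eps(pts - 0.5)         # "observed" power for true radius 0.5

def loss(r):
    """L2 data-fidelity between rendered and observed power."""
    return np.mean((delta_eps(pts - r) - P_obs) ** 2)

# Gradient descent on the radius from a deliberately wrong initial guess.
r, lr, h = 0.8, 0.1, 1e-4
for _ in range(2000):
    g = (loss(r + h) - loss(r - h)) / (2 * h)  # finite-difference gradient
    r -= lr * g
print(round(r, 2))  # → 0.5: recovers the true radius
```

Even in this one-parameter caricature, the mechanism is the same as in full training: the data term alone pulls the geometry toward the configuration whose rendered power matches the observation.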
GeRaF Pipeline: MF Power to SDF Reconstruction
Visualise the GeRaF reconstruction pipeline. The left panel shows the input MF power image; the middle panel shows the evolving neural SDF during training; the right panel shows the extracted zero level set. Adjust the number of training iterations and the Eikonal regularisation weight.
MF Power Image Resolution vs. SDF Reconstruction
Compare the MF power image (limited by the PSF resolution) with the SDF-based surface reconstruction. Observe how the neural SDF recovers sharper geometry than the MF image alone by leveraging the Eikonal constraint and the surface prior.
Example: GeRaF Reconstruction of a Spherical Scatterer
A spherical scatterer of radius $r$ is located at the origin. A mmWave MIMO array at standoff distance $R$ operates at carrier frequency $f_c$ with bandwidth $B$. Compute the MF power image and show how the GeRaF loss drives the neural SDF toward the true sphere SDF.
MF power image
The matched filter concentrates energy at the sphere's location, producing a peak at the origin of width $\approx \lambda R / D$ in cross-range and $c/(2B)$ in range, where $D$ is the array aperture. For a uniform planar array with half-wavelength element spacing, $D = (N-1)\lambda/2$ per side, so at mmWave wavelengths the resulting resolution cells are on the centimetre scale.
The MF power image is therefore a blurred version of the true reflectivity, with resolution far coarser than the object size.
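The resolution formulas can be evaluated directly. The carrier, bandwidth, array size, and standoff distance below are illustrative assumptions (e.g. an automotive-band configuration), not values from the text:

```python
c = 3e8          # speed of light (m/s)
f_c = 77e9       # assumed mmWave carrier frequency (Hz)
B = 2e9          # assumed bandwidth (Hz)
N = 64           # assumed elements per UPA side
R = 1.0          # assumed standoff distance (m)

lam = c / f_c                 # wavelength, ~3.9 mm at 77 GHz
D = (N - 1) * lam / 2         # aperture with half-wavelength spacing

delta_range = c / (2 * B)     # range resolution: c / (2B)
delta_cross = lam * R / D     # cross-range resolution at distance R

print(f"range res: {delta_range*100:.1f} cm, "
      f"cross-range res: {delta_cross*100:.1f} cm")
```

Both cells come out at the centimetre scale under these assumptions, much coarser than sub-wavelength surface detail, which is why the SDF prior and Eikonal constraint carry so much of the reconstruction burden.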
Neural SDF initialisation
The SDF network is initialised to approximate a large sphere centred at the origin, $f_\theta(\mathbf{x}) \approx \|\mathbf{x}\| - r_0$ with $r_0$ large (following geometric initialisation from Atzmon and Lipman, 2020). This ensures the initial zero level set encloses the scene.
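The enclosure property of the geometric initialisation is easy to verify numerically; the initial radius $r_0 = 2$ and the unit-cube scene volume below are illustrative choices, not values from the text:

```python
import numpy as np

# Geometric initialisation: start the SDF as an approximate sphere
# ||x|| - r0 whose zero level set encloses the whole scene volume.
r0 = 2.0
f_init = lambda x: np.linalg.norm(x, axis=-1) - r0

# Sample the assumed scene volume, here the cube [-1, 1]^3.
scene_pts = np.random.default_rng(2).uniform(-1, 1, size=(500, 3))
print(bool(np.all(f_init(scene_pts) < 0)))  # → True: scene is inside
```

Every scene point has negative signed distance, so training begins with a closed surface that can only contract inward toward the data, which is what makes the subsequent level-set evolution well behaved.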
Training convergence
As Adam training proceeds:
- The data fidelity loss decreases as the rendered MF power matches the observed MF power.
- The Eikonal loss stays near zero, maintaining SDF validity.
- The zero level set contracts from the large initial sphere toward the true sphere of radius $r$.
- The reflectivity network converges to a uniform value matching the sphere's radar cross-section.
GeRaF: Geometry Reconstruction from mmWave Radar
Lu and Caire introduced GeRaF, the first method to reconstruct 3D geometry from mmWave radar using neural signed distance functions. The key insight is that matched-filter power images --- readily computable from standard radar measurements --- serve as a differentiable "pseudo-image" that bridges the gap between lensless radar and lens-based camera imaging (NeRF).
GeRaF jointly estimates geometry (via a neural SDF) and surface reflectivity from multi-view radar data, demonstrating that sub-wavelength surface recovery is possible when the Eikonal constraint is enforced. This work establishes the SDF representation as the foundation for neural RF scene reconstruction, extended in Chapters 26 (Gaussian splatting) and later chapters.
Computational Cost of GeRaF Training
GeRaF training involves: (1) evaluating the neural SDF at sampled 3D points, (2) computing the rendered MF power via the lensless rendering equation, (3) computing the Eikonal loss via automatic differentiation of the SDF gradient, and (4) backpropagating through all three.
For a scene discretised into $N^3$ voxels, each training iteration requires on the order of $N^3$ forward passes through the MLP plus gradient computations for the Eikonal term. On a single GPU (NVIDIA A100), training typically converges in minutes for simple scenes, with complex indoor environments taking considerably longer.
- GPU memory: the Eikonal gradient computation requires storing intermediate activations.
- Batch size: limited by GPU memory; a typical batch is 4096 points.
- MLP depth: deeper networks (8 layers) improve quality but slow training.
Common Mistake: Omitting Eikonal Regularization in GeRaF
Mistake:
Training the GeRaF model without the Eikonal regulariser (setting $\lambda = 0$), expecting the data fidelity term alone to produce a valid SDF.
Correction:
Without Eikonal regularisation, the geometry network produces an arbitrary scalar field that fits the MF power but has no distance-function structure. The zero level set becomes noisy and disconnected, with self-intersecting surfaces. Even a small regularisation weight $\lambda > 0$ dramatically improves surface quality.
Why This Matters: Matched-Filter Power Images in ISAC Systems
In an ISAC system (Chapter 29), the communication waveform doubles as a sensing waveform. The matched-filter power image can be computed as a byproduct of channel estimation, with no additional pilot overhead. GeRaF's ability to reconstruct 3D geometry from these MF power images means that ISAC systems can provide environment maps --- enabling beam prediction, blockage forecasting, and handover optimisation --- using only the standard communication signalling.
See full treatment in ISAC Fundamentals
Key Takeaway
GeRaF bridges the gap between NeRF (camera-based) and radar imaging by using matched-filter power images as a differentiable pseudo-observation. Joint optimisation of a neural SDF (geometry) and a reflectivity network, regularised by the Eikonal constraint, recovers 3D surfaces from multi-view mmWave radar data.