GeRaF: Geometry from mmWave Radar

From NeRF to GeRaF: The Lensless Imaging Challenge

NeRF (Chapter 24) reconstructs 3D scenes from posed camera images by differentiating through volumetric rendering. The key insight is that each pixel in a camera image corresponds to a single ray through the scene --- the camera lens provides a one-to-one mapping from directions to pixels.

Radar has no lens. Each radar measurement integrates contributions from all scatterers within the antenna beam, weighted by the round-trip propagation delay and beamforming gain. This lensless imaging model means that NeRF's ray-based sampling is not directly applicable. GeRaF (Lu et al., 2024) resolves this by working with matched-filter (MF) power images instead of raw measurements: the MF concentrates energy at the correct spatial locations, producing a pseudo-image that can be rendered differentiably through the neural SDF.


Definition:

Matched-Filter Power Image

Given radar measurements $\mathbf{y}$ and the sensing matrix $\mathbf{A}$, the matched-filter (MF) image is (cf. Chapter 13):

$$\hat{\mathbf{c}}^{\text{BP}} = \mathbf{D}^{-1} \mathbf{A}^{H} \mathbf{y},$$

where $\mathbf{D} = \text{diag}(\mathbf{A}^{H} \mathbf{A})$ normalises the columns. The MF power image is the squared magnitude:

$$P_{\text{MF}}(\mathbf{p}_{q}) = |\hat{c}^{\text{BP}}_q|^2, \qquad q = 1, \ldots, Q.$$

The MF power image concentrates energy at scatterer locations but has limited resolution (governed by the point spread function) and contains sidelobes. GeRaF treats $P_{\text{MF}}$ as the "observation" that the neural SDF must explain.
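The MF image is a single normalised back-projection, so it is cheap to compute. A minimal numpy sketch, with an illustrative random sensing matrix and a single scatterer (sizes and the scatterer index are assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

M, Q = 64, 100  # measurements, voxels (illustrative sizes)
A = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))  # sensing matrix

# Ground truth: a single unit scatterer at voxel 42
c_true = np.zeros(Q, dtype=complex)
c_true[42] = 1.0
y = A @ c_true  # noiseless measurements

# MF image: c_hat_q = a_q^H y / ||a_q||^2, i.e. D^{-1} A^H y with D = diag(A^H A)
D = np.real(np.sum(np.conj(A) * A, axis=0))  # column energies ||a_q||^2
c_hat = (A.conj().T @ y) / D
P_MF = np.abs(c_hat) ** 2  # MF power image

print(int(np.argmax(P_MF)))  # energy concentrates at the true voxel: 42
```

The column normalisation makes the peak value exactly 1 at the true scatterer; the remaining entries are the sidelobes of the point spread function.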

Definition:

GeRaF Scene Model

GeRaF represents the scene with two neural networks:

  1. Geometry network (neural SDF): $f_\theta(\mathbf{p}) \in \mathbb{R}$, with the surface at $\{\mathbf{p} : f_\theta(\mathbf{p}) = 0\}$.

  2. Reflectivity network: $\Gamma_\phi(\mathbf{p}) \in \mathbb{R}_+$, predicting the surface reflectivity at each point.

Both networks share a positional encoding $\gamma(\mathbf{p})$ and are trained jointly. The SDF determines where scattering occurs (the surface); the reflectivity network determines how strongly the surface scatters.
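A toy sketch of this structure: a shared sin/cos positional encoding feeding two heads, one scalar SDF head and one non-negative reflectivity head. The encoding length, the single linear layer per head, and the softplus non-negativity are illustrative simplifications, not the architecture from the paper:

```python
import numpy as np

L = 4  # number of frequency bands in the positional encoding (assumed)

def gamma(p):
    """Shared positional encoding gamma(p): sin/cos at octave frequencies."""
    freqs = 2.0 ** np.arange(L) * np.pi
    angles = np.outer(freqs, p).ravel()  # (L*3,) for a 3D point
    return np.concatenate([np.sin(angles), np.cos(angles)])

rng = np.random.default_rng(1)
d = 2 * L * 3  # encoding dimension
W_geo = rng.standard_normal((1, d))
W_ref = rng.standard_normal((1, d))

def f_theta(p):
    """Geometry head: unconstrained scalar (candidate SDF value)."""
    return float(W_geo @ gamma(p))

def Gamma_phi(p):
    """Reflectivity head: non-negative via softplus."""
    return float(np.log1p(np.exp(W_ref @ gamma(p))))
```

In practice both heads would be multi-layer MLPs; the point here is only the shared-encoding, two-output structure.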

Definition:

RF Volumetric Rendering for Lensless Imaging

GeRaF renders the MF power image at voxel $\mathbf{p}_{q}$ by integrating along the line from each Tx--Rx pair through $\mathbf{p}_{q}$. The predicted MF power is:

$$\hat{P}_{\text{MF}}(\mathbf{p}_{q}; \theta, \phi) = \sum_{m=1}^{N_{\text{Tx}}} \sum_{n=1}^{N_{\text{Rx}}} w_{mn}(\mathbf{p}_{q})\,\Gamma_\phi(\mathbf{p}_{q})\,\delta_\sigma(f_\theta(\mathbf{p}_{q})),$$

where:

  • $w_{mn}(\mathbf{p}_{q})$ encodes the beamforming gain and propagation loss for the $(m,n)$ Tx--Rx pair at voxel $q$,
  • $\delta_\sigma(s) = \frac{1}{\sigma}\exp(-s^2/(2\sigma^2))$ is a smooth approximation to the Dirac delta, concentrating the scattering contribution near the surface $f_\theta = 0$.

The parameter $\sigma$ controls the "surface thickness": as $\sigma \to 0$, scattering concentrates on the zero level set.
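The rendering equation can be sketched directly. The following uses the exact sphere SDF as a stand-in for $f_\theta$ and uniform unit weights $w_{mn}$ (both simplifying assumptions) to show how the smooth delta concentrates predicted power on the surface:

```python
import numpy as np

def delta_sigma(s, sigma=0.05):
    """Smooth Dirac approximation: (1/sigma) exp(-s^2 / (2 sigma^2))."""
    return np.exp(-s**2 / (2 * sigma**2)) / sigma

def sdf_sphere(p, R=0.1):
    """Stand-in for the geometry network f_theta: exact sphere SDF."""
    return np.linalg.norm(p) - R

def render_mf_power(p_q, Gamma=1.0, n_tx=4, n_rx=4, sigma=0.05):
    """Predicted MF power at voxel p_q, with uniform unit weights w_mn."""
    s = sdf_sphere(p_q)
    return n_tx * n_rx * Gamma * delta_sigma(s, sigma)

on_surface = render_mf_power(np.array([0.1, 0.0, 0.0]))   # on the zero level set
off_surface = render_mf_power(np.array([0.3, 0.0, 0.0]))  # 0.2 m off the surface
print(on_surface > off_surface)  # power concentrates on the surface
```

Shrinking `sigma` sharpens this concentration, at the cost of a harder optimisation landscape early in training.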

The key difference from NeRF: in NeRF, each camera pixel corresponds to one ray, and density $\sigma(\mathbf{p})$ is integrated along that ray. In GeRaF, each "pixel" in the MF power image already aggregates contributions from all Tx--Rx pairs, and the rendering maps the neural SDF to this aggregated quantity.

Theorem: GeRaF Training Objective

The GeRaF training loss is:

$$\mathcal{L}(\theta, \phi) = \frac{1}{Q}\sum_{q=1}^{Q} \bigl(P_{\text{MF}}(\mathbf{p}_{q}) - \hat{P}_{\text{MF}}(\mathbf{p}_{q}; \theta, \phi)\bigr)^2 + \lambda_{\text{eik}}\,\mathcal{L}_{\text{eik}}(\theta),$$

where the Eikonal regularizer is

$$\mathcal{L}_{\text{eik}}(\theta) = \frac{1}{|\mathcal{P}|} \sum_{\mathbf{p} \in \mathcal{P}} \bigl(\|\nabla f_\theta(\mathbf{p})\| - 1\bigr)^2,$$

with $\mathcal{P}$ a set of random 3D sample points.
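A Monte-Carlo sketch of the Eikonal loss, using central finite differences in place of automatic differentiation (an assumption for self-containedness). A true SDF has unit gradient norm everywhere, so its loss is near zero; a scaled field violates the constraint:

```python
import numpy as np

def grad_norm(f, p, h=1e-4):
    """||grad f(p)|| via central finite differences."""
    g = np.array([(f(p + h * e) - f(p - h * e)) / (2 * h) for e in np.eye(3)])
    return np.linalg.norm(g)

def eikonal_loss(f, points):
    """Mean squared deviation of the gradient norm from 1."""
    return np.mean([(grad_norm(f, p) - 1.0) ** 2 for p in points])

rng = np.random.default_rng(2)
pts = rng.uniform(-1, 1, size=(256, 3))  # random sample set P

sdf = lambda p: np.linalg.norm(p) - 0.1          # true sphere SDF: ||grad|| = 1
field = lambda p: 3.0 * (np.linalg.norm(p) - 0.1)  # scaled field: ||grad|| = 3

print(eikonal_loss(sdf, pts))    # ~0: valid SDF
print(eikonal_loss(field, pts))  # ~(3-1)^2 = 4: not a distance function
```

In a real training loop the gradient $\nabla f_\theta$ comes from autodiff, but the loss itself is exactly this scalar penalty averaged over random points.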

The first term enforces data fidelity: the rendered power must match the observed MF power. The second term enforces geometric validity: the neural network must output a valid SDF.

Without the Eikonal regulariser, the network could satisfy the data term by producing an arbitrary scalar field that happens to match the power image but has no geometric meaning. The Eikonal loss constrains the output to be a distance function, ensuring that the zero level set defines a meaningful surface.

GeRaF Pipeline: MF Power to SDF Reconstruction

Visualise the GeRaF reconstruction pipeline. The left panel shows the input MF power image; the middle panel shows the evolving neural SDF during training; the right panel shows the extracted zero level set. Adjust the number of training iterations and the Eikonal regularisation weight.

Parameters: training iterations = 200, Eikonal weight $\lambda_{\text{eik}} = 0.1$.

MF Power Image Resolution vs. SDF Reconstruction

Compare the MF power image (limited by the PSF resolution) with the SDF-based surface reconstruction. Observe how the neural SDF recovers sharper geometry than the MF image alone by leveraging the Eikonal constraint and the surface prior.


Example: GeRaF Reconstruction of a Spherical Scatterer

A spherical scatterer of radius $R = 0.1$ m is located at the origin. A $4 \times 4$ mmWave MIMO array at distance $2$ m operates at $f_0 = 60$ GHz with $W = 4$ GHz. Compute the MF power image and show how the GeRaF loss drives the neural SDF toward the true sphere SDF.
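Some back-of-envelope numbers for this example: the MF range resolution $c/(2W)$ sets the scale of the point spread function the neural SDF must sharpen, and the target the geometry network should converge to is the exact sphere SDF $\|\mathbf{p}\| - R$:

```python
import numpy as np

c0 = 3e8       # speed of light (m/s)
W = 4e9        # bandwidth (Hz)
f0 = 60e9      # carrier frequency (Hz)
R = 0.1        # sphere radius (m)

range_res = c0 / (2 * W)   # MF range resolution: 3.75 cm, comparable to R
wavelength = c0 / f0       # carrier wavelength: 5 mm

def sdf_true(p):
    """Exact SDF of the sphere: the target for f_theta."""
    return np.linalg.norm(p) - R

print(range_res)                             # 0.0375
print(sdf_true(np.array([0.1, 0.0, 0.0])))   # 0.0 on the surface
```

Since the 3.75 cm range cell is not much smaller than the 10 cm sphere, the raw MF image blurs the surface; the Eikonal-regularised SDF is what recovers the sharper geometry.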

🎓 CommIT Contribution (2024)

GeRaF: Geometry Reconstruction from mmWave Radar

Y. Lu, G. Caire β€” IEEE International Conference on Communications (ICC)

Lu and Caire introduced GeRaF, the first method to reconstruct 3D geometry from mmWave radar using neural signed distance functions. The key insight is that matched-filter power images --- readily computable from standard radar measurements --- serve as a differentiable "pseudo-image" that bridges the gap between lensless radar and lens-based camera imaging (NeRF).

GeRaF jointly estimates geometry (via a neural SDF) and surface reflectivity from multi-view radar data, demonstrating that sub-wavelength surface recovery is possible when the Eikonal constraint is enforced. This work establishes the SDF representation as the foundation for neural RF scene reconstruction, extended in Chapters 26 (Gaussian splatting) and later chapters.

⚠️ Engineering Note

Computational Cost of GeRaF Training

GeRaF training involves: (1) evaluating the neural SDF at sampled 3D points, (2) computing the rendered MF power via the lensless rendering equation, (3) computing the Eikonal loss via automatic differentiation of the SDF gradient, and (4) backpropagating through all three.

For a scene discretised into $Q = 128^3 \approx 2 \times 10^6$ voxels, each training iteration requires $O(Q)$ forward passes through the MLP plus $O(Q)$ gradient computations for the Eikonal term. On a single GPU (NVIDIA A100), training typically converges in $\sim 5$ minutes for simple scenes and $\sim 30$ minutes for complex indoor environments.

Practical Constraints

  • GPU memory: the Eikonal gradient computation requires storing intermediate activations.
  • Batch size: limited by GPU memory; a typical batch is 4096 points.
  • MLP depth: deeper networks (8 layers) improve quality but slow training.

Common Mistake: Omitting Eikonal Regularization in GeRaF

Mistake:

Training the GeRaF model without the Eikonal regulariser ($\lambda_{\text{eik}} = 0$), expecting the data fidelity term alone to produce a valid SDF.

Correction:

Without Eikonal regularisation, the geometry network produces an arbitrary scalar field that fits the MF power but has no distance-function structure. The zero level set becomes noisy and disconnected, with self-intersecting surfaces. Even a small regularisation weight ($\lambda_{\text{eik}} = 0.01$) dramatically improves surface quality.

Why This Matters: Matched-Filter Power Images in ISAC Systems

In an ISAC system (Chapter 29), the communication waveform doubles as a sensing waveform. The matched-filter power image can be computed as a byproduct of channel estimation, with no additional pilot overhead. GeRaF's ability to reconstruct 3D geometry from these MF power images means that ISAC systems can provide environment maps --- enabling beam prediction, blockage forecasting, and handover optimisation --- using only the standard communication signalling.

See full treatment in ISAC Fundamentals

Key Takeaway

GeRaF bridges the gap between NeRF (camera-based) and radar imaging by using matched-filter power images as a differentiable pseudo-observation. Joint optimisation of a neural SDF (geometry) and a reflectivity network, regularised by the Eikonal constraint, recovers 3D surfaces from multi-view mmWave radar data.

Sphere Tracing on an SDF for RF Reconstruction

The sphere-tracing algorithm marches along a ray, stepping by the SDF value at each point. Circles show the guaranteed empty balls: each step is the largest safe advance without crossing the surface. This is the core rendering primitive in GeRaF --- converting an SDF into rendered power images that are compared against matched-filter observations during end-to-end training.
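The marching loop described above is a few lines. A minimal sketch against the exact sphere SDF (the ray origin and direction are chosen to match the spherical-scatterer example; step count and tolerance are illustrative):

```python
import numpy as np

def sdf(p, R=0.1):
    """Sphere SDF: distance to a sphere of radius R at the origin."""
    return np.linalg.norm(p) - R

def sphere_trace(origin, direction, tol=1e-6, max_steps=128):
    """March along the ray, stepping by the SDF value at each point."""
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < tol:
            return t   # hit: within tol of the surface
        t += d         # safe step: a ball of radius d is guaranteed empty
    return None        # miss: ray never approached the surface

# Ray from the array position (2 m away) straight at the sphere
t_hit = sphere_trace(np.array([2.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0]))
print(t_hit)  # ~1.9: the surface sits at x = R = 0.1
```

Each iteration advances by exactly the current SDF value, so the march can never overshoot the surface of a valid distance function; this is why the Eikonal constraint matters for rendering, not just for geometry.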