Physics-Informed and Equivariant Networks
Why Embed Physics in Neural Networks?
The networks we have seen so far (U-Nets, unfolded algorithms, diffusion models) learn from data alone. When data is scarce or the problem has known physical structure, we can do better by embedding the governing equations directly into the network architecture or loss function. This section covers three approaches:
- PINNs: Enforce PDEs (Helmholtz, Maxwell) as soft constraints in the loss.
- Equivariant networks: Build symmetries (rotation, translation) into the architecture.
- Fourier Neural Operators: Learn resolution-independent PDE solution operators in the Fourier domain.
Definition: Physics-Informed Neural Network (PINN)
A Physics-Informed Neural Network trains a neural network $u_\theta$ to approximate the solution of a PDE by incorporating the PDE residual into the loss:

$$\mathcal{L}(\theta) = \underbrace{\frac{1}{N_d}\sum_{i=1}^{N_d}\left|u_\theta(\mathbf{x}_i) - u_i\right|^2}_{\text{data misfit}} + \underbrace{\frac{\lambda}{N_c}\sum_{j=1}^{N_c}\left|\mathcal{N}[u_\theta](\mathbf{x}_j)\right|^2}_{\text{PDE residual}},$$

where $\mathcal{N}[u] = 0$ is the PDE (e.g., the Helmholtz equation), $\{(\mathbf{x}_i, u_i)\}_{i=1}^{N_d}$ are observed data points, and $\{\mathbf{x}_j\}_{j=1}^{N_c}$ are collocation points where the PDE is enforced.
The PDE residual is computed using automatic differentiation: spatial and temporal derivatives of $u_\theta$ are exact (not finite-difference approximations).
PINNs are mesh-free: no discretisation grid is needed. The collocation points can be sampled adaptively, concentrating in regions where the residual is large.
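To make the pattern concrete, here is a minimal PINN training step in PyTorch for a toy 1D Poisson-type problem. The network size, source term, and placeholder data are illustrative assumptions, not taken from the text.

```python
# Minimal PINN sketch (PyTorch): data loss + PDE residual loss, with the
# second derivative obtained by calling autograd twice (exact, not FD).
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def pde_residual(x):
    """Residual of u''(x) = f(x), derivatives via automatic differentiation."""
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    f = torch.sin(torch.pi * x)                 # illustrative source term
    return d2u - f

x_data, u_data = torch.rand(32, 1), torch.zeros(32, 1)  # placeholder observations
x_coll = torch.rand(256, 1)                             # collocation points
lam = 1.0                                               # PDE weight

loss = torch.mean((net(x_data) - u_data) ** 2) \
     + lam * torch.mean(pde_residual(x_coll) ** 2)
loss.backward()   # gradients w.r.t. network parameters; feed to Adam, etc.
```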
Definition: PINN for the Helmholtz Equation
For time-harmonic RF wave propagation, the governing equation is the Helmholtz equation:

$$\nabla^2 u(\mathbf{r}) + k_0^2\,\epsilon_r(\mathbf{r})\,u(\mathbf{r}) = s(\mathbf{r}),$$

where $u$ is the total field, $k_0 = \omega/c$ is the free-space wavenumber, $\epsilon_r(\mathbf{r})$ is the relative permittivity, and $s(\mathbf{r})$ is the source term.
A PINN for the Helmholtz equation parameterises $u_\theta(\mathbf{r})$ (complex-valued) and minimises:

$$\mathcal{L} = \frac{1}{N_d}\sum_{i=1}^{N_d}\left|u_\theta(\mathbf{r}_i) - u_i^{\text{meas}}\right|^2 + \frac{\lambda}{N_c}\sum_{j=1}^{N_c}\left|\nabla^2 u_\theta(\mathbf{r}_j) + k_0^2\,\epsilon_r(\mathbf{r}_j)\,u_\theta(\mathbf{r}_j) - s(\mathbf{r}_j)\right|^2.$$
For the inverse problem (unknown $\epsilon_r$), both $u_\theta$ and $\epsilon_r$ (or a neural parameterisation thereof) are jointly optimised.
Example: PINN for 2D Inverse Scattering
Set up a PINN to recover the permittivity distribution of an unknown object from scattered-field measurements at a set of receiver locations, using several incident plane waves at a fixed frequency in the GHz range.
Network architecture
Two networks:
- Field network: one network $u_{\theta_m}(\mathbf{r})$ per incident angle (index $m$), with Fourier feature encoding whose bandwidth is chosen to resolve the centimetre-scale wavelength.
- Permittivity network: $\epsilon_r^\phi(\mathbf{r})$, shared across all incident angles.
Loss function
The loss sums the data misfit at the receivers and the Helmholtz PDE residual for each incident angle. Because the shared permittivity network $\epsilon_r^\phi$ appears in all 8 per-angle PDE residuals, it is constrained by every view simultaneously, providing strong multi-view constraints.
Training
Adam optimiser. Curriculum: start with a low PDE weight $\lambda$ (data-driven), then gradually increase $\lambda$ to enforce the PDE constraint. The Fourier feature encoding is critical to resolve the wavelength-scale oscillations of $u$.
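A hedged sketch of one training step for this two-network setup, assuming simple real-valued MLPs (a full implementation would use two output channels for the complex field). All shapes, constants, and the Softplus positivity trick are illustrative placeholders.

```python
# One joint update of the 8 per-angle field networks and the shared
# permittivity network; the 2D Laplacian comes from two autograd passes.
import torch

field_nets = [torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                                  torch.nn.Linear(64, 1)) for _ in range(8)]
eps_net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                              torch.nn.Linear(64, 1), torch.nn.Softplus())  # positivity (illustrative)
k0, lam = 2 * torch.pi, 0.1                       # placeholder wavenumber / PDE weight
r_rx = torch.rand(16, 2)                          # receiver locations (placeholder)
u_meas = [torch.zeros(16, 1) for _ in range(8)]   # measured fields (placeholder)
r_coll = torch.rand(512, 2)                       # collocation points

def helmholtz_residual(field_net, r):
    """Residual of laplacian(u) + k0^2 * eps_r(r) * u at points r (N, 2)."""
    r = r.requires_grad_(True)
    u = field_net(r)                                                  # (N, 1)
    grad_u = torch.autograd.grad(u.sum(), r, create_graph=True)[0]    # (N, 2)
    lap = 0.0
    for d in range(2):                                                # d2u/dx2 + d2u/dy2
        lap = lap + torch.autograd.grad(grad_u[:, d].sum(), r,
                                        create_graph=True)[0][:, d:d+1]
    return lap + k0**2 * eps_net(r) * u

loss = 0.0
for m, field_net in enumerate(field_nets):        # one field network per angle
    loss = loss + torch.mean((field_net(r_rx) - u_meas[m]) ** 2)              # data misfit
    loss = loss + lam * torch.mean(helmholtz_residual(field_net, r_coll) ** 2)  # PDE residual
loss.backward()   # gradients flow into every field net and the shared eps_net
```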
Common Mistake: PINN Spectral Bias
Mistake:
Using a standard MLP (without positional encoding or Fourier features) for a PINN solving a high-frequency Helmholtz equation, and finding the network learns only the low-frequency components.
Correction:
Standard MLPs suffer from spectral bias: they learn low-frequency components much faster than high-frequency ones. For the Helmholtz equation at frequency $f$, the solution oscillates at the spatial scale of the wavelength $\lambda = c/f$; if $\lambda$ is small relative to the domain, the MLP cannot represent the oscillations.
Solutions:
- Fourier feature encoding: $\gamma(\mathbf{r}) = [\cos(2\pi\mathbf{B}\mathbf{r}), \sin(2\pi\mathbf{B}\mathbf{r})]$ with the entries of $\mathbf{B}$ sampled from $\mathcal{N}(0, \sigma^2)$, where $\sigma$ is matched to the highest spatial frequency to be resolved (see the sketch after this list).
- Multi-scale architecture: Separate networks for different frequency bands.
- Curriculum training: Start with low frequency and gradually increase.
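A minimal sketch of the Fourier feature encoding in PyTorch, in the style of random Fourier features; the input dimension, feature count, and $\sigma$ are placeholder choices.

```python
# Random Fourier feature encoding: a fixed random projection B ~ N(0, sigma^2),
# followed by cos/sin, fed to the MLP in place of raw coordinates.
import torch

class FourierFeatures(torch.nn.Module):
    def __init__(self, in_dim=2, n_features=128, sigma=10.0):
        super().__init__()
        # B is fixed at initialisation, not trained.
        self.register_buffer("B", sigma * torch.randn(in_dim, n_features))

    def forward(self, r):                       # r: (N, in_dim)
        proj = 2 * torch.pi * r @ self.B        # (N, n_features)
        return torch.cat([torch.cos(proj), torch.sin(proj)], dim=-1)

encode = FourierFeatures(sigma=10.0)
mlp_input = encode(torch.rand(4, 2))            # (4, 256): replaces raw coords
```

Larger $\sigma$ lets the downstream MLP fit faster oscillations but makes training noisier, so in practice $\sigma$ is tuned to the wavelength scale of the problem.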
Definition: Equivariance and Equivariant Networks
A function $f$ is equivariant to a group $G$ of transformations if applying $g \in G$ to the input produces a corresponding transformation of the output:

$$f(T_g x) = T'_g f(x) \quad \text{for all } g \in G.$$

Special case (invariance): $f(T_g x) = f(x)$ for all $g \in G$.
Equivariant neural networks build symmetries into the architecture:
| Symmetry Group | Transformation | Architecture |
|---|---|---|
| Translations | Spatial shifts | Standard CNN |
| $SO(2)$ | 2D rotation | Steerable CNN |
| $SO(3)$ | 3D rotation | Spherical CNN |
| $SE(3)$ | Euclidean (rotation + translation) | EGNN, PaiNN |
The key idea: replace the standard convolution kernel (defined on $\mathbb{R}^d$) with a kernel on the group $G$, ensuring that the output transforms predictably under group actions.
For RF imaging, the relevant symmetries include rotation equivariance for SAR (the scene reflectivity should not depend on imaging geometry orientation), and translation equivariance (standard CNNs).
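Translation equivariance of a standard convolution can be verified numerically. The sketch below assumes circular padding (so shifts wrap around) and checks that shifting the input and shifting the output commute.

```python
# Equivariance check: for a stride-1 convolution with circular padding,
# conv(shift(x)) equals shift(conv(x)) up to floating-point error.
import torch

conv = torch.nn.Conv2d(1, 8, kernel_size=3, padding=1, padding_mode="circular")
x = torch.randn(1, 1, 32, 32)
shift = lambda t: torch.roll(t, shifts=(5, -3), dims=(-2, -1))

print(torch.allclose(shift(conv(x)), conv(shift(x)), atol=1e-5))  # True
```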
Definition: Fourier Neural Operator (FNO)
The Fourier Neural Operator learns a mapping between function spaces by parameterising the integral kernel in the Fourier domain:

$$(\mathcal{K}v)(x) = \mathcal{F}^{-1}\!\left(R_\phi \cdot \mathcal{F}v\right)(x),$$

where $\mathcal{F}$ is the Fourier transform and $R_\phi$ is a learnable weight tensor applied to the first $k_{\max}$ Fourier modes. The full FNO layer is:

$$v_{l+1}(x) = \sigma\!\left(W v_l(x) + (\mathcal{K}v_l)(x)\right).$$

Cost: $O(n \log n)$ per layer (FFT). Resolution independence: Fourier modes are truncated at $k_{\max}$, independent of the discretisation $n$.
FNO is the neural-operator analog of a spectral method in numerical PDEs. For the Helmholtz equation, where convolution with the Green's function is naturally expressed in the Fourier domain, FNO is a particularly natural architecture.
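A minimal 1D spectral-convolution layer in this style, in PyTorch; the channel width and $k_{\max}$ are placeholders, and a production FNO for the 2D grids discussed here would use the 2D analogue (`rfft2`/`irfft2`).

```python
# FNO building blocks: a truncated spectral convolution plus the pointwise
# linear path, combined as v_{l+1} = sigma(W v_l + K v_l).
import torch

class SpectralConv1d(torch.nn.Module):
    def __init__(self, channels=32, k_max=16):
        super().__init__()
        self.k_max = k_max
        scale = 1.0 / channels
        self.weights = torch.nn.Parameter(
            scale * torch.randn(channels, channels, k_max, dtype=torch.cfloat))

    def forward(self, v):                        # v: (batch, channels, n)
        v_hat = torch.fft.rfft(v)                # FFT along the grid dimension
        out_hat = torch.zeros_like(v_hat)
        k = min(self.k_max, v_hat.shape[-1])     # keep only the first k_max modes
        out_hat[..., :k] = torch.einsum("bik,iok->bok",
                                        v_hat[..., :k], self.weights[..., :k])
        return torch.fft.irfft(out_hat, n=v.shape[-1])  # back to physical space

class FNOLayer(torch.nn.Module):
    """One FNO layer: pointwise linear path W plus spectral path K."""
    def __init__(self, channels=32, k_max=16):
        super().__init__()
        self.spectral = SpectralConv1d(channels, k_max)
        self.w = torch.nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, v):
        return torch.nn.functional.gelu(self.w(v) + self.spectral(v))
```

Because the learned weights live on the lowest $k_{\max}$ modes only, the same layer can be applied to inputs sampled on any grid size $n \ge 2 k_{\max}$, which is the source of the resolution independence noted above.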
Theorem: Universal Approximation for Neural Operators
Let $\mathcal{G}: \mathcal{A} \to \mathcal{U}$ be a continuous operator between Banach spaces of functions on a bounded domain $D \subset \mathbb{R}^d$. For any $\epsilon > 0$, there exists a neural operator $\mathcal{G}_\theta$ with finitely many parameters such that:

$$\sup_{a \in K}\left\|\mathcal{G}(a) - \mathcal{G}_\theta(a)\right\|_{\mathcal{U}} < \epsilon$$

for any compact set $K \subset \mathcal{A}$.

Specifically, for the FNO with $L$ layers and $k_{\max}$ Fourier modes, the approximation error scales as $O(k_{\max}^{-s})$ for operators with Sobolev regularity $s$.
Just as standard neural networks are universal approximators for functions (vectors $\to$ vectors), neural operators are universal approximators for operators (functions $\to$ functions). The FNO's Fourier truncation provides spectral convergence for smooth operators.
Finite-dimensional projection
Project both input and output functions onto the first $k_{\max}$ Fourier modes, keeping the coefficients $\hat{a}_k$ with $|k| \le k_{\max}$. This reduces the problem to a finite-dimensional mapping between mode-coefficient vectors.
Universal approximation in finite dimensions
By the standard universal approximation theorem for neural networks, the finite-dimensional mapping can be approximated to arbitrary accuracy by a sufficiently wide/deep network.
Truncation error
The projection error is bounded by the tail of the Fourier series: $\|a - P_{k_{\max}}a\| \le C\,k_{\max}^{-s}\,\|a\|_{H^s}$ for $a$ with Sobolev regularity $s$. Combined with the Lipschitz continuity of $\mathcal{G}$, the total error is $O(k_{\max}^{-s})$ plus the network approximation error.
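The truncation step can be illustrated numerically: for a smooth function, the energy beyond the first $k_{\max}$ Fourier modes decays rapidly. The function and grid size below are arbitrary choices.

```python
# Truncation-error demo: Fourier tail energy of a smooth periodic function
# drops steeply as k_max grows, matching the O(k_max^{-s}) bound.
import numpy as np

n = 512
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
a = np.exp(np.sin(x))                 # smooth (analytic) test function
a_hat = np.fft.rfft(a) / n            # normalised Fourier coefficients

for k_max in (4, 8, 16, 32):
    tail = np.sqrt(np.sum(np.abs(a_hat[k_max:]) ** 2))  # energy beyond k_max
    print(f"k_max={k_max:3d}  truncation error ~ {tail:.2e}")
```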
FNO vs. CNN Convergence on Helmholtz Inverse Problem
Compare the test error of FNO and a standard U-Net CNN on the Helmholtz inverse problem (permittivity from scattered field) as a function of training-set size and the number of retained Fourier modes $k_{\max}$. The FNO converges faster and generalises better, especially at higher frequencies where the solution has more spatial oscillations. The FNO's resolution independence means it can be trained on a coarse grid and evaluated on a finer grid without retraining.
Example: FNO as a Differentiable Forward Model for RF Imaging
Describe how to use an FNO to replace the full-wave solver in an iterative reconstruction algorithm for RF imaging.
Training data generation
Generate a large set of random permittivity distributions (random shapes with varying contrast $\epsilon_r$) and compute the scattered fields using MoM or FDTD. Input: $\epsilon_r$ on a grid. Output: the scattered field $u^{\text{sc}}$ (complex-valued) on the same grid.
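A sketch of the permittivity sampler under simple assumptions (circular scatterers with uniform contrast); the full-wave solver call is left as a placeholder, since the text does not specify a particular MoM/FDTD implementation.

```python
# Training-pair generation sketch: random circular scatterers on a unit grid.
import numpy as np

def random_permittivity(n=64, n_shapes=3, eps_max=4.0, rng=np.random):
    yy, xx = np.mgrid[0:n, 0:n] / n
    eps = np.ones((n, n))                              # background eps_r = 1
    for _ in range(n_shapes):
        cx, cy, rad = rng.rand(), rng.rand(), 0.05 + 0.1 * rng.rand()
        mask = (xx - cx) ** 2 + (yy - cy) ** 2 < rad ** 2
        eps[mask] = 1.0 + (eps_max - 1.0) * rng.rand()  # random contrast
    return eps

eps = random_permittivity()
# u_sc = full_wave_solver(eps)   # placeholder: MoM or FDTD solve, not shown
```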
FNO architecture
- Lifting layer: 1-channel input ($\epsilon_r$) lifted to a $w$-channel latent representation.
- 4 Fourier layers with $k_{\max}$ modes per dimension.
- Projection layer: $w$ channels $\to$ 2 channels (real/imaginary parts of $u^{\text{sc}}$).
- Total: on the order of a few million parameters (assembled in the sketch below).
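Putting the pieces together, reusing the `FNOLayer` class from the spectral-convolution sketch above; this 1D version stands in for the 2D grid described in the text, and the width and mode count are placeholders for the elided values.

```python
# Full FNO surrogate: lift -> 4 Fourier layers -> project to Re/Im of u_sc.
import torch

class HelmholtzFNO(torch.nn.Module):
    def __init__(self, width=32, k_max=16, n_layers=4):
        super().__init__()
        self.lift = torch.nn.Conv1d(1, width, kernel_size=1)     # eps_r -> width channels
        self.layers = torch.nn.ModuleList(
            FNOLayer(width, k_max) for _ in range(n_layers))
        self.project = torch.nn.Conv1d(width, 2, kernel_size=1)  # -> (Re, Im) of u_sc

    def forward(self, eps):                # eps: (batch, 1, n)
        v = self.lift(eps)
        for layer in self.layers:
            v = layer(v)
        return self.project(v)
```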
Iterative reconstruction with FNO forward model
Replace the physics-based forward operator in an iterative algorithm (e.g., gradient descent, ADMM) with the trained FNO $\mathcal{G}_\theta$:

$$\hat{\epsilon}_r = \arg\min_{\epsilon_r}\ \left\|\mathcal{G}_\theta(\epsilon_r) - y\right\|^2 + \tau\, R(\epsilon_r).$$

The gradient is computed by backpropagating through the FNO (milliseconds per evaluation, vs. seconds for the full-wave adjoint).
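A sketch of the resulting gradient-descent loop, reusing the hypothetical `HelmholtzFNO` from above; the measurement tensor, step size, and TV weight are illustrative.

```python
# Iterative reconstruction through a frozen FNO surrogate: only the
# permittivity estimate receives gradients.
import torch

fno = HelmholtzFNO()                            # trained weights would be loaded here
fno.requires_grad_(False)                       # freeze the surrogate
y_meas = torch.zeros(1, 2, 64)                  # measured field (placeholder)

eps = torch.ones(1, 1, 64, requires_grad=True)  # initial guess: free space
opt = torch.optim.Adam([eps], lr=1e-2)

for _ in range(200):
    opt.zero_grad()
    misfit = torch.mean((fno(eps) - y_meas) ** 2)
    tv = torch.mean(torch.abs(eps[..., 1:] - eps[..., :-1]))  # simple TV regulariser
    (misfit + 1e-3 * tv).backward()             # gradient via backprop through the FNO
    opt.step()
```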
Why This Matters: Physics-Informed Methods for RF Imaging
Physics-informed approaches address key challenges in RF imaging:
- PINNs for inverse scattering: Recover $\epsilon_r$ from sparse scattering data by enforcing the Helmholtz equation as a soft constraint — particularly valuable when the measurement geometry is irregular and standard discretisation is awkward.
- Equivariant networks for SAR: Steerable CNNs achieve rotation-equivariant SAR target recognition without data augmentation — the network generalises to unseen azimuth angles by construction.
- FNO as fast forward model: Replace the forward-model matrix-vector product with an FNO evaluation, enabling real-time iterative reconstruction.
- FNO for resolution transfer: Train on coarse grids (fast), evaluate on fine grids (detailed) — exploiting the FNO's resolution independence.
Quick Check
What is the key advantage of an FNO over a PINN for solving multiple instances of the Helmholtz equation with different permittivity distributions?
- FNO is more accurate for a single instance
- FNO learns the solution operator, amortising the cost across instances (correct)
- FNO does not require training data
- FNO has fewer parameters
A PINN must be retrained for each new permittivity distribution (each new PDE instance). The FNO learns the mapping $\epsilon_r \mapsto u^{\text{sc}}$ as an operator, so once trained, it solves new instances in a single forward pass (milliseconds).
Common Mistake: FNO Out-of-Distribution Generalisation
Mistake:
Training an FNO on simple permittivity distributions (circles, rectangles) and expecting it to generalise to complex real-world scenes (buildings, furniture) without testing.
Correction:
Neural operators, like all learned models, generalise only within the distribution of the training data. An FNO trained on simple geometries may fail dramatically on complex scenes with sharp corners, thin structures, or high-contrast interfaces.
Remedies: (i) Train on a diverse dataset that covers the expected scene complexity. (ii) Use physics-based fine-tuning: initialise with the FNO prediction, then refine with a few iterations of the physics-based solver. (iii) Hybrid approaches: FNO for the smooth part, physics solver for the high-contrast details.
Historical Note: From Galerkin to PINNs
1998–2021: The idea of using neural networks to solve PDEs dates to Lagaris et al. (1998), who used small feedforward networks with boundary condition constraints. The approach remained niche until Raissi, Perdikaris, and Karniadakis (2019) demonstrated that modern deep networks with automatic differentiation could solve complex forward and inverse PDE problems — coining the term "Physics-Informed Neural Networks." The Fourier Neural Operator (Li et al., 2021) shifted the paradigm from solving individual PDE instances to learning the solution operator, enabling real-time PDE solving with spectral accuracy.
PINN Training for Helmholtz Inverse Scattering
Complexity: $O(N_c\,P)$ per iteration, where $N_c$ is the number of collocation points and $P$ is the network parameter count. The second-order autodiff for the Laplacian roughly doubles the backpropagation cost. Curriculum scheduling of $\lambda$ (low initially, increasing over epochs) helps avoid getting trapped in the data-only minimum.
PINN (Physics-Informed Neural Network)
A neural network trained with a loss function that includes the residual of a governing PDE, evaluated at collocation points via automatic differentiation. Enables mesh-free PDE solving and inverse problems.
FNO (Fourier Neural Operator)
A neural operator architecture that parameterises integral kernels in the Fourier domain, achieving $O(n \log n)$ cost and resolution-independent PDE solving.
Key Takeaway
PINNs embed PDE constraints (Helmholtz, Maxwell) directly into the neural network loss, enabling mesh-free inverse scattering but suffering from spectral bias at high frequencies. Equivariant networks build physical symmetries into the architecture, improving data efficiency for SAR and 3D imaging. The Fourier Neural Operator learns the PDE solution operator in the Fourier domain, enabling real-time, resolution-independent forward modelling that can replace expensive full-wave solvers in iterative reconstruction algorithms.