License: CC-BY-4.0
# Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks

## Abstract

Natural-gradient methods markedly accelerate the training of Physics-Informed Neural Networks (PINNs), yet their Gauss–Newton update must normally be solved in the *parameter space*, incurring a prohibitive $\mathcal{O}(n^{3})$ time complexity, where $n$ is the number of network weights. We show that exactly the same step can instead be formulated in a *generally smaller residual space* of size $m=\sum_{\gamma}N_{\gamma}d_{\gamma}$, where each residual class $\gamma$ (e.g. PDE interior, boundary, initial data) contributes $N_{\gamma}$ collocation points of output dimension $d_{\gamma}$.
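The equivalence between the parameter-space and residual-space steps can be checked numerically. The sketch below is not taken from the repository: it uses plain NumPy, a random matrix `J` as a stand-in for the residual Jacobian, a random vector `r` for the stacked residuals, and a damping term `lam` that is an assumption of this example. It relies only on the standard identity $(J^{\top}J+\lambda I)^{-1}J^{\top}r = J^{\top}(JJ^{\top}+\lambda I)^{-1}r$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n parameters, m residuals, with m < n.
n, m = 200, 30
J = rng.standard_normal((m, n))  # stand-in for the residual Jacobian (m x n)
r = rng.standard_normal(m)       # stand-in for the stacked residual vector
lam = 1e-3                       # damping; an assumption of this sketch

# Parameter-space (primal) step: solve an n x n linear system.
step_primal = np.linalg.solve(J.T @ J + lam * np.eye(n), J.T @ r)

# Residual-space (dual) step: solve an m x m system, then map back with J^T.
step_dual = J.T @ np.linalg.solve(J @ J.T + lam * np.eye(m), r)

# Both routes give the same update direction up to round-off.
max_diff = np.max(np.abs(step_primal - step_dual))
print(f"max |primal - dual| = {max_diff:.2e}")
```

The dense dual solve costs $\mathcal{O}(m^{3})$ rather than $\mathcal{O}(n^{3})$, which is where the saving comes from when $m \ll n$.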

Building on this insight, we introduce *Dual Natural Gradient Descent* (D-NGD). D-NGD computes the Gauss–Newton step in residual space, augments it with a **geodesic-acceleration correction** at negligible extra cost, and provides both a dense direct solver for modest $m$ and a Nyström-preconditioned conjugate-gradient solver for larger $m$.
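For larger $m$, the residual-space system can be solved iteratively without forming $JJ^{\top}$. The following is a minimal sketch only: it uses plain, unpreconditioned conjugate gradient in NumPy, and omits both the Nyström preconditioner and the geodesic-acceleration correction. The names `conjugate_gradient`, `kernel_matvec`, `J`, `r`, and `lam` are hypothetical and do not come from the repository.

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=500):
    """Solve A x = b for symmetric positive-definite A given only v -> A v."""
    x = np.zeros_like(b)
    res = b.copy()            # residual b - A x for the initial guess x = 0
    p = res.copy()
    rs = res @ res
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        res = res - alpha * Ap
        rs_new = res @ res
        p = res + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(1)
n, m = 200, 30                    # hypothetical sizes, m < n
J = rng.standard_normal((m, n))   # stand-in for the residual Jacobian
r = rng.standard_normal(m)        # stand-in for the stacked residuals
lam = 1e-3                        # damping; an assumption of this sketch

def kernel_matvec(v):
    # Applies (J J^T + lam I) to v using two matrix-vector products,
    # so the m x m matrix is never stored.
    return J @ (J.T @ v) + lam * v

# Iterative residual-space solve, mapped back to parameter space.
step_cg = J.T @ conjugate_gradient(kernel_matvec, r)

# Dense reference solve for comparison.
step_dense = J.T @ np.linalg.solve(J @ J.T + lam * np.eye(m), r)
```

In a JAX setting the two products in `kernel_matvec` would typically be replaced by Jacobian-vector and vector-Jacobian products, so that `J` itself is never materialised.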

Experimentally, D-NGD scales second-order PINN optimisation to networks with up to 12.8 million parameters, reaches final $L^{2}$ errors one to three orders of magnitude lower than first-order (Adam, SGD) and quasi-Newton methods, and, crucially, enables full natural-gradient training of PINNs at this scale on a single GPU.


We provide a demo for the Kovasznay flow:

- Dense solve with geodesic acceleration: run `main_DNGDgeo.py`
- L-BFGS, Adam, and SGD: run `main_solvers.py`



## Requirements

- Python 3.10.10 or later
- JAX 0.4.8 or later
- JAXopt 0.6 or later
- Optax 0.1.4 or later
- jaxtyping 0.2.25 or later
- Lineax
- jmp
- equinox


