Abstract: Natural gradient methods for PINNs have achieved state-of-the-art performance with errors several orders of magnitude smaller than those achieved by standard optimizers such as ADAM or L-BFGS. However, computing natural gradients for PINNs is prohibitively expensive in both computation and memory for all but small neural network architectures. We develop a randomized algorithm for natural gradient descent for PINNs that uses sketching to approximate the natural gradient descent direction. We prove that the change-of-coordinates Gram matrix used in a natural gradient descent update has rapidly decaying eigenvalues for a one-layer, one-dimensional neural network and empirically demonstrate that this structure holds for four different example problems. Under this structure, our sketching algorithm is guaranteed to provide a near-optimal low-rank approximation of the Gramian. Our algorithm dramatically speeds up computation and reduces memory overhead. Additionally, in our experiments, the sketched natural gradient outperforms the original natural gradient in terms of accuracy, often achieving an error that is an order of magnitude smaller. Training time for a network with around 5,000 parameters is reduced from several hours to under two minutes. Training can be practically scaled to large network sizes; we optimize a PINN for a network with over a million parameters within a few minutes, a task for which the full Gram matrix does not fit in memory.
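To illustrate the kind of computation the abstract describes, the following is a minimal, hypothetical sketch (not the authors' exact algorithm): it forms a randomized Nyström low-rank approximation of a symmetric positive semidefinite Gram matrix with rapidly decaying eigenvalues and uses it, via a Woodbury-style identity, to compute a damped natural-gradient direction. The matrix sizes, sketch rank, damping parameter, and the synthetic spectrum are all assumptions made for the example.

```python
# Illustrative sketch only: randomized Nystrom approximation of a PSD Gram
# matrix G with fast-decaying spectrum, then a damped solve (G + lam*I)^{-1} g.
# All names and sizes (n, sketch_rank, lam, the synthetic G) are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n, sketch_rank, lam = 500, 50, 1e-6   # parameter count, sketch size, damping

# Synthetic PSD Gram matrix with rapidly decaying eigenvalues (assumption).
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = 10.0 ** (-np.arange(n) / 5.0)
G = (U * eigs) @ U.T
g = rng.standard_normal(n)            # stand-in for the loss gradient

# Nystrom sketch: only needs sketch_rank matrix-vector products with G.
Omega = rng.standard_normal((n, sketch_rank))
Y = G @ Omega                         # sketch of the range of G
nu = 1e-10 * np.linalg.norm(Y)        # small shift for numerical stability
Y_shifted = Y + nu * Omega
C = np.linalg.cholesky(Omega.T @ Y_shifted)
B = np.linalg.solve(C, Y_shifted.T).T # G is approximated by B @ B.T

# Low-rank eigendecomposition of the approximation.
Q, S, _ = np.linalg.svd(B, full_matrices=False)
lam_hat = np.maximum(S**2 - nu, 0.0)  # approximate leading eigenvalues of G

# Woodbury-style solve of (Q diag(lam_hat) Q^T + lam*I) d = g.
Qtg = Q.T @ g
direction = Q @ (Qtg / (lam_hat + lam)) + (g - Q @ Qtg) / lam

# Compare against the exact damped solve (feasible here because n is small).
exact = np.linalg.solve(G + lam * np.eye(n), g)
print("relative error:", np.linalg.norm(direction - exact) / np.linalg.norm(exact))
```

Because the sketch only touches G through matrix-vector products, a variant of this idea can avoid forming the full Gram matrix at all, which is the regime the abstract refers to for million-parameter networks; the specifics of how the paper achieves that are not reproduced here.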
Lay Summary: Mathematical modeling and computing are key tools in scientific research. Among the many uses of simulations are identifying potential underlying mechanisms, fitting observations to theory, gaining insight into inaccessible processes, and supplementing or replacing impractical physical experiments or measurements. Physics-based models consist of systems of Partial Differential Equations (PDEs) that often span multiple physical or temporal scales and involve multiple interacting physical phenomena. These systems are solved numerically to generate a simulation.
Although numerous tools exist for analyzing and numerically solving a variety of PDEs, many problems of scientific interest remain computationally intractable or require significant simplification. Physics Informed Neural Networks (PINNs) have been used to model multi-scale and multi-physics phenomena, numerically solve high-dimensional systems of PDEs, and combine incomplete mechanistic understanding with data. Although physics-informed learning has shown enormous potential, these networks can be very difficult to train and do not achieve the high levels of accuracy needed for some kinds of simulations. Recently, methods called energy natural gradients have emerged that train PINNs to higher accuracy, but they are computationally costly for all but small networks, limiting their applicability.
In our work, we develop a method that scales natural gradient training, making it more computationally efficient and usable for large network sizes. Our method also improves the accuracy of these natural gradients for PINNs.
Primary Area: General Machine Learning->Scalable Algorithms
Keywords: Sci-ML, Physics Informed Neural Networks, Natural Gradients, Sketching
Submission Number: 13711