Investigating the Role of Overparameterization While Solving the Pendulum with DeepONets

Sep 27, 2021 (edited Oct 20, 2021) · DLDE Workshop -- NeurIPS 2021 Poster
  • Keywords: DeepONets, Double descent, Triple descent, Pendulum equation, Differential Equations
  • Abstract: Machine learning methods have made substantial advances in various aspects of physics. In particular, multiple deep-learning methods have emerged as efficient ways of numerically solving differential equations that arise commonly in physics. DeepONets [1] are one of the most prominent ideas in this theme, entailing an optimization over a space of inner products of neural nets. In this work we study the training dynamics of DeepONets for solving the pendulum equation, bringing to light some of their intriguing properties. We demonstrate that, contrary to usual expectations, the test error here has its first local minimum at the interpolation threshold, i.e., when model size $\approx$ training data size. Secondly, as opposed to the average end-point error, the best test error over iterations has a better dependence on model size, in that it shows only a very mild double descent. Lastly, we show evidence that triple descent [2] is unlikely to occur for DeepONets. [1] Lu Lu, Pengzhan Jin, and George Em Karniadakis. DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. 2020. arXiv:1910.03193 [cs.LG]. [2] Ben Adlam and Jeffrey Pennington. The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization. 2020. arXiv:2008.06786 [stat.ML].
  • Publication Status: This work is unpublished.
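The "inner product of neural nets" structure mentioned in the abstract can be made concrete with a minimal sketch: a branch net encodes the input function sampled at sensor points, a trunk net encodes a query location, and the operator output is their dot product. The sketch below uses untrained random-weight MLPs purely to illustrate the architecture; the layer widths, sensor count `m`, and latent width `p` are assumed values, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight MLP with tanh hidden activations (illustrative, untrained)."""
    params = [(rng.standard_normal((a, b)) / np.sqrt(a), np.zeros(b))
              for a, b in zip(sizes[:-1], sizes[1:])]
    def forward(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:
                x = np.tanh(x)
        return x
    return forward

m, p = 50, 20             # number of sensor points, latent width (assumed)
branch = mlp([m, 64, p])  # encodes the input function u sampled at m sensors
trunk  = mlp([1, 64, p])  # encodes the query location y

def deeponet(u_sensors, y):
    # DeepONet output G(u)(y): inner product of branch and trunk features
    return branch(u_sensors[None, :]) @ trunk(np.atleast_2d(y)).T

xs = np.linspace(0.0, 1.0, m)
u = np.sin(2 * np.pi * xs)          # an example input function on the sensors
out = deeponet(u, np.array([0.3]))  # scalar prediction at query point y = 0.3
print(out.shape)  # (1, 1)
```

Training would then minimize the squared error of `deeponet(u, y)` against the true solution operator over sampled `(u, y)` pairs; model size in the abstract's sense is controlled by widths such as `p`.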