Investigating the Role of Overparameterization While Solving the Pendulum with DeepONets

Sep 27, 2021 (edited Oct 20, 2021) · DLDE Workshop -- NeurIPS 2021 Poster
  • Keywords: DeepONets, Double descent, Triple descent, Pendulum equation, Differential Equations
  • Abstract: Machine learning methods have made substantial advances in various aspects of physics. In particular, multiple deep-learning methods have emerged as efficient ways of numerically solving differential equations that arise commonly in physics. DeepONets [1] are one of the most prominent ideas in this theme, entailing an optimization over a space of inner products of neural nets. In this work we study the training dynamics of DeepONets for solving the pendulum equation, bringing to light some of their intriguing properties. We demonstrate that, contrary to usual expectations, the test error here has its first local minimum at the interpolation threshold, i.e., when model size $\approx$ training data size. Secondly, as opposed to the average end-point error, the best test error over iterations has a better dependence on model size, in that it shows only a very mild double descent. Lastly, we show evidence that triple descent [2] is unlikely to occur for DeepONets. [1] Lu Lu, Pengzhan Jin, and George Em Karniadakis. DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. 2020. arXiv:1910.03193 [cs.LG]. [2] Ben Adlam and Jeffrey Pennington. The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization. 2020. arXiv:2008.06786 [stat.ML].
  • Publication Status: This work is unpublished.
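The "inner product of neural nets" structure mentioned in the abstract can be made concrete with a minimal sketch: a branch net encodes the input function sampled at sensor points, a trunk net encodes a query location, and the operator output is their dot product. The sketch below uses untrained random-weight MLPs purely to illustrate the architecture; the layer widths, sensor count `m`, and latent width `p` are assumed values, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight MLP with tanh hidden activations (illustrative, untrained)."""
    params = [(rng.standard_normal((a, b)) / np.sqrt(a), np.zeros(b))
              for a, b in zip(sizes[:-1], sizes[1:])]
    def forward(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:
                x = np.tanh(x)
        return x
    return forward

m, p = 50, 20             # number of sensor points, latent width (assumed)
branch = mlp([m, 64, p])  # encodes the input function u sampled at m sensors
trunk  = mlp([1, 64, p])  # encodes the query location y

def deeponet(u_sensors, y):
    # DeepONet output G(u)(y): inner product of branch and trunk features
    return branch(u_sensors[None, :]) @ trunk(np.atleast_2d(y)).T

xs = np.linspace(0.0, 1.0, m)
u = np.sin(2 * np.pi * xs)          # an example input function on the sensors
out = deeponet(u, np.array([0.3]))  # scalar prediction at query point y = 0.3
print(out.shape)  # (1, 1)
```

Training would then minimize the squared error of `deeponet(u, y)` against the true solution operator over sampled `(u, y)` pairs; model size in the abstract's sense is controlled by widths such as `p`.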