Towards Size-Independent Generalization Bounds for Deep Operator Nets

Pulkit Gopalani; Sayar Karmakar; Dibyakanti Kumar; Anirbit Mukherjee

Published: 02 Dec 2024, Last Modified: 02 Dec 2024. Accepted by TMLR. License: CC BY 4.0

Abstract: In recent times, machine learning methods have made significant advances in becoming a useful tool for analyzing physical systems. A particularly active area in this theme has been "physics-informed machine learning", which focuses on using neural nets to numerically solve differential equations. In this work, we aim to advance the theory of measuring out-of-sample error while training DeepONets, which are among the most versatile ways to solve PDE systems in one shot. First, for a class of DeepONets, we prove a bound on their Rademacher complexity that does not explicitly scale with the width of the nets involved. Second, we use this to show how the Huber loss can be chosen so that, for these DeepONet classes, generalization error bounds can be obtained that have no explicit dependence on the size of the nets. The effective capacity measure for DeepONets that we thus derive is also shown to correlate with the behavior of generalization error in experiments.
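For readers unfamiliar with the two objects the abstract refers to, the following is a minimal illustrative sketch and not the authors' implementation (see the linked repository for that). It assumes the commonly used DeepONet form, in which the prediction is the inner product of a branch-net output (fed samples of the input function at fixed sensor points) and a trunk-net output (fed the query location), together with the standard Huber loss with threshold `delta`. The helper names `mlp`, `init_mlp`, `deeponet`, and `huber`, the tanh activation, and all layer sizes are hypothetical choices made for illustration.

```python
import math
import random


def huber(residual, delta=1.0):
    """Standard Huber loss: quadratic for |r| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def init_mlp(sizes, rng):
    """Random dense layers (W, b) for the given layer sizes (illustrative init)."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        W = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))
    return layers


def mlp(x, layers):
    """Feed-forward net with tanh on every layer except the last."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def deeponet(u_sensors, y, branch, trunk):
    """DeepONet prediction: <branch(u at sensors), trunk(y)>."""
    b = mlp(u_sensors, branch)
    t = mlp(y, trunk)
    return sum(bk * tk for bk, tk in zip(b, t))


# Usage: 4 sensor values, a 1-D query point, and a shared output dimension of 3.
rng = random.Random(0)
branch = init_mlp([4, 8, 3], rng)
trunk = init_mlp([1, 8, 3], rng)
prediction = deeponet([0.0, 0.1, 0.2, 0.3], [0.5], branch, trunk)
loss = huber(prediction - 1.0, delta=0.5)  # 1.0 is a made-up target value
```

In this sketch the width of the hidden layers (8 above) can be changed without altering the structure of the predictor, which is the sense in which one can ask whether a capacity bound depends on net size.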

Submission Length: Long submission (more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=Rkztn6frXZ

Changes Since Last Submission: All the code is now consolidated in a single GitHub repository, and the image quality of the figure in Appendix M has been improved.

Code: https://github.com/Dibyakanti/Towards-Size-Independent-Generalization-Bounds-for-Deep-Operator-Nets

Assigned Action Editor: ~Alexander_A_Alemi1

Submission Number: 2731
