Optimization for Neural Operator Learning: Wider Networks are Better

23 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Restricted Strong Convexity, Operator Learning, Fourier Neural Operators, Deep Operator Networks, Gradient Descent
TL;DR: We show optimization guarantees for overparameterized neural operators -- Fourier neural operators (FNOs) and deep operator networks (DONs).
Abstract: Neural operators, such as Deep Operator Networks (DONs) (Lu et al., 2021) and Fourier Neural Operators (FNOs) (Li et al., 2021a), which directly learn mappings between function spaces, have received considerable recent attention. Despite the universal approximation guarantees for DONs (Lu et al., 2021; Chen & Chen, 1995) and FNOs (Kovachki et al., 2021), there is currently no optimization convergence guarantee for learning such networks using gradient descent (GD). In this paper, we present a unified framework for optimization based on GD and apply the framework to DONs and FNOs, establishing convergence guarantees for both. In particular, we show that as long as two conditions—restricted strong convexity (RSC) and smoothness—are satisfied by the loss, GD is guaranteed to decrease the loss geometrically. Subsequently, we show that the two conditions are indeed satisfied by the DON and FNO losses, albeit for rather different reasons that arise from differences in the structure of the respective models. One takeaway that emerges is that wider networks lead to better optimization convergence for both DONs and FNOs. We present empirical results on several canonical operator learning problems showing that wider DONs and FNOs lead to lower training losses, thereby supporting the theoretical results.
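For intuition, the standard way a geometric-decrease guarantee follows from these two conditions can be sketched as below; the notation (loss L, RSC constant \alpha, smoothness constant \beta, infimum L^*, step size \eta = 1/\beta) is illustrative and not taken from the paper.

% A minimal sketch, assuming the loss L is \beta-smooth and satisfies
% restricted strong convexity with constant \alpha > 0 along the GD iterates.
\begin{align*}
% (1) \beta-smoothness (descent lemma) with step size \eta = 1/\beta:
L(\theta_{t+1}) &\le L(\theta_t) - \tfrac{1}{2\beta}\,\|\nabla L(\theta_t)\|^2 \\
% (2) RSC, with its lower bound minimized over the comparison point,
%     yields a Polyak-Lojasiewicz-type inequality:
\|\nabla L(\theta_t)\|^2 &\ge 2\alpha\,\bigl(L(\theta_t) - L^*\bigr) \\
% (3) substituting (2) into (1) contracts the suboptimality geometrically:
L(\theta_{t+1}) - L^* &\le \Bigl(1 - \tfrac{\alpha}{\beta}\Bigr)\bigl(L(\theta_t) - L^*\bigr)
\end{align*}

Under this reading, greater width helps to the extent that it improves the effective ratio \alpha/\beta along the optimization path, shrinking the contraction factor 1 - \alpha/\beta; attributing the width effect to the RSC constant specifically is an assumption of this sketch, not a claim checked against the paper's proofs.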
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7991