Keywords: Restricted Strong Convexity, Operator Learning, Variational Loss, Scientific Machine Learning
Abstract: In this paper, we analyze the optimization of operator networks for solving elliptic PDEs with variational loss functions. While approximation and generalization errors in operator networks have been extensively studied, optimization error remains largely unexplored.
We apply Restricted Strong Convexity (RSC) theory to rigorously examine the optimization dynamics of operator networks trained with variational loss, providing theoretical guarantees for convergence and training stability.
We further investigate the role of the condition number of the operator $A$ in optimization and demonstrate that preconditioning strategies significantly improve convergence rates, establishing a solid theoretical basis for the empirical benefits of preconditioning. We also derive a lower bound on a key quantity, $q_t$, that ensures convergence.
To prevent $q_t$ from vanishing, we propose an algorithm that adaptively incorporates additional weights into the variational loss function, leveraging values already computed during training and thereby incurring no extra computational cost.
Finally, we validate our theoretical assumptions through numerical experiments, demonstrating their practical applicability and confirming the effectiveness of preconditioning, with significant improvements in training performance and convergence rates.
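
To make the preconditioning and adaptive-weighting ingredients concrete, the following is a minimal sketch in PyTorch. The 1D Poisson discretization, the Jacobi (diagonal) preconditioner, the toy network, and the residual-based weighting rule are all illustrative assumptions; the abstract does not specify the paper's architecture, preconditioner, or the actual rule used to keep $q_t$ bounded away from zero.

```python
import torch

# Discretize -u'' = f on (0, 1) with zero boundary values: A is the standard
# tridiagonal stiffness matrix, whose condition number grows under mesh refinement.
n = 64
h = 1.0 / (n + 1)
A = (2.0 * torch.eye(n)
     - torch.diag(torch.ones(n - 1), 1)
     - torch.diag(torch.ones(n - 1), -1)) / h**2

# Symmetric (Jacobi) preconditioning: train against A_pre = P^{1/2} A P^{1/2}
# with f_pre = P^{1/2} f, then recover u = P^{1/2} v. Any SPD preconditioner
# could be substituted here.
Psqrt = torch.diag(torch.diagonal(A).rsqrt())
A_pre = Psqrt @ A @ Psqrt

# A toy operator network mapping a forcing term f to a candidate solution v.
net = torch.nn.Sequential(
    torch.nn.Linear(n, 128), torch.nn.Tanh(), torch.nn.Linear(128, n)
)
opt = torch.optim.SGD(net.parameters(), lr=1e-2)

f = torch.randn(32, n)   # a batch of hypothetical forcing terms
f_pre = f @ Psqrt        # P^{1/2} f (Psqrt is diagonal, hence symmetric)

for step in range(500):
    v = net(f)
    # Variational (Ritz) loss per sample: 1/2 v^T A_pre v - f_pre^T v.
    energy = 0.5 * torch.einsum("bi,ij,bj->b", v, A_pre, v) - (f_pre * v).sum(dim=1)

    # Adaptive per-sample weights built from residuals, quantities of the kind
    # already available during training. This is a hypothetical stand-in for
    # the paper's rule, meant only to illustrate reweighting the variational
    # loss so a quantity like q_t stays bounded away from zero.
    residual = (v @ A_pre - f_pre).detach()
    w = residual.norm(dim=1)
    w = w / (w.mean() + 1e-12)

    loss = (w * energy).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

u = net(f) @ Psqrt  # map v back to u = P^{1/2} v for the original system in A
```

In this setup, training on the better-conditioned $A_{\mathrm{pre}}$ rather than $A$ is what a preconditioning strategy amounts to at the level of the loss, and the weights $w$ reuse residuals rather than requiring any additional forward or backward passes.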
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 10461