Training Dynamics of Convolutional Neural Networks for Learning the Derivative Operator

Published: 10 Oct 2024, Last Modified: 09 Nov 2024. Venue: SciForDL Poster. License: CC BY 4.0
TL;DR: We investigate the training dynamics of convolutional neural networks for operator learning from a spectral perspective.
Abstract:

Despite significant interest in developing new methods for scientific machine learning, the behavior and effectiveness of existing methods are not thoroughly understood. For instance, while deep multi-layer perceptrons are known to exhibit a bias toward low-frequency data, it is less clear whether this phenomenon holds for other methods and how it manifests. We investigate the training dynamics of convolutional neural networks in the context of operator learning and find that the input signal's high-frequency components are generally learned before its low-frequency components, followed by the amplitudes of the frequency distribution. Our results also show that networks trained on a range of frequencies tend to perform better on high-frequency data. From an architectural standpoint, increasing the model's kernel size can narrow this accuracy gap and improve precision across frequencies, but this trend does not hold for deeper models, suggesting that kernel size may have a stronger effect than depth on training stability and model accuracy.
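The setting in the title can be made concrete with a minimal, hypothetical sketch (not the paper's code): fit a single 1-D convolutional kernel by gradient descent so that it approximates the derivative operator d/dx on sinusoidal inputs, then inspect the relative error at each input frequency. The grid size, kernel size, learning rate, and training frequencies below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fit_derivative_kernel(freqs=(1, 2, 4, 8), n=256, ksize=7,
                          lr=1e-3, steps=3000, seed=0):
    """Fit one circular 1-D convolution kernel to map u(x) -> u'(x)
    on sinusoids sin(kx); all hyperparameters are assumed for illustration."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    X = np.stack([np.sin(k * x) for k in freqs])      # inputs  u(x) = sin(kx)
    Y = np.stack([k * np.cos(k * x) for k in freqs])  # targets u'(x) = k cos(kx)
    w = rng.normal(scale=0.1, size=ksize)
    half = ksize // 2

    def conv(u):
        # circular convolution: a stencil of width ksize centered at each point
        return sum(wi * np.roll(u, half - i) for i, wi in enumerate(w))

    losses = []
    for _ in range(steps):
        R = np.stack([conv(u) for u in X]) - Y        # residuals per sample
        losses.append(float(np.mean(R ** 2)))
        # exact gradient of the mean-squared loss w.r.t. each kernel weight
        grad = np.array([
            2.0 * np.mean(R * np.stack([np.roll(u, half - i) for u in X]))
            for i in range(ksize)
        ])
        w -= lr * grad
    # relative L2 error of the learned stencil at each training frequency
    per_freq = {k: float(np.linalg.norm(conv(u) - y) / np.linalg.norm(y))
                for k, u, y in zip(freqs, X, Y)}
    return w, losses, per_freq

if __name__ == "__main__":
    w, losses, per_freq = fit_derivative_kernel()
    print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
    for k, e in per_freq.items():
        print(f"freq {k:2d}: relative error {e:.3f}")
```

Tracking `per_freq` over training steps, rather than only at the end, is the kind of frequency-resolved view the abstract describes; the kernel size `ksize` is the architectural knob whose effect on the cross-frequency accuracy gap the paper examines.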

Submission Number: 38