Dynamic Training Guided by Training Dynamics

22 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: training dynamics, Deep Neural Networks, Koopman Operator
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: This paper builds on a concept recently proposed by researchers in the control community: the training process of a deep neural network can be viewed as a nonlinear dynamical system acting on the high-dimensional weight space. Koopman operator theory, a data-driven framework for analyzing dynamical systems, can then be used to uncover the otherwise non-intuitive training dynamics. Unlike existing approaches, which mainly exploit the prediction capability of this framework, we examine the relationship between the low-dimensional Koopman modes that describe the training dynamics and the weight evolution itself. From this we develop two novel strategies for speeding up model convergence in an online fashion: 1) a gradient acceleration strategy that improves training efficiency by pushing the slowly decaying Koopman modes to decay faster, and 2) a masking strategy that drastically reduces the computational complexity of gradient acceleration by analyzing the contribution of the corresponding Koopman modes to weight reconstruction. These strategies offer promising insights into faster and more efficient training methodologies and improve our understanding of training dynamics, which can in turn be used to control and inform the training process.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4670
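
Illustrative sketch (not taken from the submission): the abstract's idea of treating weight evolution as a dynamical system and reading off its Koopman modes can be made concrete with dynamic mode decomposition (DMD), a standard data-driven approximation of the Koopman operator. The example below is a minimal sketch under stated assumptions: the toy quadratic loss, the learning rate, and the helper names `dmd` and `reconstruct` are all hypothetical, and the code shows only how slowly decaying modes can be identified from weight snapshots. It does not implement the paper's gradient acceleration or masking strategies.

```python
import numpy as np


def dmd(snapshots, rank):
    """Exact DMD on a (n_weights, n_steps) snapshot matrix (hypothetical helper)."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    # Reduced operator A_tilde = U^H Y V S^{-1}
    A_tilde = (U.conj().T @ Y @ Vh.conj().T) / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Exact DMD modes: Phi = Y V S^{-1} W
    modes = ((Y @ Vh.conj().T) / s) @ W
    # Mode amplitudes that reproduce the first snapshot
    amps = np.linalg.lstsq(modes, X[:, 0], rcond=None)[0]
    return eigvals, modes, amps


def reconstruct(modes, eigvals, amps, k):
    """Weights at step k rebuilt from the modes (hypothetical helper)."""
    return modes @ (amps * eigvals**k)


# Toy "training run" (assumption): gradient descent on the quadratic loss
# 0.5 * w^T H w, so that w_{k+1} = (I - lr * H) w_k is exactly linear.
H = np.diag([1.0, 0.1])
lr = 0.5
w = np.array([1.0, 1.0])
history = [w.copy()]
for _ in range(20):
    w = w - lr * (H @ w)
    history.append(w.copy())
snapshots = np.stack(history, axis=1)  # shape (2, 21)

eigvals, modes, amps = dmd(snapshots, rank=2)

# The eigenvalue with magnitude closest to 1 decays most slowly.
slowest = int(np.argmax(np.abs(eigvals)))
print("DMD eigenvalues:", np.sort(eigvals.real))
print("slowest-decaying eigenvalue:", eigvals[slowest].real)
```

In this toy setting the DMD eigenvalues coincide with the per-direction contraction factors 1 - lr * h_i of gradient descent, so the eigenvalue nearest 1 marks the direction that converges most slowly. Real network training is nonlinear, so such a linear fit would hold at best locally, over a short window of snapshots.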