Accelerated Deep Learning by Gaussian Continuation

Andrew Francesco Ilersich; Prasanth B. Nair

22 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission

Keywords: homotopy, continuation, optimization, deep learning

TL;DR: Gaussian Continuation is applied to deep learning problems and shown to achieve faster and less variable training performance.

Abstract: Prior work has shown that incorporating noise into the training of deep neural networks reduces the risks of getting stuck in local minima, overfitting to the training data, and being limited by poor initialization. In this work, we consider noisy training as a special case of optimization by continuation, also known as graduated non-convexity, in which a convex version of the objective function is solved first and then slowly morphed into the original non-convex function. When using continuation in machine learning problems, we show that saddle points require special consideration, as they may get the optimizer stuck in local minima. With a form of regularization applied to the continuation optimizer, we show on several test problems that this approach reduces the risk of being trapped in local minima, leading to better training for very deep architectures and non-convex loss functions.
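
The continuation idea described in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the test function `f`, the constants `A` and `W`, the `sigmas` schedule, the step size, and all function names are assumptions chosen for illustration, and the regularization for saddle points mentioned in the abstract is not covered. For this particular `f`, the Gaussian-smoothed objective E[f(x + sigma * eps)] with eps ~ N(0, 1) has the closed form x^2 + sigma^2 - A * exp(-(W * sigma)^2 / 2) * cos(W * x), so its gradient can be written exactly and no sampling is needed.

```python
import math

# Hypothetical test problem (not from the paper):
# f(x) = x^2 - A*cos(W*x) has its global minimum at x = 0
# and several local minima further out.
A, W = 2.0, 5.0


def f(x):
    return x * x - A * math.cos(W * x)


def smoothed_grad(x, sigma):
    """Exact gradient of E[f(x + sigma*eps)], eps ~ N(0, 1).

    Gaussian smoothing damps the cosine term by exp(-(W*sigma)^2 / 2);
    sigma = 0 recovers the gradient of the original f.
    """
    damp = math.exp(-0.5 * (W * sigma) ** 2)
    return 2.0 * x + A * W * damp * math.sin(W * x)


def descend(x, sigma, steps=200, lr=0.01):
    """Plain gradient descent on the sigma-smoothed objective."""
    for _ in range(steps):
        x -= lr * smoothed_grad(x, sigma)
    return x


def continuation(x0, sigmas=(1.0, 0.8, 0.6, 0.4, 0.2, 0.1, 0.0)):
    """Solve a heavily smoothed (nearly convex) problem first, then
    reduce sigma in stages, warm-starting each stage from the last."""
    x = x0
    for sigma in sigmas:
        x = descend(x, sigma)
    return x


# Same starting point and same total number of gradient steps.
x_plain = descend(3.0, sigma=0.0, steps=1400)
x_cont = continuation(3.0)
```

Under these assumptions, plain gradient descent started at x = 3 settles in a local minimum near x ≈ 2.41, while the continuation run ends close to the global minimum at x = 0. In a deep learning setting the smoothed gradient generally has no closed form and would have to be estimated, for example by averaging gradients over noise-perturbed parameters.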

Supplementary Material: zip

Primary Area: optimization

Submission Number: 5833
