Adaptive Multiple Optimal Learning Factors for Neural Network Training

TMLR Paper 1921 Authors

10 Dec 2023 (modified: 23 Feb 2024) · Withdrawn by Authors
Abstract: This paper presents Adapt-MOLF, an algorithm that combines the strengths of second-order methods while addressing their limitations. Adapt-MOLF dynamically adjusts the number of weight groups per hidden unit to maximize the error change per multiplication, optimizing computational efficiency. Leveraging curvature-based grouping and Gauss-Newton updates, it efficiently interpolates the Hessian and negative gradients used in the computation. The two-stage algorithm alternately determines the output weights and employs multiple learning factors to train the input weights of a multilayer perceptron. This adaptive adjustment of the learning factors maximizes the error decrease per multiplication, and it outperforms OWO-MOLF and Levenberg-Marquardt (LM) across diverse datasets. Extensive experiments demonstrate competitive or superior results compared with state-of-the-art algorithms, particularly in reducing testing error. This work represents a promising advance in second-order optimization methods for neural network training, offering scalability, efficiency, and strong performance across heterogeneous datasets.
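To make the two-stage structure the abstract describes concrete, below is a minimal NumPy sketch of one OWO-MOLF-style epoch, the baseline Adapt-MOLF builds on. It assumes a single-hidden-layer MLP with sigmoid units and mean-squared error, and assigns one learning factor per hidden unit; the function name `owo_molf_epoch` and all variable names are hypothetical, and the paper's key contribution (curvature-based grouping of units into an adaptive number of weight groups) is omitted here.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def owo_molf_epoch(W, X, T, ridge=1e-6):
    """One two-stage OWO-MOLF-style update (illustrative sketch).

    W : (Nh, Nin+1) input weights (inputs X carry a bias column of 1s)
    X : (N, Nin+1)  augmented training inputs
    T : (N, Nout)   targets
    """
    N = X.shape[0]

    # Stage 1 (OWO): fix input weights, solve output weights by linear least squares.
    O = sigmoid(X @ W.T)                            # (N, Nh) hidden activations
    Ha = np.hstack([np.ones((N, 1)), O])            # augmented hidden layer
    Wo = np.linalg.lstsq(Ha, T, rcond=None)[0]      # (1+Nh, Nout) output weights
    R = T - Ha @ Wo                                 # residuals

    # Negative gradient of the MSE w.r.t. input weights, one row per hidden unit.
    Wh = Wo[1:]                                     # (Nh, Nout) hidden-to-output weights
    D = (R @ Wh.T) * O * (1.0 - O)                  # (N, Nh) backpropagated deltas
    G = (2.0 / N) * D.T @ X                         # (Nh, Nin+1) descent directions

    # Stage 2 (MOLF): one learning factor z_k per hidden unit, from a small
    # Gauss-Newton system H z = g built on the directional derivatives.
    U = O * (1.0 - O) * (X @ G.T)                   # (N, Nh): d(activation)/d z_k
    g = (2.0 / N) * np.sum(U * (R @ Wh.T), axis=0)  # -dE/dz at z = 0
    H = (2.0 / N) * (Wh @ Wh.T) * (U.T @ U)         # Gauss-Newton Hessian in z
    z = np.linalg.solve(H + ridge * np.eye(len(g)), g)

    return W + z[:, None] * G, Wo                   # scale each unit's step by z_k
```

In Adapt-MOLF, as described in the abstract, the Nh-by-Nh system above would instead be solved over a smaller, adaptively chosen number of curvature-based weight groups, trading the granularity of the learning factors against the multiplications needed per epoch.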
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Xi_Lin2
Submission Number: 1921