NAP2: Neural Networks Hyperparameter Optimization Using Weights and Gradients Analysis

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: zip
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: hyper-parameter optimization, neural networks performance prediction, meta-learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We analyze the changes in weights and gradients during DNNs' early training stages to predict their final performance.
Abstract: Recent hyperparameter tuning methods for deep neural networks (DNNs) generally rely on first using low-fidelity methods to identify promising configurations and then using high-fidelity methods for further evaluation. While effective, existing solutions treat DNNs as "black boxes", which limits their predictive abilities. In this work, we propose Neural Architectures Performance Prediction (NAP2), a "white box" hyperparameter optimization approach. NAP2 models the changes in the weights and gradients of the analyzed networks over time and can predict their final performance with high accuracy, even after a short training period. Our evaluation shows that NAP2 outperforms the current state of the art both in its ability to identify top-performing architectures and in the amount of computational resources it requires. Moreover, we show that our approach is transferable: NAP2 can be trained on one dataset and applied to another.
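To make the "white box" idea concrete, below is a minimal sketch, not the authors' implementation, of one way to turn early-training weight and gradient dynamics into features for a performance predictor. The specific statistics, the brief-training routine, and the use of a random-forest regressor are illustrative assumptions; the paper's actual feature set and predictor may differ.

```python
# Sketch: summarize how weights and gradients evolve during a short training
# run, then fit a regressor mapping those summaries to final performance.
# All feature choices and the regressor are assumptions for illustration only.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestRegressor


def weight_gradient_features(model: nn.Module) -> list[float]:
    """Fixed-length summary statistics over all weights and gradients."""
    weights = torch.cat([p.detach().flatten() for p in model.parameters()])
    grads = torch.cat([p.grad.detach().flatten()
                       for p in model.parameters() if p.grad is not None])
    return [weights.mean().item(), weights.std().item(), weights.abs().max().item(),
            grads.mean().item(), grads.std().item(), grads.abs().max().item()]


def early_training_trace(model, loader, epochs=2, lr=1e-3) -> np.ndarray:
    """Train briefly and return a feature vector describing the trajectory."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    trace = []
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            trace.append(weight_gradient_features(model))  # stats after backward
            opt.step()
    trace = np.array(trace)
    # Summarize the trajectory: average statistics plus their net change.
    return np.concatenate([trace.mean(axis=0), trace[-1] - trace[0]])


# Given traces for many candidate configurations and their observed final
# accuracies, an off-the-shelf regressor can serve as the predictor, e.g.:
#   X = np.stack([early_training_trace(build(cfg), loader) for cfg in configs])
#   predictor = RandomForestRegressor().fit(X, final_accuracies)
```

The design choice here is to keep the feature vector fixed-length regardless of architecture, so a single predictor can rank heterogeneous configurations after only a few epochs of training.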
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7026