On the selection of neural architectures from a supernet

Published: 16 May 2023, Last Modified: 15 Sept 2023, AutoML 2023 Workshop
Keywords: neural architecture search, meta-analysis, evaluation, zero-shot NAS, differentiable NAS, supernet
Abstract: After DARTS provided a method utilizing a supernet to search for neural network architectures entirely through gradient descent, differentiable supernet-based methods emerged as a powerful and popular approach to efficient neural architecture search (NAS). Subsequent works improved upon many aspects of the DARTS algorithm but generally kept the original method of selecting the final architecture: pruning the lowest-magnitude architecture weights. Critiques of this approach have, however, led to alternative architecture selection mechanisms, such as a perturbation-based method. Here we perform a broad comparative evaluation of architecture selection methods in combination with different techniques for training the supernet, and highlight the interdependence between supernet training methods and architecture selection mechanisms. We show that many NAS supernet training methods can achieve improved results with alternative architecture selection mechanisms relative to the pruning-based selection with which they were introduced and are typically evaluated. In evaluating architecture selection methods, we also demonstrate how zero-shot NAS methods may be effectively integrated into supernet NAS training as new architecture selection mechanisms.
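
The following is a minimal, hedged sketch (not the paper's code) contrasting the two families of architecture selection the abstract describes, assuming a DARTS-style supernet in which each edge carries a vector of architecture weights over candidate operations. The function names (`select_by_magnitude`, `select_by_score`) and the `score_fn` callable are hypothetical placeholders for whichever selection criterion (perturbation-based or a zero-shot proxy) is being evaluated.

```python
# Sketch of architecture selection from a DARTS-style supernet (assumed setup).
import torch

def select_by_magnitude(alpha: torch.Tensor) -> torch.Tensor:
    """Magnitude-based (DARTS-style) selection: on each edge, keep the
    operation with the largest softmaxed architecture weight.

    alpha: architecture weights of shape [num_edges, num_ops]
    returns: chosen operation index per edge, shape [num_edges]
    """
    return torch.softmax(alpha, dim=-1).argmax(dim=-1)

def select_by_score(alpha: torch.Tensor, score_fn) -> torch.Tensor:
    """Alternative selection: on each edge, keep the operation with the
    highest score under `score_fn(edge, op)`, a hypothetical callable
    standing in for a perturbation-based or zero-shot criterion.
    """
    num_edges, num_ops = alpha.shape
    scores = torch.tensor([[score_fn(e, o) for o in range(num_ops)]
                           for e in range(num_edges)])
    return scores.argmax(dim=-1)
```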
Submission Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Reviewers: Yes
CPU Hours: 0.5
GPU Hours: 25400
TPU Hours: 0
Evaluation Metrics: No
Steps For Environmental Footprint Reduction During Development: The entirety of experimental design and development was conducted using only NAS benchmarks. Further experiments were implemented in a larger search space only once experimental designs had been tested in the benchmark, in order to minimize the unnecessary training of neural networks. All initial experiments during development were conducted with the lowest number of training epochs reported. Checkpointing was implemented within neural network training scripts so that any runs which failed did not have to be restarted from scratch.
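
A minimal sketch of the checkpointing practice mentioned above (assumed, not the authors' actual script): training state is saved each epoch and reloaded after a failure so that a run resumes rather than restarts. The file path `checkpoint.pt` and the function names are illustrative only.

```python
# Sketch of save/resume checkpointing for a PyTorch training loop (assumed).
import os
import torch

CKPT_PATH = "checkpoint.pt"  # hypothetical checkpoint location

def save_checkpoint(epoch, model, optimizer):
    torch.save({"epoch": epoch,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, CKPT_PATH)

def load_checkpoint(model, optimizer):
    """Return the epoch to resume from (0 if no checkpoint exists)."""
    if not os.path.exists(CKPT_PATH):
        return 0
    state = torch.load(CKPT_PATH)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["epoch"] + 1
```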
Estimated CO2e Footprint: 3292