How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: autoML, neural architecture search, NAS, one-shot NAS, weight-sharing NAS, super-net
Abstract: Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware. Existing methods in this space rely on a diverse set of heuristics to design and train the shared-weight backbone network, a.k.a. the super-net. Since these heuristics vary substantially across methods and have not been carefully studied, it is unclear to what extent they impact super-net training and hence the weight-sharing NAS algorithms. In this paper, we disentangle super-net training from the search algorithm, isolate 14 frequently used training heuristics, and evaluate them over three benchmark search spaces. Our analysis uncovers that several commonly used heuristics negatively impact the correlation between super-net and stand-alone performance, whereas simple but often-overlooked factors, such as proper hyper-parameter settings, are key to achieving strong performance. Equipped with this knowledge, we show that simple random search achieves performance competitive with complex state-of-the-art NAS algorithms when the super-net is properly trained.
One-sentence Summary: We show that simple random search achieves performance competitive with complex state-of-the-art NAS algorithms when the super-net is properly trained.
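As a rough illustration of the claim above, the minimal sketch below shows what random search over a trained weight-sharing super-net could look like. The function names, the evaluation callback, and the toy search space are hypothetical placeholders, not the authors' code; the key idea it captures is that each sampled architecture is scored with weights inherited from the super-net rather than being trained stand-alone.

import random

def random_search(search_space, evaluate_with_supernet, num_samples=100, seed=0):
    """Rank uniformly sampled architectures by the validation score obtained
    with weights inherited from a trained super-net (no stand-alone training)."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_samples):
        arch = rng.choice(search_space)        # uniform random candidate
        score = evaluate_with_supernet(arch)   # inherited weights, no retraining
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score

# Toy usage: a tiny (depth, width) search space scored by a dummy proxy
# standing in for super-net evaluation.
if __name__ == "__main__":
    space = [(d, w) for d in (1, 2, 3) for w in (16, 32, 64)]
    proxy = lambda arch: 0.1 * arch[0] + 0.001 * arch[1]
    print(random_search(space, proxy, num_samples=5))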
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=4UipslUo0y