NAS-Bench-360: Benchmarking Diverse Tasks for Neural Architecture Search

Renbo Tu; Mikhail Khodak; Nicholas Carl Roberts; Ameet Talwalkar

NAS-Bench-360: Benchmarking Diverse Tasks for Neural Architecture Search

Renbo Tu, Mikhail Khodak, Nicholas Carl Roberts, Ameet Talwalkar

Published: 28 Jan 2022, Last Modified: 04 May 2025ICLR 2022 SubmittedReaders: Everyone

Keywords: automated machine learning, neural architecture search

Abstract: Most existing neural architecture search (NAS) benchmarks and algorithms prioritize performance on well-studied tasks, e.g., image classification on CIFAR and ImageNet. This makes the applicability of NAS approaches in more diverse areas inadequately understood. In this paper, we present NAS-Bench-360, a benchmark suite for evaluating state-of-the-art NAS methods for convolutional neural networks (CNNs). To construct it, we curate a collection of ten tasks spanning a diverse array of application domains, dataset sizes, problem dimensionalities, and learning objectives. By carefully selecting tasks that can both interoperate with modern CNN-based search methods but that are also far-afield from their original development domain, we can use NAS-Bench-360 to investigate the following central question: do existing state-of-the-art NAS methods perform well on diverse tasks? Our experiments show that a modern NAS procedure designed for image classification can indeed find good architectures for tasks with other dimensionalities and learning objectives; however, the same method struggles against more task-specific methods and performs catastrophically poorly on classification in non-vision domains. The case for NAS robustness becomes even more dire in a resource-constrained setting, where a recent NAS method provides little-to-no benefit over much simpler baselines. These results demonstrate the need for a benchmark such as NAS-Bench-360 to help develop NAS approaches that work well on a variety of tasks, a crucial component of a truly robust and automated pipeline. We conclude with a demonstration of the kind of future research our suite of tasks will enable. All data and code is made publicly available.

One-sentence Summary: We provide a benchmark for neural architecture search on a diverse set of understudied tasks.

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/nas-bench-360-benchmarking-diverse-tasks-for/code)

17 Replies

Loading