DiffAutoML: Differentiable Joint Optimization for Efficient End-to-End Automated Machine Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: Differentiable, Automated Machine Learning, Neural Architecture Search, Data Augmentation, Hyperparameter Optimization
Abstract: The automated machine learning (AutoML) pipeline comprises several crucial components, such as automated data augmentation (DA), neural architecture search (NAS), and hyper-parameter optimization (HPO). Although many strategies have been developed for automating each component in isolation, jointly optimizing them remains challenging because of the greatly increased search dimensionality and the different input types each component requires. Running the components sequentially is a common workaround, but it often demands careful coordination by human experts and may lead to sub-optimal results. In parallel, the common NAS practice of first searching for the optimal architecture and then retraining it before deployment often suffers from a performance gap between the search and retraining stages. An end-to-end solution that integrates the two stages and returns a ready-to-use model at the end of the search is therefore desirable. In view of these challenges, we propose a differentiable joint optimization solution for efficient end-to-end AutoML (DiffAutoML). Our method co-optimizes neural architectures, training hyper-parameters, and data augmentation policies in an end-to-end fashion without the need for model retraining. Experiments show that DiffAutoML achieves state-of-the-art results on ImageNet compared with end-to-end AutoML algorithms, and outperforms multi-stage AutoML algorithms with higher computational efficiency. To the best of our knowledge, we are the first to jointly optimize automated DA, NAS, and HPO in an end-to-end manner without retraining.
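For intuition only, the sketch below illustrates the general idea of a differentiable joint relaxation in the spirit the abstract describes: architecture choices, augmentation-policy choices, and a trainable hyper-parameter are all expressed as continuous parameters and updated by gradient descent on a single loss. This is a minimal illustration, not the paper's actual method; the class names (MixedOp, MixedAugment, TinySupernet), the single-level update, and the learnable loss temperature are assumptions introduced here for clarity.

```python
# Illustrative sketch (not the authors' implementation) of a DARTS-style
# continuous relaxation that couples architecture, augmentation, and
# hyper-parameter choices in one differentiable objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted mixture over candidate operations (architecture relaxation)."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture parameters

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

class MixedAugment(nn.Module):
    """Differentiable mixture over candidate augmentation magnitudes (policy relaxation)."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(3))   # augmentation-policy parameters
        self.mags = torch.tensor([0.0, 0.1, 0.2])  # candidate noise magnitudes

    def forward(self, x):
        w = torch.softmax(self.beta, dim=0)
        noise = torch.randn_like(x)
        # Expected magnitude under the relaxed policy keeps gradients w.r.t. beta.
        return x + (w * self.mags.to(x.device)).sum() * noise

class TinySupernet(nn.Module):
    def __init__(self):
        super().__init__()
        self.augment = MixedAugment()
        self.stem = nn.Conv2d(3, 8, 3, padding=1)
        self.cell = MixedOp(8)
        self.head = nn.Linear(8, 10)
        # A differentiable "hyper-parameter": a learnable loss temperature.
        self.log_temp = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        x = self.augment(x)
        x = F.relu(self.stem(x))
        x = F.relu(self.cell(x))
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)
        return self.head(x) / self.log_temp.exp()

# One joint update moves network weights, architecture parameters (alpha),
# augmentation-policy parameters (beta), and the hyper-parameter together.
model = TinySupernet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
images, labels = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
loss = F.cross_entropy(model(images), labels)
opt.zero_grad()
loss.backward()
opt.step()
```

Note that differentiable NAS methods typically split architecture and weight updates across training and validation data in a bi-level scheme; the single-level step above is a deliberate simplification to keep the example short.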
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=WjMhV-ElU
10 Replies
