Evolved Optimizer for Vision

Published: 16 May 2022, Last Modified: 05 May 2023
AutoML 2022 (Late-Breaking Workshop)
Abstract: We present an optimizer, Hero-Lion (EvoLved Sign Momentum), discovered by evolutionary search over basic math operations in the AutoML-Hero project. It tracks only the momentum and leverages the sign operation to compute the update to the weights. Despite this simplicity, Hero-Lion outperforms commonly used optimizers such as AdamW, AdafactorW, and SGD with momentum when training a variety of architectures on different tasks. Notably, it improves the accuracy of Vision Transformer by up to 2\% when trained from scratch on ImageNet. When used for pre-training with larger data and model sizes, Hero-Lion still outperforms AdamW and AdafactorW and can save 2-5x compute. On JFT-300M, ViT-L/16 trained with Hero-Lion matches the accuracy of the larger ViT-H/14 previously trained with AdamW. By replacing AdafactorW with Hero-Lion, we improve the ImageNet accuracy of ViT-G/14, pre-trained on JFT-3B, from 90.45\% to 90.71\%. Hero-Lion also improves the contrastive pre-training of multi-modal Transformers, achieving a $\sim$1\% gain in ImageNet zero-shot accuracy.
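
To make the update rule concrete, the following is a minimal sketch of a sign-momentum step consistent with the abstract's description (a single momentum buffer plus the sign operation). The function name, the interpolation coefficients beta1/beta2, and the decoupled weight-decay term are illustrative assumptions, not details taken from this page.

    import numpy as np

    def hero_lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
        # Interpolate momentum and gradient, then keep only the sign
        # (assumed form; the abstract only states "momentum" + "sign").
        update = np.sign(beta1 * m + (1.0 - beta1) * g)
        # Decoupled weight decay, an assumption mirroring AdamW-style decay.
        w = w - lr * (update + wd * w)
        # Momentum is the only optimizer state carried across steps.
        m = beta2 * m + (1.0 - beta2) * g
        return w, m

Because the update is a sign vector, every coordinate moves by the same magnitude lr (plus the decay term), which is one way the optimizer stays memory- and compute-light relative to AdamW.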
Keywords: Evolution, Optimizer, Vision
One-sentence Summary: We present an evolved optimizer for the vision domain.
Reproducibility Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Reviewers: Xiangning Chen, xiangning@cs.ucla.edu