Keywords: self-supervised learning, masked autoencoder, efficient training
Abstract: Masked Autoencoders (MAE), introduced by He et al. (2022), provide a strong framework for pre-training Vision Transformers (ViTs). In this paper, we accelerate MAE training by 59× or more with little performance drop. Our changes are simple and straightforward: in the pre-training stage, we aggressively increase the masking ratio, decrease the number of training epochs, and reduce the decoder depth to lower the pre-training cost; in the fine-tuning stage, we reveal that layer-wise learning rate decay plays a vital role in unleashing the power of pre-trained models. With this setup, we are able to pre-train a ViT-B in 12.6 hours on a single NVIDIA A100 GPU while still attaining a competitive 83.0% top-1 accuracy on the downstream ImageNet classification task. We additionally verify the speed-up on another MAE extension, SupMAE.
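Since the abstract highlights layer-wise learning rate decay as the key fine-tuning ingredient, the following is a minimal sketch of how such parameter groups can be built for a ViT-B. It assumes a timm-style model with `patch_embed`, `blocks`, and `head` attributes; all names and default values here are illustrative assumptions, not the authors' released code.

```python
import torch

def param_groups_llrd(model, base_lr=1e-3, weight_decay=0.05, layer_decay=0.75):
    """Assign each transformer block a learning rate scaled by
    layer_decay ** (num_layers - layer_id), so earlier layers train slower."""
    num_layers = len(model.blocks) + 1  # +1 treats the patch embedding as layer 0
    scales = [layer_decay ** (num_layers - i) for i in range(num_layers + 1)]

    def layer_id(name):
        # Embedding-related parameters belong to the earliest "layer".
        if name.startswith(("cls_token", "pos_embed", "patch_embed")):
            return 0
        # Transformer blocks are indexed as blocks.<i>.<...>.
        if name.startswith("blocks."):
            return int(name.split(".")[1]) + 1
        # Head and any remaining parameters use the full base learning rate.
        return num_layers

    groups = {}
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        lid = layer_id(name)
        key = f"layer_{lid}"
        if key not in groups:
            groups[key] = {"params": [], "lr": base_lr * scales[lid],
                           "weight_decay": weight_decay}
        groups[key]["params"].append(p)
    return list(groups.values())

# Hypothetical usage with a ViT-B instance `vit_b`:
# optimizer = torch.optim.AdamW(param_groups_llrd(vit_b))
```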
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
TL;DR: we significantly accelerate MAE training by 59× or more