Less is More: Selective Layer Finetuning with SubTuning

21 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Parameter Efficient Transfer Learning, Multi-task Learning, Understanding Transfer Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Instead of finetuning the whole pretrained network, we greedily select the best subset of layers to finetune. We achieve SOTA results on small datasets in CV.
Abstract: Finetuning a pretrained model has become the standard approach for training neural networks on novel tasks, leading to rapid convergence and enhanced performance. In this work, we present a parameter-efficient finetuning method, wherein we selectively train a carefully chosen subset of layers while keeping the remaining weights frozen at their initial (pretrained) values. We observe that not all layers are created equal: different layers across the network contribute variably to the overall performance, and the optimal choice of layers is contingent upon the downstream task and the underlying data distribution. We demonstrate that our proposed method, termed *subset finetuning* (or SubTuning), offers several advantages over conventional finetuning. We show that SubTuning outperforms both finetuning and linear probing in scenarios with scarce or corrupted data, achieving state-of-the-art results compared to competing methods for finetuning on small datasets. When data is abundant, SubTuning often attains performance comparable to finetuning while simultaneously enabling efficient inference in a multi-task setting when deployed alongside other models. We showcase the efficacy of SubTuning across various tasks, diverse network architectures, and pretraining methods.
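The TL;DR and abstract only sketch the method, so the snippet below is a minimal illustrative sketch (not the authors' code) of what greedy selective layer finetuning could look like in PyTorch. The helper names `make_model`, `train_and_eval`, and `candidate_layers` are hypothetical placeholders, and the greedy loop is an assumption based on the stated idea of greedily selecting the best subset of layers to finetune while freezing the rest.

```python
def freeze_all_but(model, chosen_prefixes):
    """Freeze every parameter except those whose name starts with a chosen layer prefix."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(p) for p in chosen_prefixes)


def greedy_subtuning(make_model, train_and_eval, candidate_layers, max_layers=3):
    """Greedily grow the set of finetuned layers, keeping an addition only if it improves the score.

    make_model:      callable returning a fresh pretrained model (hypothetical helper)
    train_and_eval:  callable training the trainable parameters and returning a validation score
    candidate_layers: list of parameter-name prefixes, e.g. ResNet blocks
    """
    chosen, best_score = [], float("-inf")
    for _ in range(max_layers):
        best_candidate = None
        for layer in candidate_layers:
            if layer in chosen:
                continue
            model = make_model()                     # restart from pretrained weights
            freeze_all_but(model, chosen + [layer])  # finetune only the candidate subset
            score = train_and_eval(model)            # e.g. validation accuracy
            if score > best_score:
                best_score, best_candidate = score, layer
        if best_candidate is None:                   # no remaining layer helps; stop early
            break
        chosen.append(best_candidate)
    return chosen, best_score


# Example usage with a torchvision ResNet-18 (assumed setup, not from the paper):
#   import torchvision
#   make_model = lambda: torchvision.models.resnet18(weights="IMAGENET1K_V1")
#   candidate_layers = ["layer1", "layer2", "layer3", "layer4", "fc"]
#   chosen, score = greedy_subtuning(make_model, my_train_and_eval, candidate_layers)
```

Under this reading, the per-candidate evaluation corresponds to profiling how much each layer contributes when finetuned on the downstream task, which is what would make the selected subset task- and data-dependent.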
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3723