Keywords: Transfer Learning, Model Stitching, Domain Adaptation
TL;DR: We propose a novel transfer learning method called "Self-Stitching," which inserts a convolutional "stitching layer" into pre-trained models, achieving better adaptation to new tasks across various domain gaps and data sizes.
Abstract: Transfer learning is a widely used technique in deep learning that leverages pre-trained models for new tasks. Under distribution shift, common approaches are to fine-tune only the last layer of a pre-trained model, preserving well-learned features from pre-training while adapting to the new task, or to fully fine-tune the whole model. While significant progress has been made, the increasing complexity of network designs and meta-learning algorithms, together with differences in implementation details, makes fair comparison difficult. Moreover, no single solution works well across all settings, such as varying target data sizes and types of domain shift. Inspired by model stitching, we propose a simple yet novel transfer learning method called "Self-Stitching," which inserts a single convolutional layer, the "stitching layer," inside the feature extractor of a pre-trained model. Our method improves over baselines such as linear-wise and cosine-wise transfer learning, and achieves results competitive with full fine-tuning across various domain gaps and data sizes with fewer trainable parameters, making it widely applicable and efficient.
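The mechanism described in the abstract, keeping a pre-trained feature extractor frozen and training only an inserted "stitching layer", can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the frozen blocks are NumPy stand-ins for pre-trained conv stages, the stitching layer is modeled as a 1x1 convolution (a per-pixel linear map over channels), and the identity initialization is our assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_block1(x):
    # Stand-in for early pre-trained feature layers (weights frozen).
    return np.maximum(x, 0.0)

def frozen_block2(x):
    # Stand-in for later pre-trained feature layers (weights frozen).
    return np.maximum(x, 0.0)

def stitching_layer(x, w):
    # 1x1 convolution: a linear map applied independently at each
    # spatial position. x: (N, C_in, H, W), w: (C_out, C_in).
    return np.einsum("oc,nchw->nohw", w, x)

C = 8
# The stitching weight is the only trainable tensor; initializing it
# to the identity makes the stitched model start out exactly equal
# to the original pre-trained model (an assumed design choice).
w = np.eye(C)

x = rng.standard_normal((2, C, 4, 4))
y_stitched = frozen_block2(stitching_layer(frozen_block1(x), w))
y_original = frozen_block2(frozen_block1(x))

# With identity init, inserting the stitch changes nothing; training
# would then adapt only w (C*C parameters) to the target task.
assert np.allclose(y_stitched, y_original)
```

The appeal suggested by the abstract is parameter efficiency: here only the `C*C` stitching weights would receive gradients, while every pre-trained weight stays fixed.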
Submission Number: 46