Abstract: It is common practice to reuse models initially trained on different data to increase downstream task performance. Especially in the computer vision domain, ImageNet-pretrained weights have been used successfully for various tasks. In this work, we investigate the impact of transfer learning on segmentation problems, i.e., pixel-wise classification problems that can be tackled with encoder-decoder architectures. Given a U-Net architecture, we find that transfer learning the decoder does not help downstream segmentation tasks, while transfer learning the encoder is truly beneficial. Overall, the advantageous effect of pretrained models is strongest in low-data regimes. Our investigation is therefore motivated by a real-world medical image (binary) segmentation problem in which labeled data is scarce, and we study model performance in such low-data regimes. We exemplify within our experimentation framework that pretrained weights for a decoder may yield faster convergence, but they do not improve the overall model performance, as one can obtain equivalent results with randomly initialized decoders. However, we show that it is more effective to reuse encoder weights trained on a segmentation or reconstruction task than to reuse encoder weights trained on classification tasks. Our findings suggest that model pretraining on large-scale segmentation datasets can provide encoder weights that are better suited to downstream segmentation tasks than an encoder pretrained on the ImageNet classification task. We also propose a contrastive self-supervised approach with multiple self-reconstruction tasks, which provides encoders suitable for transfer learning to segmentation problems in the absence of segmentation labels.
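To make the encoder-only transfer concrete, here is a minimal PyTorch sketch (an illustration, not the authors' code): only the encoder parameters of a small U-Net are copied from a pretrained checkpoint, while the decoder keeps its random initialization. The tiny architecture and the checkpoint file name pretrained_unet.pt are hypothetical stand-ins.

    import torch
    import torch.nn as nn

    def conv_block(in_ch, out_ch):
        # Two 3x3 convolutions with ReLU, as in standard U-Net blocks.
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    class TinyUNet(nn.Module):
        # A minimal two-level U-Net with one output channel (binary segmentation).
        def __init__(self):
            super().__init__()
            self.encoder = nn.ModuleDict({
                "enc1": conv_block(1, 32),
                "enc2": conv_block(32, 64),
            })
            self.pool = nn.MaxPool2d(2)
            self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
            self.decoder = conv_block(64, 32)  # input: upsampled features + skip connection
            self.head = nn.Conv2d(32, 1, 1)    # 1x1 conv producing binary logits

        def forward(self, x):
            s1 = self.encoder["enc1"](x)
            s2 = self.encoder["enc2"](self.pool(s1))
            d = self.decoder(torch.cat([self.up(s2), s1], dim=1))
            return self.head(d)

    model = TinyUNet()

    # Hypothetical checkpoint, e.g. from pretraining on a segmentation or
    # reconstruction task; assumed to hold a plain state_dict.
    ckpt = torch.load("pretrained_unet.pt", map_location="cpu")

    # Transfer only encoder weights; decoder and head stay randomly initialized.
    encoder_weights = {k: v for k, v in ckpt.items() if k.startswith("encoder.")}
    missing, unexpected = model.load_state_dict(encoder_weights, strict=False)
    print(f"loaded {len(encoder_weights)} encoder tensors; "
          f"{len(missing)} tensors left at random init")

Loading encoder weights pretrained on a segmentation or reconstruction task in this way corresponds to the transfer setting the abstract reports as most effective.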
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We changed the following sentences:
In the Abstract:
"We demonstrate that pretrained weights for a decoder may yield faster convergence..."
-->
"We exemplify within our experimentation framework that pretrained weights for a decoder may yield faster convergence..."
"We find that transfer learning the decoder does not help downstream segmentation tasks, while transfer learning the encoder is truly beneficial."
-->
"Given a U-Net architecture, we find that transfer learning the decoder does not help downstream segmentation tasks, while transfer learning the encoder is truly beneficial."
In the Discussion:
"First, we found out that there is little advantage in transferring weights from pretrained decoders..."
-->
"Given a U-Net architecture, we found out that there is little advantage in transferring weights from pretrained decoders..."
Assigned Action Editor: ~Yanwei_Fu2
Submission Number: 247