Highlights
• Standard L2 regularization is not adequate for transfer learning problems.
• We recommend regularizers that drive parameters towards the pre-trained model.
• Experimental results in image classification and segmentation favor this scheme.
• Analyses and some theoretical insights are provided.

Abstract
In inductive transfer learning, fine-tuning pre-trained convolutional networks substantially outperforms training from scratch. Fine-tuning rests on the assumption that the pre-trained model extracts generic features that are at least partially relevant to the target task, but that would be difficult to learn from the limited amount of data available for that task. However, apart from initialization with the pre-trained model and early stopping, fine-tuning has no mechanism for retaining the features learned on the source task. In this paper, we investigate several regularization schemes that explicitly promote the similarity of the final solution to the initial model. We show the benefit of having an explicit inductive bias towards the initial model. We ultimately recommend that the baseline protocol for transfer learning rely on a simple L2 penalty that uses the pre-trained model as a reference.
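To make the contrast between the two penalties concrete, here is a minimal sketch in plain Python. It assumes the recommended penalty takes the usual squared-distance form, (alpha / 2) * ||w - w0||^2, where w0 holds the pre-trained weights; the abstract does not spell out the formula, so this form, the function names, the coefficient `alpha`, and the toy weight values are all illustrative assumptions rather than the paper's implementation.

```python
def l2_penalty(weights, alpha):
    """Standard L2 penalty (weight decay): pulls every parameter toward zero."""
    return 0.5 * alpha * sum(w * w for w in weights)


def l2_reference_penalty(weights, reference, alpha):
    """L2 penalty centred on `reference`: pulls parameters toward the
    pre-trained values instead of toward zero (assumed form, see lead-in)."""
    if len(weights) != len(reference):
        raise ValueError("weights and reference must have the same length")
    return 0.5 * alpha * sum((w - r) ** 2 for w, r in zip(weights, reference))


def l2_reference_gradient(weights, reference, alpha):
    """Gradient of the reference-centred penalty with respect to `weights`."""
    if len(weights) != len(reference):
        raise ValueError("weights and reference must have the same length")
    return [alpha * (w - r) for w, r in zip(weights, reference)]


# Hypothetical toy values, chosen only for illustration.
pretrained = [0.8, -1.2, 0.3]  # weights learned on the source task
current = [0.9, -1.0, 0.3]     # weights partway through fine-tuning
alpha = 0.1                    # regularization strength (assumed)

penalty_toward_zero = l2_penalty(current, alpha)
penalty_toward_pretrained = l2_reference_penalty(current, pretrained, alpha)
pull = l2_reference_gradient(current, pretrained, alpha)
```

Under this sketch, the reference-centred penalty and its gradient vanish when the weights equal the pre-trained ones, so the regularizer exerts no pull at initialization and only resists drift away from the source-task solution. The standard L2 penalty, by contrast, is already nonzero at the pre-trained weights and pushes them toward zero.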