SRCPT: Spatial Reconstruction Contrastive Pretext Task for Improving Few-Shot Image Classification

Published: 01 Jan 2024 · Last Modified: 06 Feb 2025 · ICMLC 2024 · CC BY-SA 4.0
Abstract: Self-supervised learning (SSL) has been widely applied in the pretraining phase of models. Among SSL methods, the data augmentations used in contrastive learning to construct positive and negative sample pairs naturally help alleviate the data scarcity inherent in few-shot learning (FSL) tasks, and many approaches have therefore introduced contrastive learning into FSL. However, most of these methods rely only on the global embedding of the entire image, making it difficult to capture and fully exploit the local visual information and structural details of image samples. To address this, we propose a novel Spatial Reconstruction Contrastive Pretext Task (SRCPT) to enhance the FSL training objective. Through a two-branch network, the model reconstructs feature maps from local patches of the image and uses the resulting spatial reconstruction weights to form a contrastive learning objective. The enhanced FSL objective of SRCPT encourages the model to capture more transferable spatial structures and local feature information, enabling it to adapt well to new categories from only a few samples. Extensive experiments demonstrate that SRCPT achieves state-of-the-art performance on three popular benchmark datasets across three types of few-shot image classification tasks.
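The abstract describes reconstructing a query's feature map from a class's local patches and turning the reconstruction into a contrastive objective. The paper's exact formulation is not given here, so the following is only a minimal NumPy sketch under assumed details: cosine-similarity reconstruction weights with a softmax temperature, a per-class score equal to the mean similarity between the query feature map and its reconstruction, and an InfoNCE-style loss over those scores. All function names, shapes, and the temperature value are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_reconstruction_score(query_feat, support_patches, tau=0.1):
    """Reconstruct each query spatial position from one class's local
    patches and score the class by reconstruction similarity (assumed form).
    query_feat:      (n, d) -- n spatial positions of the query feature map
    support_patches: (k, d) -- k local-patch embeddings from one class
    """
    # L2-normalise so dot products act as cosine similarities
    q = query_feat / np.linalg.norm(query_feat, axis=1, keepdims=True)
    s = support_patches / np.linalg.norm(support_patches, axis=1, keepdims=True)
    # spatial reconstruction weights: contribution of each patch to each position
    w = softmax(q @ s.T / tau, axis=1)                  # (n, k)
    recon = w @ s                                       # (n, d) reconstructed map
    recon = recon / np.linalg.norm(recon, axis=1, keepdims=True)
    # mean cosine similarity between query positions and their reconstructions
    return float((q * recon).sum(axis=1).mean())

def reconstruction_contrastive_loss(query_feat, class_patches, positive_idx, tau=0.1):
    """InfoNCE-style loss over per-class reconstruction scores (assumed form)."""
    logits = np.array([spatial_reconstruction_score(query_feat, p, tau)
                       for p in class_patches])
    probs = softmax(logits / tau)
    return float(-np.log(probs[positive_idx]))
```

Intuitively, a query is reconstructed more faithfully from patches of its own class than from patches of other classes, so the loss pushes the backbone to encode transferable local structure rather than only a global embedding.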