SAPT: Saliency Augmentation and Unsupervised Pre-trained Model Fusion for Few-Shot Object Detection

Published: 01 Jan 2024 · Last Modified: 17 Apr 2025 · J. Signal Process. Syst. 2024 · CC BY-SA 4.0
Abstract: Object detection algorithms require large amounts of annotated data for training and optimization; collecting such data is time-consuming and expensive, and the scarcity of annotations limits model robustness and generalization. The natural world contains a vast range of target categories, and privacy and security concerns further complicate the collection of accurate data. Researchers are therefore developing methods that achieve fast and accurate object detection with minimal annotation. Recent few-shot object detection methods employ semi-supervised learning, domain adaptation, and meta-learning to transfer knowledge efficiently from base to novel categories. However, these methods do not directly address the primary challenge of few-shot object detection: the lack of sufficient labeled data for novel categories. This study introduces SAPT, a few-shot object detection method based on saliency-driven data augmentation and fusion with an unsupervised pre-trained model. In the data preprocessing stage, SAPT selects two images and, using ground-truth labels, crops one object of the same category from each. It then detects the maximum- and minimum-saliency circular regions of the two cropped objects and blends them to generate an augmented image that expands the training dataset. In the testing stage, the outputs of the supervised few-shot detector and the unsupervised pre-trained model are integrated and mined to generate dynamic positive and negative support images, improving the detector's accuracy. SAPT thus addresses both the shortage of labeled data for unknown classes and the low detection accuracy typical of open-world scenarios. Extensive experiments on the MS COCO and PASCAL VOC benchmarks demonstrate SAPT's superior performance compared to existing few-shot object detection methods.
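A minimal sketch of how the saliency-blend augmentation described above could be implemented, assuming OpenCV's fine-grained static saliency detector (from opencv-contrib) as the saliency estimator; the fixed circle radius, the alpha value, and the rule of pasting the most-salient circle of one crop onto the least-salient region of the other are illustrative assumptions, not the paper's exact procedure:

```python
import cv2
import numpy as np

def salient_extremes(crop):
    """Return the least- and most-salient pixel locations (x, y) of a crop,
    using opencv-contrib's fine-grained static saliency."""
    saliency = cv2.saliency.StaticSaliencyFineGrained_create()
    ok, sal_map = saliency.computeSaliency(crop)
    if not ok:
        raise RuntimeError("saliency computation failed")
    _, _, min_loc, max_loc = cv2.minMaxLoc(sal_map)
    return min_loc, max_loc

def blend_salient_circles(crop_a, crop_b, radius=16, alpha=0.5):
    """Blend the most-salient circular region of crop_a into the
    least-salient circular region of crop_b (hypothetical blending rule)."""
    h, w = crop_b.shape[:2]
    crop_a = cv2.resize(crop_a, (w, h))  # align the two same-category crops
    _, max_a = salient_extremes(crop_a)
    min_b, _ = salient_extremes(crop_b)
    # Shift crop_a so its most-salient point lands on crop_b's
    # least-salient point, then alpha-blend inside a circular mask.
    dx, dy = min_b[0] - max_a[0], min_b[1] - max_a[1]
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    shifted_a = cv2.warpAffine(crop_a, M, (w, h))
    mask = np.zeros((h, w), np.uint8)
    cv2.circle(mask, min_b, radius, 255, -1)
    blended = cv2.addWeighted(shifted_a, alpha, crop_b, 1 - alpha, 0)
    out = crop_b.copy()
    out[mask > 0] = blended[mask > 0]
    return out
```

Presumably the augmented crop is then pasted back at the original ground-truth box location to produce a new training image, though the abstract does not spell out this final step.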
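The test-time fusion step can be sketched in a similarly hedged way: the equal-weight score averaging, the two thresholds, and the dictionary structure of the detections are all illustrative assumptions, since the abstract does not specify the actual mining criterion.

```python
def mine_support_images(detections_fs, scores_ssl, pos_thr=0.8, neg_thr=0.2):
    """Fuse the supervised few-shot detector's confidences with an
    unsupervised pre-trained model's scores for the same regions, and
    mine dynamic positive / negative support crops (illustrative rule).

    detections_fs: list of {"crop": ndarray, "score": float} from the detector.
    scores_ssl:    matching per-region scores from the unsupervised model.
    """
    positives, negatives = [], []
    for det, ssl_score in zip(detections_fs, scores_ssl):
        fused = 0.5 * det["score"] + 0.5 * ssl_score  # simple average fusion
        if fused >= pos_thr:
            positives.append(det["crop"])   # high-agreement region -> positive support
        elif fused <= neg_thr:
            negatives.append(det["crop"])   # confident background -> negative support
    return positives, negatives
```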