Three-stage Training Pipeline with Patch Random Drop for Few-shot Object Detection

Shaobo Lin, Xingyu Zeng, Shilin Yan, Rui Zhao

Published: 2022, Last Modified: 11 Nov 2024. Venue: ACCV (6) 2022. License: CC BY-SA 4.0

Abstract: Self-supervised learning (SSL) designs pretext tasks that exploit the structural information of data without manual annotation, and it has been widely used in few-shot image classification to improve model generalization. However, few works explore the influence of SSL on few-shot object detection (FSOD), which is a more challenging task. Moreover, our experimental results show that, in FSOD, using a weighted sum of different self-supervised losses degrades performance compared with using a single self-supervised task. To address these problems, we first introduce SSL into FSOD by applying SSL tasks to the cropped positive samples. Second, we propose a novel self-supervised method, patch random drop, which predicts the location of a masked image patch. Finally, we design a three-stage training pipeline to combine two different self-supervised tasks. Extensive experiments on the few-shot object detection datasets Pascal VOC and MS COCO validate the effectiveness of our method.
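
To make the patch random drop idea concrete, the following is a minimal sketch of how such a pretext sample could be generated. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the grid size, the fill value, or how the patch location is encoded. Here the crop is assumed to be a 2D list of pixel values, the masked region is one cell of a hypothetical `grid` x `grid` partition filled with zeros, and the prediction target is the index of that cell. The function name `patch_random_drop` and all of its parameters are hypothetical.

```python
import random


def patch_random_drop(crop, grid=3, rng=None):
    """Mask one random grid cell of a cropped sample.

    Assumptions (not taken from the paper): `crop` is a 2D list of pixel
    values, the masked cell is filled with 0, and the location label is
    the row-major index of the masked cell.
    """
    rng = rng or random.Random()
    h, w = len(crop), len(crop[0])
    # Cell boundaries; integer division handles sizes not divisible by grid.
    ys = [i * h // grid for i in range(grid + 1)]
    xs = [j * w // grid for j in range(grid + 1)]
    # Pick the cell to drop; its index is the self-supervised target.
    label = rng.randrange(grid * grid)
    row, col = divmod(label, grid)
    # Copy so the original crop is left untouched.
    masked = [list(r) for r in crop]
    for y in range(ys[row], ys[row + 1]):
        for x in range(xs[col], xs[col + 1]):
            masked[y][x] = 0
    return masked, label


# Usage: a 6x9 crop of ones, split into a 3x3 grid of 2x3 cells.
crop = [[1] * 9 for _ in range(6)]
masked, label = patch_random_drop(crop, grid=3, rng=random.Random(0))
```

In this sketch, a classifier head would take `masked` as input and be trained to predict `label`. How such a task is scheduled alongside a second self-supervised task in the three-stage pipeline is not described in the abstract and is left out here.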