S3OD: Size-unbiased semi-supervised object detection in aerial images

Ruixiang Zhang, Chang Xu, Fang Xu, Wen Yang, Guangjun He, Huai Yu, Gui-Song Xia

Published: 01 Mar 2025 · Last Modified: 03 Nov 2025 · ISPRS Journal of Photogrammetry and Remote Sensing · CC BY-SA 4.0
Abstract: Aerial images present significant challenges to label-driven supervised learning, in particular, the annotation of substantial small-sized objects is a highly laborious process. To maximize the utility of scarce labeled data alongside the abundance of unlabeled data, we present a semi-supervised learning pipeline tailored for label-efficient object detection in aerial images. In our investigation, we identify three size-related biases inherent in semi-supervised object detection (SSOD): pseudo-label imbalance, label assignment imbalance, and negative learning imbalance. These biases significantly impair the detection performance of small objects. To address these issues, we propose a novel Size-unbiased Semi-Supervised Object Detection (S³OD) pipeline for aerial images. The S³OD pipeline comprises three key components: Size-aware Adaptive Thresholding (SAT), Size-rebalanced Label Assignment (SLA), and Teacher-guided Negative Learning (TNL), all aimed at fostering size-unbiased learning. Specifically, SAT adaptively selects appropriate thresholds to filter pseudo-labels for objects at different scales. SLA balances positive samples of objects at different sizes through resampling and reweighting. TNL alleviates the imbalance in negative samples by leveraging insights from the teacher model, enhancing the model’s ability to discern between object and background regions. Extensive experiments on DOTA-v1.5 and SODA-A demonstrate the superiority of S³OD over state-of-the-art competitors. Notably, with merely 5% of the SODA-A training labels, our method outperforms the fully supervised baseline by 2.17 points. Codes are available at https://github.com/ZhangRuixiang-WHU/S3OD/tree/master.
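To make the idea behind Size-aware Adaptive Thresholding concrete, the following is a minimal sketch of size-dependent pseudo-label filtering, not the paper's exact formulation: boxes are bucketed by area and each bucket receives its own confidence threshold, so small objects are not suppressed by a single global cutoff tuned to large, high-confidence detections. The bucket boundaries, percentile rule, and floor value are illustrative assumptions.

```python
# Illustrative sketch (assumptions, not the authors' exact SAT rule):
# per-size-bucket adaptive thresholds for filtering teacher pseudo-labels.
import numpy as np

# Hypothetical pixel-area boundaries: small / medium / large
SIZE_BINS = [0, 32**2, 96**2, float("inf")]

def size_aware_filter(boxes, scores, percentile=80.0, floor=0.3):
    """Keep pseudo-boxes whose score exceeds the adaptive threshold of their size bucket.

    boxes:  (N, 4) array in (x1, y1, x2, y2) format
    scores: (N,) classification confidences from the teacher model
    """
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    bucket = np.digitize(areas, SIZE_BINS[1:-1])  # 0=small, 1=medium, 2=large
    keep = np.zeros(len(scores), dtype=bool)
    for b in range(len(SIZE_BINS) - 1):
        mask = bucket == b
        if not mask.any():
            continue
        # Adaptive threshold: a high percentile of this bucket's scores,
        # clamped below by a fixed floor (both values are assumed here).
        thr = max(np.percentile(scores[mask], percentile), floor)
        keep |= mask & (scores >= thr)
    return keep
```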