OBEPA: Object-embedding Predictive Alignment for Semi-Supervised Object Detection
Abstract: With the rapid development of Semi-Supervised Object Detection (SSOD), the performance of object detectors has been greatly improved. Despite the promising results, existing SSOD methods mainly focus on selecting optimal pseudo-labels or alleviating the negative impact of noisy pseudo-labels, and few works have explored the effectiveness of embedding alignment for SSOD. In this paper, we propose OBject-Embedding Predictive Alignment (OBEPA) for SSOD, which introduces an embedding predictor and object-embedding contrastive learning. The embedding predictor is a small convolutional network added to the student branch before embedding alignment, which reduces the discrepancy between the teacher and student embeddings. Object-embedding contrastive learning combines a balanced hierarchical embedding clustering with an object-aware InfoNCE loss to enhance the discriminability of embeddings at the object level. Specifically, in both the student and teacher branches, we group all pixel-level representations within an image into multiple clusters. For each pixel embedding in the student branch, the positive contrastive sample is the corresponding embedding in the teacher branch, while the negative samples are drawn from other embedding clusters. Evaluation on two SSOD benchmarks, MS-COCO and Pascal VOC, demonstrates the superiority of the proposed method over previous state-of-the-art approaches.
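As a rough illustration only (not the authors' released code), the following PyTorch sketch shows how the two components described in the abstract might be wired together: a small convolutional predictor applied to the student embeddings before alignment, and an object-aware InfoNCE loss whose positive for each student pixel embedding is the teacher embedding at the same location and whose negatives are sampled from other clusters. All names here (`EmbeddingPredictor`, `object_aware_info_nce`, the temperature and sampling parameters) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EmbeddingPredictor(nn.Module):
    """Hypothetical small conv net applied to student embeddings before alignment."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(dim, hidden, kernel_size=3, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, dim, kernel_size=1),
        )

    def forward(self, x):
        # x: (B, D, H, W) student feature map; output has the same shape
        return self.net(x)


def object_aware_info_nce(student_emb, teacher_emb, cluster_ids, tau=0.1, num_neg=64):
    """Illustrative object-aware InfoNCE loss (an assumed formulation).

    student_emb, teacher_emb: (N, D) pixel-level embeddings at matched locations.
    cluster_ids: (N,) cluster assignment of each pixel (from the clustering step).
    Positive for pixel i: the teacher embedding at the same location.
    Negatives for pixel i: teacher embeddings sampled from *other* clusters.
    """
    student = F.normalize(student_emb, dim=1)
    teacher = F.normalize(teacher_emb, dim=1).detach()  # no gradient through the teacher

    pos = (student * teacher).sum(dim=1, keepdim=True) / tau  # (N, 1) positive logits

    losses = []
    for i in range(student.size(0)):
        neg_idx = (cluster_ids != cluster_ids[i]).nonzero(as_tuple=True)[0]
        if neg_idx.numel() == 0:
            continue
        perm = torch.randperm(neg_idx.numel(), device=student.device)[:num_neg]
        negs = teacher[neg_idx[perm]]                      # (K, D) sampled negatives
        neg_logits = student[i] @ negs.t() / tau           # (K,)
        logits = torch.cat([pos[i], neg_logits])           # positive is index 0
        target = torch.zeros(1, dtype=torch.long, device=student.device)
        losses.append(F.cross_entropy(logits.unsqueeze(0), target))
    return torch.stack(losses).mean() if losses else student.sum() * 0.0
```

In a full SSOD pipeline, `student_emb` would be taken from the predictor's output rather than the raw student features, and `cluster_ids` would come from the balanced hierarchical clustering step; both are sketched here only at the interface level.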