Two Teachers Are Better Than One: Semi-supervised Elliptical Object Detection by Dual-Teacher Collaborative Guidance

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, MM2024 Poster, CC BY 4.0
Abstract: Elliptical Object Detection (EOD) is crucial yet challenging due to complex scenes and varying object characteristics. Existing methods often struggle with parameter configurations and lack adaptability in label-scarce scenarios. To address these issues, a new semi-supervised teacher-student framework, Dual-Teacher Collaborative Guidance (DTCG), is proposed, comprising a five-parameter teacher detector, a six-parameter teacher detector, and a student detector. The two teachers, each specializing in a different regression approach, co-instruct the student within a unified model, preventing errors and enhancing performance. Additionally, a feature correlation module (FCM) highlights differences between teacher features and employs deformable convolution to select advantageous features for final parameter regression. A collaborative training strategy (CoT) updates the teachers asynchronously, breaking through training and performance bottlenecks. Extensive experiments conducted on two widely recognized datasets affirm the superior performance of our DTCG over other leading competitors across various semi-supervised scenarios. Notably, our method achieves a 5.61% higher performance than the second-best method when utilizing only 10% annotated data.
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: By jointly training with two teachers, we effectively address the boundary discontinuities associated with angular periodic regression, thereby overcoming an inherent difficulty in elliptical object detection and greatly improving the accuracy of object recognition in multimedia content. This enhancement is particularly crucial for applications requiring precise object detection, such as video surveillance and automatic image annotation. Moreover, our proposed dual-teacher pseudo-labelling framework enhances the semi-supervised learning paradigm. By incorporating various update strategies, it not only enhances robustness but also integrates additional semantic information to guide learning. This advancement has broader implications for multimodal learning, where leveraging unlabelled data can significantly improve model performance, especially in scenarios where labelled data is scarce or costly to acquire.
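The boundary discontinuity mentioned above can be illustrated with a small sketch. It assumes the common five-parameter ellipse form (center `cx`, `cy`, semi-axes `a`, `b`, rotation `theta`); the exact parameterizations used by the DTCG teachers are not given on this page, and the function names below are hypothetical. The sketch only shows why a plain angle difference is a poor regression target: rotating an ellipse by pi leaves its shape unchanged, so two nearly identical ellipses can have angle values that are far apart.

```python
import math

def ellipse_points(cx, cy, a, b, theta, n=8):
    """Sample n boundary points of an ellipse given in the assumed
    five-parameter form (cx, cy, a, b, theta)."""
    pts = []
    for k in range(n):
        t = 2 * math.pi * k / n
        x = cx + a * math.cos(t) * math.cos(theta) - b * math.sin(t) * math.sin(theta)
        y = cy + a * math.cos(t) * math.sin(theta) + b * math.sin(t) * math.cos(theta)
        pts.append((x, y))
    return pts

def naive_angle_error(theta_pred, theta_true):
    """Plain absolute difference; ignores the periodicity of the angle."""
    return abs(theta_pred - theta_true)

def periodic_angle_error(theta_pred, theta_true):
    """Orientation difference taken modulo pi, since theta and theta + pi
    describe the same ellipse."""
    d = abs(theta_pred - theta_true) % math.pi
    return min(d, math.pi - d)

# Two orientations that are only 2 degrees apart as shapes,
# but 178 degrees apart as raw numbers.
theta_true = math.radians(89)
theta_pred = math.radians(-89)
naive_deg = math.degrees(naive_angle_error(theta_pred, theta_true))
periodic_deg = math.degrees(periodic_angle_error(theta_pred, theta_true))
```

Here `naive_deg` is about 178 while `periodic_deg` is about 2, which is the kind of jump at the angular boundary that a regressor trained on the raw angle has to cope with.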
Supplementary Material: zip
Submission Number: 1963