Cross-Domain Soft Adaptive Teacher for Syn2Real Object Detection

Published: 01 Jan 2023, Last Modified: 05 Mar 2025, Venue: PRCV (6) 2023, License: CC BY-SA 4.0
Abstract: Current state-of-the-art object detectors are built with supervised deep-learning approaches, which require a large amount of annotated training data. Although synthetic image-generation methods can provide annotated data at scale, unsupervised transfer of object-recognition models from synthetic to real domains remains difficult because of the large gap between the domains. To mitigate this problem, we propose a general synthetic-to-real cross-domain object-detection framework. Within it, we establish a simple mean teacher model applicable to most detectors and propose a teacher–student framework named soft adaptive teacher (SAT), which leverages domain adversarial learning and domain-adaptation augmentation to address the domain gap. Specifically, we alleviate bias by augmenting the student model's training samples with image-level adaptations. Moreover, we employ feature-level adversarial training in the student model so that features derived from the source and target domains share similar distributions. Finally, we introduce the soft teacher mechanism to select reliable pseudo-labels for the teacher model. By tackling the model-bias issue with these strategies, our SAT model achieves average precision values of 57.2% (55.7%) on the Sim10k to Cityscapes (Sim10k to BDD100k) benchmarks, 3.1 (10.4) percentage points higher than the previous state-of-the-art methods. Furthermore, we achieve an average precision of 66.2% on the dataset for object detection in aerial images (DOTA), which is 31.2 percentage points higher than a Faster R-CNN model trained only on labeled source-domain images without domain adaptation.
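The abstract names two mechanisms without giving their details: a mean teacher and a soft selection of pseudo-labels. The following is only a minimal sketch of the generic ideas, not the authors' implementation. It assumes the usual mean teacher convention in which teacher weights are an exponential moving average (EMA) of student weights, and it assumes that "soft" means weighting each pseudo-label's loss by the teacher's confidence rather than applying a hard threshold. The function names, the `decay` parameter, and the dictionary-of-floats weight representation are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def ema_update(
    teacher: Dict[str, float],
    student: Dict[str, float],
    decay: float = 0.99,
) -> Dict[str, float]:
    """Return new teacher weights as an EMA of the student weights.

    Assumption: weights are scalars keyed by name; a real detector
    would hold tensors, but the update rule is the same per element.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def soft_weighted_loss(losses: List[float], scores: List[float]) -> float:
    """Average per-box losses, weighting each by the teacher's confidence.

    Assumption: this confidence weighting stands in for the paper's
    soft teacher mechanism, whose exact form the abstract does not state.
    """
    if len(losses) != len(scores):
        raise ValueError("losses and scores must have the same length")
    total_weight = sum(scores)
    if total_weight == 0.0:
        return 0.0  # no confident pseudo-labels, so no target-domain loss
    return sum(w * l for w, l in zip(scores, losses)) / total_weight
```

In a training loop of this kind, the student would be updated by gradient descent on the combined source and pseudo-labeled target losses, and `ema_update` would be called after each student step so that the teacher changes slowly and produces more stable pseudo-labels.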