Abstract: The distribution gap between training data and real-world data often causes significant performance drops in networks trained with naive supervised learning. To address this, domain generalization methods have been developed to achieve robust performance in unseen domains. In this paper, we propose a single-domain generalized object detection (S-DGOD) method. Unlike previous works, we use both image-level and feature-level augmentations and experimentally demonstrate their synergistic effects. Image-level augmentations expand the source domain, while feature-level augmentations leverage CLIP to incorporate potential domain descriptions. Our method achieves superior performance, with 29.2% mAP on Cityscapes-C and 37.1% mAP on the Diverse-Weather dataset.
External IDs: dblp:conf/wacv/ParkCK25