Abstract: Unsupervised domain adaptation (UDA) aims to adapt a model trained on the source domain (e.g. synthetic data) to the target domain (e.g. real-world data) without requiring further annotations on the target domain.
Most previous UDA methods for semantic segmentation focus on minimizing the domain discrepancy at various levels, e.g., pixels and features, to extract domain-invariant knowledge.
However, primary domain knowledge, such as context and detail correlations, remains underexplored.
To address this problem, we propose a context- and detail-enhanced unsupervised learning framework for domain-adaptive semantic segmentation, called CDEA, that promotes image detail correlations and contextual semantic consistency.
First, we propose an adaptive masked image consistency module that enhances UDA by learning the spatial context relations of the target domain; it enforces prediction consistency on masked target images.
Second, we propose a detail extraction module that enhances UDA by integrating the learning of spatial information into the low-level layers; it fuses low-level detail features with deep semantic features.
Extensive experiments verify the effectiveness of the proposed method and demonstrate the superiority of our approach over state-of-the-art methods.
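The masked image consistency idea described above can be illustrated with a minimal sketch. It is not the CDEA implementation: the abstract does not specify how the masking is made adaptive, so the sketch assumes a fixed mask ratio with uniformly random patches, and it assumes the consistency target is a pseudo-label map obtained from the unmasked target image. The function names `random_patch_mask` and `consistency_loss` and all parameters are hypothetical.

```python
import numpy as np


def random_patch_mask(height, width, patch_size, mask_ratio, rng):
    """Return a (height, width) binary mask; 0 marks masked-out pixels.

    Assumption: patches are dropped uniformly at random with a fixed
    ratio. The adaptive scheme of the paper is not reproduced here.
    """
    grid_h = -(-height // patch_size)  # ceiling division
    grid_w = -(-width // patch_size)
    keep = (rng.random((grid_h, grid_w)) >= mask_ratio).astype(np.float32)
    # Expand each grid cell to a patch_size x patch_size block of pixels.
    mask = np.kron(keep, np.ones((patch_size, patch_size), dtype=np.float32))
    return mask[:height, :width]


def consistency_loss(masked_logits, pseudo_labels):
    """Mean per-pixel cross-entropy between predictions and pseudo-labels.

    masked_logits: (C, H, W) logits predicted from the masked image.
    pseudo_labels: (H, W) integer class map, assumed to come from a
    prediction on the unmasked image.
    """
    shifted = masked_logits - masked_logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    h_idx, w_idx = np.indices(pseudo_labels.shape)
    return float(-log_probs[pseudo_labels, h_idx, w_idx].mean())
```

In use, a target image of shape (C, H, W) would be multiplied by the mask (broadcast over channels) before being passed to the segmentation network, and the resulting logits compared against the pseudo-labels with `consistency_loss`.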
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Content] Multimodal Fusion
Relevance To Conference: We propose an unsupervised learning framework for semantic segmentation that extracts the context and detail features of the source and target domains and improves scene understanding in multimedia processing.
Submission Number: 2613