Concept-Based Unsupervised Domain Adaptation

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We propose Concept-based Unsupervised Domain Adaptation (CUDA), a framework that improves interpretability and robustness of Concept Bottleneck Models under domain shifts through adversarial training.
Abstract: Concept Bottleneck Models (CBMs) enhance interpretability by explaining predictions through human-understandable concepts, but they typically assume that training and test data share the same distribution. This assumption often fails under domain shifts, leading to degraded performance and poor generalization. To address these limitations and improve the robustness of CBMs, we propose the Concept-based Unsupervised Domain Adaptation (CUDA) framework. CUDA is designed to: (1) align concept representations across domains using adversarial training, (2) introduce a relaxation threshold that allows minor domain-specific differences in concept distributions, thereby preventing the performance drop caused by over-constraining these distributions, (3) infer concepts directly in the target domain without requiring labeled concept data, enabling CBMs to adapt to diverse domains, and (4) integrate concept learning into conventional domain adaptation (DA) with theoretical guarantees, improving interpretability and establishing new benchmarks for DA. Experiments demonstrate that our approach significantly outperforms state-of-the-art CBM and DA methods on real-world datasets.
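For readers who want the mechanics concretely, below is a minimal PyTorch sketch of the two central ideas the abstract names: adversarial alignment of concept representations via a gradient-reversal layer, and a relaxation threshold that tolerates small domain gaps. This is not the authors' implementation; all names (ConceptBottleneckDA, GradReverse, tau) and the exact form of the relaxation are illustrative assumptions based only on the abstract.

```python
# Minimal sketch of the abstract's ideas; NOT the authors' code.
# All names and the exact relaxation form are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Gradient reversal layer (as in DANN, Ganin & Lempitsky 2015):
    identity on the forward pass, negated gradient on the backward pass."""

    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


class ConceptBottleneckDA(nn.Module):
    """CBM backbone: input -> concepts -> label, plus a domain
    discriminator that sees gradient-reversed concept activations,
    so training adversarially aligns concepts across domains."""

    def __init__(self, in_dim, n_concepts, n_classes):
        super().__init__()
        self.concept_net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, n_concepts))
        self.label_head = nn.Linear(n_concepts, n_classes)
        self.domain_head = nn.Sequential(
            nn.Linear(n_concepts, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        concepts = torch.sigmoid(self.concept_net(x))  # concept predictions
        logits = self.label_head(concepts)             # task predictions
        dom_logit = self.domain_head(GradReverse.apply(concepts))
        return concepts, logits, dom_logit


def train_step(model, opt, x_src, y_src, c_src, x_tgt, lam=1.0, tau=0.05):
    """One step: labeled source batch (y_src = class labels, c_src =
    binary concept annotations) and an unlabeled target batch x_tgt."""
    concepts_s, logits_s, dom_s = model(x_src)
    _, _, dom_t = model(x_tgt)

    # Supervised losses, available only in the source domain.
    task_loss = F.cross_entropy(logits_s, y_src)
    concept_loss = F.binary_cross_entropy(concepts_s, c_src)

    # Domain discriminator loss (source = 1, target = 0); the gradient
    # reversal makes the concept network confuse the discriminator.
    dom_logits = torch.cat([dom_s, dom_t]).squeeze(-1)
    dom_labels = torch.cat([
        torch.ones(len(x_src), device=dom_logits.device),
        torch.zeros(len(x_tgt), device=dom_logits.device)])
    dom_loss = F.binary_cross_entropy_with_logits(dom_logits, dom_labels)

    # Relaxation threshold (assumed form): if the discriminator is
    # already within tau of chance accuracy, drop the adversarial term,
    # tolerating minor domain-specific concept differences instead of
    # forcing the distributions to match exactly.
    with torch.no_grad():
        dom_acc = ((dom_logits > 0).float() == dom_labels).float().mean()
    align_loss = lam * dom_loss if dom_acc > 0.5 + tau else 0.0 * dom_loss

    loss = task_loss + concept_loss + align_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In this reading, tau gates the adversarial term: once the domain discriminator is within tau of chance accuracy, the concept network stops being pushed toward exact distribution matching, which is one simple way to allow minor domain-specific differences without over-constraining the concept distributions.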
Lay Summary: Many deep learning models can make impressive predictions, but their decision-making process often remains a black box. Concept Bottleneck Models (CBMs) help by basing their decisions on human-understandable concepts, like "color" or "shape". However, these models usually assume that the data they encounter in the real world resembles the data they were trained on. When this isn't true — for example, if a model trained on sunny-day images is tested on rainy-day ones — performance can drop dramatically. To solve this, we developed a new method that helps CBMs stay reliable even when facing unfamiliar data. Our approach, called Concept-Based Unsupervised Domain Adaptation, enables the model to adjust for differences between training and testing situations without needing extra labeled examples in the new domain. We also allow for small differences between the old and new domains, providing the flexibility needed when adapting a model to a new domain. Experiments show that our method significantly outperforms existing CBM and domain adaptation approaches on real-world tasks. By bridging the gap between interpretability and adaptability, our work enables AI systems to remain both understandable and effective in changing environments.
Primary Area: Social Aspects->Accountability, Transparency, and Interpretability
Keywords: Interpretability, Concepts, Domain Adaptation
Submission Number: 1203