ProGBA: Prompt Guided Bayesian Augmentation for Zero-shot Domain Adaptation

Jian Zou; Guanglei Yang; Tao Luo; Chun-Mei Feng; Wangmeng Zuo

Published: 11 Aug 2024, Last Modified: 20 Sept 2024
Venue: ECCV 2024 W-CODA Workshop, Full Paper Track
License: CC BY 4.0

Keywords: Zero-shot domain adaptation, Bayesian learning, prompt-guided augmentation

TL;DR: ProGBA is a new framework for zero-shot domain adaptation in semantic segmentation. It combines probabilistic modeling with a novel text-based loss function to improve model effectiveness, aiming to set a new standard for adaptation without direct access to target-domain data.

Subject: One/few/zero-shot learning for autonomous perception

Abstract: Domain adaptation is a well-established field within computer vision. Because target-domain data is often inaccessible, zero-shot domain adaptation has attracted growing attention. Existing methods, which primarily optimize an Empirical Risk Minimization objective, tend to train with discrete augmentations derived from a limited set of prompts. This strategy struggles to capture the complexity of the target domain and consequently reduces the effectiveness of the transferred model. In this paper, we introduce ProGBA, a novel framework that takes a Bayesian perspective and treats learning in zero-shot domain adaptation as a variational inference problem, with the aim of learning the distribution of domain-adaptive augmentations. Leveraging the regularization capabilities of Bayesian methods, ProGBA refines the domain adaptation representation space, which helps mitigate the risk of overfitting. Specifically, ProGBA captures the uncertainty associated with domain shift through probabilistic modeling of the residuals between the source and target domains, which reduces the model's reliance on a specific set of weights and thereby improves performance in the target domain. Furthermore, we adopt a pre-trained vision-language model together with a novel text-based loss function to more accurately align the learned distribution with the actual residual distribution between the target and source domains. Comprehensive validation demonstrates ProGBA's efficacy in adapting to the target domain and its potential to set a new benchmark in zero-shot domain adaptation. Moreover, extensive experiments on cross-domain semantic segmentation underscore the generalizability of our method.
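
The ideas named in the abstract (a learned distribution over source-to-target residuals, a Bayesian regularizer, and a text-based alignment term) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the variational family, the prior, or the form of the text-based loss, so the diagonal Gaussian, the standard-normal prior, the cosine alignment term, the weight `kl_weight`, and all function names below are assumptions chosen only for illustration. Random vectors stand in for features and prompt embeddings that a real system would obtain from a segmentation backbone and a pre-trained vision-language model.

```python
import math
import random


def sample_residual(mu, log_var, rng):
    """Reparameterized draw from N(mu, diag(exp(log_var))) (assumed family)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def cosine_alignment_loss(feature, text_embedding):
    """1 - cosine similarity; a hypothetical stand-in for a text-based loss."""
    dot = sum(f * t for f, t in zip(feature, text_embedding))
    norm_f = math.sqrt(sum(f * f for f in feature))
    norm_t = math.sqrt(sum(t * t for t in text_embedding))
    return 1.0 - dot / (norm_f * norm_t)


rng = random.Random(0)
dim = 8

# Stand-ins for a source-domain feature and a target-domain prompt embedding.
source_feature = [rng.gauss(0.0, 1.0) for _ in range(dim)]
target_text_embedding = [rng.gauss(0.0, 1.0) for _ in range(dim)]

# Variational parameters of the residual distribution (learned in practice).
mu = [0.0] * dim
log_var = [0.0] * dim

# Augment the source feature with one sampled residual.
residual = sample_residual(mu, log_var, rng)
augmented = [s + r for s, r in zip(source_feature, residual)]

# Negative-ELBO-style objective: alignment term plus weighted KL regularizer.
kl_weight = 1e-2  # hypothetical weighting
alignment = cosine_alignment_loss(augmented, target_text_embedding)
objective = alignment + kl_weight * kl_to_standard_normal(mu, log_var)
```

In this sketch, each training step draws a different residual, so the model sees a continuous family of augmentations rather than a fixed, discrete set, while the KL term keeps the learned distribution from collapsing onto a single point.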

Supplementary Material: pdf

Submission Number: 4
