DomainVerse: A Benchmark Towards Real-World Distribution Shifts for Training-Free Adaptive Domain Generalization

Feng Hou, Jin Yuan, Ying Yang, Yao Zhang, Yang Liu, Yang Zhang, Cheng Zhong, Zhongchao Shi, Jianping Fan, Zhiqiang He, Yong Rui

Published: 01 Jan 2025, Last Modified: 15 Mar 2026, IEEE Transactions on Multimedia, License: CC BY-SA 4.0
Abstract: Traditional cross-domain tasks, including unsupervised domain adaptation (UDA), domain generalization (DG), and test-time adaptation (TTA), rely heavily on training models with source-domain data, whether for specific or arbitrary target domains. With the recent advance of vision-language models (VLMs), recognized as natural source models that can transfer to various downstream tasks without any parameter training, we propose a novel cross-domain task that directly combines the strengths of both UDA and DG, named Training-Free Adaptive Domain Generalization (TF-ADG). However, current cross-domain datasets have many limitations, such as unrealistic domains, unclear domain definitions, and the inability to support fine-grained domain decomposition; these shortcomings hinder the real-world application of current cross-domain models by precluding accurate and fair evaluation on fine-grained realistic domains. These insights motivate us to establish a novel realistic benchmark for TF-ADG. Benefiting from the introduced hierarchical definition of domain shifts, our proposed dataset, DomainVerse, addresses these issues by providing about 0.5 million images from 390 realistic, hierarchical, and balanced domains, allowing for decomposition across multiple domains within each image. With the help of the constructed DomainVerse and VLMs, we further propose two algorithms, Domain CLIP and Domain++ CLIP, for training-free adaptive domain generalization. Extensive and comprehensive experiments demonstrate the significance of the dataset and the effectiveness of the proposed methods.