Novel Domain Extrapolation with Large Language Models

22 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: domain generalization, out-of-distribution generalization
Abstract: We study Domain Generalization (DG), which evaluates a model's ability to generalize to unseen test domains. Various augmentation strategies, such as domain augmentation, have been proposed to improve generalization under this distribution shift. However, many of these methods largely rely on interpolating existing domains and frequently struggle to create truly "novel" domains. We introduce a new approach to domain extrapolation that leverages the extensive knowledge encapsulated within large language models (LLMs) to synthesize entirely new domains. Starting with the class of interest, we query the LLMs to extract relevant knowledge for these novel domains. We then bridge the gap between the text-centric knowledge derived from LLMs and the pixel input space of the model using text-to-image generation techniques. By augmenting the training set of domain generalization datasets with high-fidelity, photo-realistic images of these new domains, we achieve significant improvements over all existing methods, as demonstrated in both single and multi-domain generalization across various benchmarks. Our empirical findings support the argument that the knowledge encoded in LLMs, together with a mechanism that bridges this text-driven knowledge and the pixel input space, is sufficient to learn a generalized model for any task. To illustrate, we put forth a far more challenging setting, termed data-free domain generalization, which aims to learn a generalized model in the absence of any collected data. Surprisingly, our proposed method exhibits commendable performance in this setting, even surpassing the supervised setting by approximately 1-2% on datasets such as VLCS.
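The abstract describes a two-stage pipeline: query an LLM for descriptions of novel domains per class, then use a text-to-image model to turn those descriptions into synthetic training images. The sketch below illustrates that flow only; the function names (`query_llm_for_domains`, `text_to_image`, `augment_training_set`) and their behavior are hypothetical placeholders, not the paper's actual implementation, and the real system would call an LLM API and a diffusion model where the stubs stand.

```python
# Hypothetical sketch of the LLM-driven domain-extrapolation pipeline
# described in the abstract. All names and stubs here are assumptions.

def query_llm_for_domains(class_name, n_domains=3):
    """Placeholder: a real system would prompt an LLM to propose novel
    visual domains (styles, contexts) in which `class_name` could appear."""
    # In practice this would be an LLM API call; canned strings stand in.
    return [f"{class_name} in novel domain {i}" for i in range(n_domains)]

def text_to_image(description):
    """Placeholder: a text-to-image model (e.g. a diffusion model) would
    synthesize a photo-realistic image from the domain description."""
    return {"image": None, "caption": description}

def augment_training_set(classes, n_domains=3):
    """Generate synthetic (sample, label) pairs for novel domains, to be
    merged with the original training pool of a DG benchmark."""
    synthetic = []
    for cls in classes:
        for domain_desc in query_llm_for_domains(cls, n_domains):
            synthetic.append((text_to_image(domain_desc), cls))
    return synthetic

# Example: 2 classes x 2 novel domains -> 4 synthetic training samples.
augmented = augment_training_set(["dog", "guitar"], n_domains=2)
print(len(augmented))  # 4
```

In the data-free setting the abstract proposes, the original training pool would be empty and the model would be trained on `augmented` alone.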
Supplementary Material: pdf
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6461