Weight Diffusion for Future: Learn to Generalize in Non-Stationary Environments

Published: 25 Sept 2024 · Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · CC BY 4.0
Keywords: domain generalization, evolving pattern, diffusion model, weight generation, domain-incremental
TL;DR: A weight diffusion approach that leverages a conditional diffusion model to learn the evolving pattern of classifier parameters across domains and then generates customized classifiers for evolving domain generalization in the domain-incremental setting.
Abstract: Enabling deep models to generalize in non-stationary environments is vital for real-world machine learning, as data distributions often change continually. Recently, evolving domain generalization (EDG) has emerged to tackle domain generalization in time-varying systems, where the domain gradually evolves over time according to an underlying continuous structure. However, EDG typically assumes that multiple source domains are available simultaneously. It remains an open problem to address EDG in the domain-incremental setting, where source domains are non-static and arrive sequentially, mimicking the evolution of training domains. To this end, we propose Weight Diffusion (W-Diff), a novel framework that uses a conditional diffusion model in parameter space to learn the evolving pattern of classifiers during domain-incremental training. Specifically, the diffusion model is conditioned on the classifier weights of a historical domain (regarded as a reference point) and the prototypes of the current domain, and learns the evolution from the reference point to the classifier weights of the current domain (regarded as the anchor point). In addition, a domain-shared feature encoder is learned by enforcing prediction consistency among multiple classifiers, which mitigates overfitting and encourages the evolving pattern to be captured in the classifier as much as possible. During inference, we ensemble a large number of target-domain-customized classifiers, cheaply generated by the conditional diffusion model, for robust prediction. Comprehensive experiments on both synthetic and real-world datasets show the superior generalization performance of W-Diff on unseen future domains.
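
To make the weight-diffusion idea concrete, below is a minimal PyTorch sketch of a conditional DDPM operating on flattened classifier weight vectors. This is an illustration under stated assumptions, not the authors' implementation: the names (WeightDenoiser, sample_classifier), the dimensions, the linear noise schedule, and the epsilon-prediction objective are all hypothetical choices; only the conditioning scheme (a historical reference classifier plus current-domain prototypes) and the reference/anchor roles follow the abstract.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical dimensions (illustrative, not from the paper).
FEAT_DIM, N_CLASSES = 512, 10
W_DIM = FEAT_DIM * N_CLASSES            # flattened classifier weight vector
COND_DIM = W_DIM + FEAT_DIM * N_CLASSES # reference weights + class prototypes
T = 1000                                # diffusion steps

# Standard DDPM linear noise schedule.
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

class WeightDenoiser(nn.Module):
    """Predicts the noise added to a flattened classifier weight vector,
    conditioned on a reference classifier and current-domain prototypes."""
    def __init__(self, w_dim=W_DIM, cond_dim=COND_DIM, hidden=1024):
        super().__init__()
        self.time_emb = nn.Embedding(T, hidden)
        self.net = nn.Sequential(
            nn.Linear(w_dim + cond_dim + hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, w_dim),
        )

    def forward(self, w_t, t, cond):
        h = torch.cat([w_t, cond, self.time_emb(t)], dim=-1)
        return self.net(h)

def diffusion_loss(model, w0, cond):
    """One training step: noise the anchor weights w0 (current-domain
    classifier) and regress the injected noise (epsilon-prediction)."""
    b = w0.size(0)
    t = torch.randint(0, T, (b,))
    eps = torch.randn_like(w0)
    ab = alpha_bar[t].unsqueeze(-1)
    w_t = ab.sqrt() * w0 + (1 - ab).sqrt() * eps
    return F.mse_loss(model(w_t, t, cond), eps)

@torch.no_grad()
def sample_classifier(model, cond):
    """Reverse diffusion: generate one customized classifier weight
    vector from Gaussian noise, given the conditioning signal."""
    w = torch.randn(cond.size(0), W_DIM)
    for i in reversed(range(T)):
        t = torch.full((cond.size(0),), i, dtype=torch.long)
        eps_hat = model(w, t, cond)
        a, ab = alphas[i], alpha_bar[i]
        w = (w - (1 - a) / (1 - ab).sqrt() * eps_hat) / a.sqrt()
        if i > 0:
            w = w + betas[i].sqrt() * torch.randn_like(w)
    return w

At test time, pairing each stored historical classifier with target-domain prototypes would give one conditioning vector; repeatedly calling sample_classifier then yields many customized classifiers whose logits (features times the reshaped weight matrix) can be averaged, mirroring the ensemble prediction the abstract describes.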
Supplementary Material: zip
Primary Area: Other
Submission Number: 9196