Keywords: Domain Generalisation, Neural Networks
TL;DR: We present a new setting beyond Traditional DG (TDG) called the Class-wise DG (CWDG) benchmark, where, for each class, we randomly select one of the domains and hold it out for testing.
Abstract: Given that neural networks generalize unreasonably well in the IID setting, Out-Of-Distribution (OOD) evaluation presents a useful failure case for studying their generalization performance. Recent studies have shown that a carefully trained ERM model performs well in Domain Generalization (DG), with training samples from all domains randomly shuffled in each batch. Later studies have further shown that DG-specific methods can boost the test performance of neural networks under distribution shift without the training data being explicitly annotated with domain information. This observation is counterintuitive, as studies on OOD failure cases have shown that, without being trained with domain knowledge, neural networks fit domain-specific features to reduce training loss. We present a new setting beyond Traditional DG (TDG) called Class-wise DG (CWDG), where, for each class, we randomly select one of the domains and hold it out for testing. Despite the network being exposed to all domains during training, our experiments show that its performance drops in this framework compared to TDG. We evaluate popular DG methods and show that their performance under the TDG and CWDG settings is not correlated. Finally, we propose a novel method called Iterative Domain Feature Masking (IDFM), which uses domain annotations in the training data and achieves state-of-the-art results on the proposed benchmark.
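The CWDG split described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code; the function name, the `(sample, class, domain)` tuple layout, and the seed handling are assumptions for the example.

```python
import random

def class_wise_dg_split(samples, seed=0):
    """Class-wise DG split: for each class, one randomly chosen domain
    is held out for testing; all remaining (class, domain) pairs are
    used for training. `samples` is a list of (x, class, domain) tuples.
    (Illustrative sketch; not the paper's official implementation.)"""
    rng = random.Random(seed)
    classes = sorted({c for _, c, _ in samples})
    domains = sorted({d for _, _, d in samples})
    # Independently pick one held-out domain per class.
    held_out = {c: rng.choice(domains) for c in classes}
    train = [s for s in samples if s[2] != held_out[s[1]]]
    test = [s for s in samples if s[2] == held_out[s[1]]]
    return train, test, held_out
```

Note that, unlike TDG (where one whole domain is unseen), every domain can appear in training here, just never paired with the class whose held-out domain it is.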