Efficient Attention for Domain Generalization

Published: 01 Jan 2023, Last Modified: 13 Nov 2024, ICONIP (9) 2023, CC BY-SA 4.0
Abstract: Deep neural networks suffer severe performance degradation under domain shift. Previous methods mainly manipulate features in the source domains to learn features that transfer to unseen domains. We propose a new perspective based on the attention mechanism, which enables the model to learn the most transferable features on the source domains and to focus dynamically on the most discriminative features on unseen domains. To this end, we introduce a domain-specific attention module that facilitates identifying the most transferable features in each domain. Unlike channel attention, our module also encodes spatial information to capture the global structure of samples, which is vital for generalization performance. To minimize the parameter overhead, we also introduce a knowledge distillation formulation to train a lightweight model that has the same attention capabilities as the original model. Specifically, we align the attention weights of the student model with the attention weights of the teacher model that correspond to the domain of the input. The results show that the distilled model performs better than its teacher and achieves state-of-the-art performance on several public datasets, i.e., PACS, OfficeHome, and VLCS. This indicates the effectiveness and superiority of our proposed approach for transfer learning and domain generalization tasks.
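The distillation step described in the abstract (matching the student's attention weights to the teacher's attention weights for the input's domain) can be illustrated with a minimal sketch. The abstract does not state which alignment loss is used, so the mean squared error below is an assumption, as are the function name `attention_alignment_loss`, the argument names, and the array layout; this is not the authors' implementation.

```python
import numpy as np

def attention_alignment_loss(student_attn, teacher_attn_by_domain, domain_ids):
    """Hypothetical alignment loss (MSE is an assumption, not from the paper).

    student_attn:            (batch, ...) attention weights of the student
    teacher_attn_by_domain:  (num_domains, batch, ...) teacher attention
                             weights, one slice per domain-specific branch
    domain_ids:              (batch,) integer source-domain label per sample
    """
    student = np.asarray(student_attn, dtype=float)
    teacher = np.asarray(teacher_attn_by_domain, dtype=float)
    domain_ids = np.asarray(domain_ids)

    # For each sample, pick the teacher weights of that sample's own domain.
    batch_idx = np.arange(student.shape[0])
    target = teacher[domain_ids, batch_idx]  # same shape as `student`

    # Mean squared difference between student and selected teacher weights.
    return float(np.mean((student - target) ** 2))
```

In this sketch the student carries a single set of attention weights while the teacher keeps one per source domain, which reflects the parameter saving the abstract mentions; how the teacher and student actually produce those weights is left out.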