Domain-Aware Knowledge Distillation for Continual Model Generalization

Published: 01 Jan 2024, Last Modified: 13 Feb 2025 · WACV 2024 · CC BY-SA 4.0
Abstract: Generalization to unseen domains is critical for Deep Neural Networks (DNNs) to perform well in real-world applications such as autonomous navigation. However, catastrophic forgetting limits the ability of domain generalization and unsupervised domain adaptation approaches to adapt to constantly changing target domains. To overcome these challenges, we propose DoSe, a Domain-aware Self-Distillation framework based on batch normalization prototypes that facilitates continual model generalization across varying target domains. Specifically, we enforce consistency of batch normalization statistics between the student and teacher models on two batches of images sampled from the same target-domain distribution. To alleviate catastrophic forgetting, we introduce a novel exemplar-based replay buffer that identifies difficult samples and periodically updates the model with them; we demonstrate that this helps preserve knowledge learned from previously seen domains. We conduct extensive experiments on two real-world datasets, ACDC and C-Driving, and one synthetic dataset, SHIFT, to verify the effectiveness of the proposed DoSe framework. On ACDC, our method outperforms the existing SOTA in the Domain Generalization, Unsupervised Domain Adaptation, and Daytime settings by 26%, 14%, and 70%, respectively.
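
Below is a minimal sketch, in PyTorch, of the two ingredients the abstract describes: a consistency loss on batch normalization statistics between a student and a teacher computed on two batches from the same target domain, and a difficulty criterion for selecting replay exemplars. The function names (bn_stats, bn_consistency_loss, ema_update, select_difficult_samples), the squared-error form of the statistics-matching loss, the EMA teacher update, and the use of prediction entropy as the difficulty proxy are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' released code) of the ideas in the abstract:
# (1) a consistency loss on batch-normalization statistics between a student and
# an EMA teacher, computed on two batches from the same target domain, and
# (2) an entropy-based proxy for picking "difficult" replay samples.
# All names and hyperparameters are assumptions.
import torch
import torch.nn as nn


def bn_stats(model: nn.Module, x: torch.Tensor):
    """Forward pass that collects per-channel batch statistics at every BN layer."""
    stats, hooks = [], []

    def hook(module, inputs, output):
        feat = inputs[0]                                   # (B, C, H, W) input to BN
        stats.append((feat.mean(dim=(0, 2, 3)),            # per-channel mean
                      feat.var(dim=(0, 2, 3))))            # per-channel variance

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(hook))
    model(x)
    for h in hooks:
        h.remove()
    return stats


def bn_consistency_loss(student, teacher, batch_a, batch_b):
    """Match student BN statistics on batch_a to teacher BN statistics on batch_b,
    where both batches are drawn from the same target-domain distribution."""
    s_stats = bn_stats(student, batch_a)
    with torch.no_grad():
        t_stats = bn_stats(teacher, batch_b)
    loss = batch_a.new_zeros(())
    for (s_mu, s_var), (t_mu, t_var) in zip(s_stats, t_stats):
        loss = loss + (s_mu - t_mu).pow(2).mean() + (s_var - t_var).pow(2).mean()
    return loss / max(len(s_stats), 1)


@torch.no_grad()
def ema_update(student, teacher, momentum=0.999):
    """Exponential-moving-average teacher update (a common self-distillation choice)."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)


@torch.no_grad()
def select_difficult_samples(model, images, k):
    """One plausible difficulty proxy: rank samples by mean prediction entropy."""
    probs = model(images).softmax(dim=1)                   # (B, C, H, W) class probs
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
    per_sample = entropy.reshape(entropy.size(0), -1).mean(dim=1)
    return per_sample.topk(min(k, images.size(0))).indices  # hardest k samples
```

In training, this consistency term would presumably be combined with the usual segmentation or adaptation objective, with the selected difficult exemplars kept in the replay buffer and used to periodically update the model; the loss weighting and update schedule are not specified in the abstract.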