Keywords: Multimodal Out-of-Distribution Detection
TL;DR: We propose a self-distillation framework that leverages dark knowledge from unimodal experts to enhance multimodal out-of-distribution (OOD) detection.
Abstract: Out-of-distribution (OOD) detection is crucial for the safe deployment of deep neural models in applications such as autonomous driving. With the increasingly multimodal nature of modern applications, attention has recently shifted toward OOD detection in multimodal settings. However, current multimodal OOD detection methods fail to fully exploit the synergy among modalities: they treat all modalities equally, disregarding their varying detection performance, and they cannot capture the diverse uncertainty information encoded at the logit level. In this paper, we propose to exploit the dark knowledge within unimodal experts as the key to revealing their synergy. To this end, we introduce a self-distillation framework for multimodal OOD detection, which leverages logits as uncertainty-aware soft targets to train a holistic model that operates in the joint embedding space of all modalities. Specifically, the proposed framework accounts for the negative effects of underperforming modalities and effectively fuses both the rich feature-level and logit-level knowledge of the modalities. As a result, our method improves the performance of current state-of-the-art multimodal OOD detection methods, achieving gains of up to 30% across diverse OOD detection benchmarks spanning two tasks and five multimodal OOD datasets.
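To make the distillation idea concrete, below is a minimal PyTorch sketch of training a student on uncertainty-aware soft targets built from unimodal experts' logits. This is not the authors' released code: the function name, the per-modality reliability weights (standing in for the paper's handling of underperforming modalities), the temperature `T`, and the mixing coefficient `alpha` are all illustrative assumptions following standard knowledge-distillation practice.

```python
# Minimal sketch (illustrative, not the paper's implementation) of distilling
# the "dark knowledge" in frozen unimodal experts' logits into a holistic
# student that operates on the joint embedding of all modalities.
import torch
import torch.nn.functional as F

def multimodal_distillation_loss(student_logits, teacher_logits_per_modality,
                                 labels, modality_weights, T=4.0, alpha=0.5):
    """Cross-entropy on labels + KL to a weighted mixture of softened
    unimodal teacher distributions.

    teacher_logits_per_modality: list of [batch, num_classes] tensors,
        one per frozen unimodal expert.
    modality_weights: hypothetical per-modality reliability weights that
        down-weight underperforming modalities (an assumption here).
    """
    # Weighted average of temperature-softened teacher distributions,
    # serving as the uncertainty-aware soft target.
    softened = torch.stack([w * F.softmax(t / T, dim=-1)
                            for w, t in zip(modality_weights,
                                            teacher_logits_per_modality)])
    soft_target = softened.sum(dim=0) / sum(modality_weights)

    # KL between the softened student distribution and the soft target,
    # scaled by T^2 as in standard knowledge distillation.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  soft_target, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```

At test time, any logit-based OOD score (e.g., maximum softmax probability or an energy score) can be computed from the student's logits; the sketch above only covers the training objective.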
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 23935