Keywords: Misclassification Detection, Multimodal Learning
TL;DR: We introduce MultiMisD, the first framework specifically designed for multimodal misclassification detection.
Abstract: The deployment of multimodal models in safety-critical applications, such as autonomous driving and medical diagnostics, requires more than high predictive accuracy; it also demands reliable mechanisms for detecting failures. In this work, we address the largely unexplored problem of misclassification detection in multimodal settings. We present MultiMisD, a novel framework specifically designed to identify such multimodal failures. Our approach is driven by a key observation: in most misclassification cases, the confidence of the multimodal prediction is significantly lower than that of at least one unimodal branch, a phenomenon we term confidence degradation. To mitigate this, we introduce an Adaptive Confidence Loss that penalizes such degradations during training. In addition, we propose Multimodal Feature Swapping, a novel outlier synthesis technique that generates challenging, failure-aware training examples. By training with these synthetic failures, MultiMisD learns to more effectively recognize and reject uncertain predictions, thereby improving overall reliability. Extensive experiments across four datasets, three modalities, and multiple evaluation settings demonstrate that MultiMisD achieves consistent and robust gains. The source code will be publicly released.
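For intuition, here is a minimal sketch of the confidence-degradation signal described in the abstract, assuming confidence is measured as the maximum softmax probability and using a simple hinge-style penalty. The function name and exact form are illustrative assumptions, not the paper's actual Adaptive Confidence Loss:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def confidence_degradation_penalty(mm_logits, uni_logits_list):
    """Hypothetical hinge penalty: positive whenever the multimodal
    confidence (max softmax probability) falls below the confidence of
    the best unimodal branch -- the "confidence degradation" pattern
    the abstract associates with misclassifications."""
    mm_conf = softmax(mm_logits).max(axis=-1)
    uni_confs = np.stack([softmax(u).max(axis=-1) for u in uni_logits_list])
    best_uni = uni_confs.max(axis=0)
    # Zero when the fused prediction is at least as confident as every
    # unimodal branch; grows with the size of the degradation otherwise.
    return np.maximum(0.0, best_uni - mm_conf).mean()
```

In this sketch, a confidently fused prediction (multimodal confidence above every unimodal branch) incurs zero penalty, while a fused prediction that is less confident than some branch is penalized in proportion to the gap.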
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 5184