Towards Multimodal Open Set Recognition

27 Sept 2024 (modified: 05 Feb 2025), Submitted to ICLR 2025, CC BY 4.0
Keywords: Open Set Recognition, Multimodal Fusion, Classification
TL;DR: We propose a new multimodal open set recognition (MMOSR) task, extending OSR to multimodal scenarios. To address the challenge of MMOSR, we introduce the multimodal representation reactivation network (MRN), achieving superior MMOSR performance.
Abstract: Open set recognition (OSR) requires deep learning models to identify unknown samples while recognizing known ones. Existing OSR studies focus on single-modal data and rarely discuss how to handle multimodal data. In this paper, we propose a new task, multimodal open set recognition (MMOSR), which extends OSR to more practical scenarios. First, we analyze the necessity of MMOSR and provide insights into the task. We find that simply combining OSR and multimodal fusion methods leads to fusion degradation. The main reason is that OSR regularization constrains the fused representations to be excessively compact, leading to deactivated and limited representations. We design the multimodal representation reactivation network (MRN) to alleviate fusion degradation by reactivating suppressed representations. MRN includes mutually enhanced fusion, which enhances representations and performs cross-modal interaction, and adaptive fusion, which captures multiple informative representations and outputs an adaptively fused prediction. The proposed method thus obtains effective and comprehensive multimodal representations and addresses the challenge of fusion degradation. Finally, extensive experiments in various settings demonstrate that the proposed method outperforms existing methods by up to 5.23% on OSCR.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 9111