Low-Redundancy Knowledge Generation and Modality-Aware Interaction for Multimodal Information Extraction in Social Media

Published: 2025 · Last Modified: 03 Mar 2026 · ICME 2025 · CC BY-SA 4.0
Abstract: Multimodal information extraction (MIE) has gained increasing attention, as it uses images as auxiliary information to improve information extraction. By acquiring entity-related knowledge, knowledge generation methods can effectively enhance the performance of information extraction models. However, current knowledge generation methods have two weaknesses: (1) they often generate knowledge containing task-irrelevant information, which introduces redundancy and degrades model performance; (2) they typically concatenate the generated knowledge directly with the text input, ignoring the stylistic and contextual differences that arise from their different sources. To address these issues, we propose Low-Redundancy Knowledge Generation and Modality-Aware Interaction (LRKG-MAI). Our approach leverages a large language model to generate task-relevant knowledge with minimal redundancy, while treating knowledge as a distinct modality that interacts with the text within its own representation space. Extensive experiments demonstrate the effectiveness of our approach. The source code can be found at https://github.com/JinFish/LRKG-MAI.
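The "distinct modality" idea can be illustrated with a minimal sketch: instead of concatenating knowledge tokens onto the text sequence, the knowledge is kept in its own projected space and the text attends to it via cross-attention. All names, shapes, and the single-head attention form below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(text, knowledge, W_q, W_k, W_v):
    """Text tokens query knowledge tokens.

    The knowledge is projected into its own key/value space rather than
    being concatenated with the text sequence, so the two sources keep
    separate representations (hypothetical sketch, not LRKG-MAI itself).
    """
    Q = text @ W_q                              # (n_text, d) queries from text
    K = knowledge @ W_k                         # (n_know, d) keys from knowledge
    V = knowledge @ W_v                         # (n_know, d) values from knowledge
    scores = Q @ K.T / np.sqrt(Q.shape[-1])     # scaled dot-product scores
    return softmax(scores) @ V                  # (n_text, d) knowledge-enriched text

# Toy example: 5 text tokens attend to 3 knowledge tokens, dim 8.
rng = np.random.default_rng(0)
d = 8
text = rng.normal(size=(5, d))
knowledge = rng.normal(size=(3, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
fused = cross_modal_attention(text, knowledge, W_q, W_k, W_v)
print(fused.shape)  # one fused vector per text token
```

Because the attention output has one row per text token, it can be added back to the text representation (e.g. via a residual connection) without the sequence-length growth that direct concatenation causes.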