Low-Redundancy Knowledge Generation and Modality-Aware Interaction for Multimodal Information Extraction in Social Media

Published: 2025 · Last Modified: 03 Mar 2026 · ICME 2025 · CC BY-SA 4.0
Abstract: Multimodal information extraction (MIE) has gained increasing attention, as it uses images as auxiliary information to improve information extraction. By acquiring entity-related knowledge, knowledge generation methods can effectively enhance the performance of information extraction models. However, current knowledge generation methods have two weaknesses: (1) they often generate knowledge containing task-irrelevant information, which introduces redundancy and degrades model performance; (2) they typically concatenate the generated knowledge directly with the text input, ignoring the stylistic and contextual differences that arise from their different sources. To address these issues, we propose Low-Redundancy Knowledge Generation and Modality-Aware Interaction (LRKG-MAI). Our approach leverages a large language model to generate task-relevant knowledge with minimal redundancy, while treating knowledge as a distinct modality that interacts with the text within its own representation space. Extensive experiments demonstrate the effectiveness of our approach. The source code can be found at https://github.com/JinFish/LRKG-MAI.
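The "distinct modality" idea can be illustrated with a minimal sketch: instead of concatenating knowledge tokens onto the text sequence, the knowledge is kept in its own projected space and the text attends to it via cross-attention. All names, shapes, and the single-head attention form below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(text, knowledge, W_q, W_k, W_v):
    """Text tokens query knowledge tokens.

    The knowledge is projected into its own key/value space rather than
    being concatenated with the text sequence, so the two sources keep
    separate representations (hypothetical sketch, not LRKG-MAI itself).
    """
    Q = text @ W_q                              # (n_text, d) queries from text
    K = knowledge @ W_k                         # (n_know, d) keys from knowledge
    V = knowledge @ W_v                         # (n_know, d) values from knowledge
    scores = Q @ K.T / np.sqrt(Q.shape[-1])     # scaled dot-product scores
    return softmax(scores) @ V                  # (n_text, d) knowledge-enriched text

# Toy example: 5 text tokens attend to 3 knowledge tokens, dim 8.
rng = np.random.default_rng(0)
d = 8
text = rng.normal(size=(5, d))
knowledge = rng.normal(size=(3, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
fused = cross_modal_attention(text, knowledge, W_q, W_k, W_v)
print(fused.shape)  # one fused vector per text token
```

Because the attention output has one row per text token, it can be added back to the text representation (e.g. via a residual connection) without the sequence-length growth that direct concatenation causes.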