LBMKGC: Large Model-Driven Balanced Multimodal Knowledge Graph Completion

Yuan Guo; Qian Ma; Hui Li; Qiao Ning; Furui Zhan; Yu Gu; Ge Yu; Shikai Guo

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, NeurIPS 2025 poster, CC BY-NC 4.0

Keywords: Multi-modal Knowledge Graphs, Knowledge Graph Completion, Cross-modal Interaction, Large Vision-Language Model

TL;DR: To tackle the challenges of imbalance and heterogeneity in multimodal knowledge graph completion (MMKGC), this paper proposes a novel MMKGC framework that uses a large vision-language model, cross-modal alignment, and adaptive multimodal fusion.

Abstract: Multi-modal Knowledge Graph Completion (MMKGC) aims to predict missing entities, relations, or attributes in knowledge graphs by collaboratively modeling the triple structure and multimodal information (e.g., text, images, videos) associated with entities. This approach facilitates the automatic discovery of previously unobserved factual knowledge. However, existing MMKGC methods encounter several critical challenges: (i) the imbalance of inter-entity information across different modalities; (ii) the heterogeneity of intra-entity multimodal information; and (iii) the inconsistency, for a given entity, of the informational contributions of different modalities across contexts. In this paper, we propose a novel **L**arge model-driven **B**alanced **M**ultimodal **K**nowledge **G**raph **C**ompletion framework, termed LBMKGC. To bridge the semantic gap between heterogeneous modalities, LBMKGC semantically aligns the multimodal embeddings of entities using the CLIP (Contrastive Language-Image Pre-Training) model. Furthermore, LBMKGC adaptively fuses multimodal embeddings with relational guidance by distinguishing between the perceptual and conceptual attributes of triples. Finally, extensive experiments conducted against 21 state-of-the-art baselines demonstrate that LBMKGC achieves superior performance across diverse datasets and scenarios while maintaining efficiency and generalizability. Our code and data are publicly available at: https://github.com/guoynow/LBMKGC.

Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)

Submission Number: 21361
