Abstract: Although Multimodal Large Language Models (MLLMs) have achieved remarkable performance on various complex tasks, they still face challenges in understanding Knowledge Graphs (KGs), which are typical graphs with structured semantics. In this paper, we conduct a comprehensive evaluation of MLLMs' capability in this respect and investigate the key factors influencing their performance in understanding and reasoning over KGs across different dimensions, with a particular focus on the recognition of textual triples in KGs. Our study yields several key findings and insights that contribute to advancing this research domain.
We find that MLLMs indeed have limitations in understanding complicated KGs, which is primarily attributed to their poor recognition of textual triples in KGs, particularly for graphs with special layouts or high density. On this basis, we propose a fine-tuning method to enhance the understanding capabilities of MLLMs on KGs, achieving an accuracy increase of 7.3% over the baseline model.
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: Evaluation of Knowledge Graph Understanding, Multimodal Large Language Model
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 2036