Are LLMs Really Knowledgeable for Knowledge Graph Completion?

Yang Liu, Zequn Sun, Zhoutian Shao, Yuanning Cui, Wei Hu

Published: 01 Jan 2026, Last Modified: 12 Mar 2026. License: CC BY-SA 4.0
Abstract: Knowledge Graph (KG) completion aims to infer new facts from existing knowledge. While recent efforts have explored leveraging large language models (LLMs) for this task, it remains unclear whether LLMs truly understand KG facts or how they utilize such knowledge in reasoning. In this work, we investigate these questions by proposing ProbeKGC, a benchmark dataset that reformulates KG completion as multiple-choice question answering with systematically controlled option difficulties. Empirical results show that LLMs often produce inconsistent answers when the same question is presented with varying distractor difficulty, suggesting a reliance on shallow reasoning such as elimination rather than genuine knowledge recall. To better quantify model confidence and knowledge grasp, we introduce Normalized Knowledge Divergence (NKD), a novel metric that complements accuracy by capturing distributional confidence in answer selection. We further analyze the influence of selection biases on LLM predictions and highlight that LLMs do not always fully exploit their stored knowledge. Finally, we evaluate three enhancement strategies and provide insights into potential directions for improving KG completion.