A Knowledge Graph-Driven Benchmark for Knowledge Conflict Detection

ACL ARR 2025 February Submission 8311 Authors

16 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, Readers: Everyone, License: CC BY 4.0
Abstract: Knowledge conflicts often arise in retrieval-augmented generation (RAG) systems, where retrieved documents may be inconsistent with each other or contradict the model’s parametric knowledge. Existing benchmarks for knowledge conflict detection have notable limitations, including a narrow focus on the question answering (QA) setup, heavy reliance on entity substitution techniques, and a limited range of conflict types. To address these gaps, we propose a knowledge graph (KG)-based data construction framework for knowledge conflict detection that leverages the explicit relational structure of KGs to ensure greater diversity, complexity, and interpretability. Experimental results on the new benchmark offer insights into how LLMs handle knowledge conflict: both open-source and proprietary LLMs struggle with conflict detection, particularly in multi-hop reasoning, and often fail to pinpoint the exact source of contradictions. These findings highlight the need for more robust benchmarks and improved methodologies for enhancing LLM reliability in conflict-aware reasoning.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: Hardness of samples
Languages Studied: English
Submission Number: 8311