RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs

ACL ARR 2025 July Submission 700 Authors

28 Jul 2025 (modified: 01 Sept 2025), ACL ARR 2025 July Submission, CC BY 4.0
Abstract: Answering complex real-world questions often requires accurate retrieval from textual knowledge graphs (TKGs), as the relational path information from TKGs could enhance the inference ability of Large Language Models (LLMs). However, the bottlenecks include the scarcity of existing TKGs, the limited expressiveness of their topological structures, and the lack of comprehensive evaluations of current retrievers on TKGs. To tackle these challenges, we first develop a Dataset for LLMs Complex Reasoning over Textual Knowledge Graphs (RiTeK) with a broad topological structure coverage. We synthesize realistic user queries that integrate diverse topological structures, relational information, and complex textual descriptions. We conduct rigorous expert evaluation to validate the quality of our synthesized queries. RiTeK also serves as a comprehensive benchmark dataset designed to evaluate the capabilities of retrieval systems built on LLMs. By assessing 11 representative retrievers on this benchmark, we observe that existing methods struggle to perform well, revealing notable limitations in current LLM-driven retrieval approaches. These findings highlight the pressing need for more effective retrieval systems tailored for semi-structured data.
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: Textual Knowledge Graphs, complex reasoning
Contribution Types: Data resources
Languages Studied: English
Submission Number: 700