Keywords: Question Answering, Knowledge Driven Reasoning, Retrieval Augmented Generation
TL;DR: In this paper, we introduce GRAPE, an encoder-only framework for multi-hop QA that replaces the extra LLM calls used for knowledge-base navigation in RAG pipelines with path encodings, matching state-of-the-art accuracy while reducing inference time by up to 85%.
Abstract: Since the introduction of retrieval-augmented generation (RAG), a standard component of large language model (LLM) reasoning pipelines has been the navigation of a knowledge base (KB) to generate answers grounded in retrieved sources. However, recent studies show that LLMs struggle with complex queries requiring deep reasoning and interdependent knowledge, often leading to hallucinations. While several methods have been proposed to mitigate this issue, most rely on multiple additional calls to the LLM to decompose and validate reasoning steps, thereby increasing inference cost and latency. In this paper, we introduce GRAPE (Graph Reasoning with Anonymous Path Encoders), a framework that leverages path encodings over uncertain nodes and relations in knowledge graphs (KGs) to heuristically guide KB navigation. Rather than depending on a fully LLM-native retrieval pipeline, GRAPE replaces repeated model calls with encoder-only models that act as a semantic fuzzy query-matching engine. Experiments across multiple multi-hop QA benchmarks show that GRAPE achieves up to 85% faster inference than LLM-based pipelines, while consistently matching or exceeding state-of-the-art accuracy. These results demonstrate that encoder-only hybrid reasoning pipelines provide a practical and scalable alternative to expensive LLM-native retrieval, combining efficiency, robustness, and strong generalization.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 18412