Keywords: multi-hop retrieval, graph retrieval augmented generation
Abstract: Graph-based retrieval-augmented generation increasingly relies on multi-hop retrieval, where answering a query requires composing multiple connected knowledge-graph triplets.
However, existing retrievers often rank triplets independently using global semantic matching, and common multi-hop benchmarks provide only final answers. As a result, retrievers are left without adaptation to query-triplet alignment, and structurally necessary but weakly aligned facts are missed. To address these issues, we propose a novel knowledge-aligned multi-hop retriever, KAMR, which distinguishes anchor triplets, which are strongly constrained by the query, from connected triplets, which are weakly aligned with the query yet structurally linked to the anchors.
To tackle the absence of query-triplet alignment, we build a partial alignment dataset by masking triplet elements and prompting an LLM to generate corresponding queries, and optimize two contrastive objectives for pair-level and element-level matching.
At inference time, KAMR retrieves anchors globally and expands locally to return connected evidence.
Across two benchmarks, three LLM backbones, and fourteen baselines, KAMR consistently improves both multi-hop retrieval and downstream question answering.
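The inference procedure described above (retrieve anchors globally, then expand locally) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `retrieve_with_expansion`, the parameters `k_anchor` and `k_expand`, the precomputed per-triplet scores, and the rule that "connected" means sharing a head or tail entity with an anchor are all assumptions made for illustration.

```python
from collections import defaultdict


def retrieve_with_expansion(triplets, scores, k_anchor=1, k_expand=1):
    """Hypothetical anchor-then-expand retrieval over (head, relation, tail) triplets.

    scores[i] is an assumed query-triplet similarity for triplets[i];
    how KAMR actually computes it is not specified here.
    """
    # Global step: rank every triplet by score and keep the top k_anchor as anchors.
    order = sorted(range(len(triplets)), key=lambda i: scores[i], reverse=True)
    anchors = order[:k_anchor]

    # Index triplets by the entities they mention so neighbors can be looked up.
    by_entity = defaultdict(set)
    for i, (head, _, tail) in enumerate(triplets):
        by_entity[head].add(i)
        by_entity[tail].add(i)

    # Local step: for each anchor, add the best-scoring unseen triplets
    # that share an entity with it (assumed notion of "connected").
    selected = list(anchors)
    seen = set(anchors)
    for a in anchors:
        head, _, tail = triplets[a]
        neighbors = (by_entity[head] | by_entity[tail]) - seen
        best = sorted(neighbors, key=lambda i: scores[i], reverse=True)[:k_expand]
        selected.extend(best)
        seen.update(best)
    return [triplets[i] for i in selected]


# Toy example: the second triplet is weakly aligned with the query (low score)
# but is linked to the anchor through the shared entity "France".
toy_triplets = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Berlin", "capital_of", "Germany"),
    ("Tokyo", "capital_of", "Japan"),
]
toy_scores = [0.9, 0.2, 0.5, 0.4]
toy_result = retrieve_with_expansion(toy_triplets, toy_scores, k_anchor=1, k_expand=1)
```

In this toy setting, independent top-2 ranking would return the Paris and Berlin triplets, whereas the anchor-then-expand sketch returns the Paris triplet together with the connected France triplet despite its low score.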
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: Generation, Information Retrieval and Text Mining
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 4417