Keywords: Textual Graph Question Answering, LLM-based Retrieval, GraphRAG
Abstract: Textual graph question answering (GraphQA) and graph-based retrieval-augmented generation (GraphRAG) have gained increasing attention as ways to ground large language models (LLMs) in structured knowledge.
Despite extensive efforts, existing methods suffer from two fundamental limitations.
First (coverage), retrievers may fail to recall the evidence nodes required for reasoning, resulting in insufficient coverage of the gold answer set.
Second (compactness), naive attempts to boost recall typically lead to an explosion of retrieved subgraph size, introducing excessive irrelevant information that compromises compactness and overwhelms the LLM.
The gold answer set provides critical supervision for retrieving relevant information while filtering out irrelevant content, yet it is rarely exploited by existing methods.
In this work, we propose TALENT, a learnable retriever that utilizes gold answer signals to enhance both retrieval coverage and compactness.
Specifically, we stratify graph nodes into three distinct levels: gold answers, answer-related nodes, and irrelevant noise.
We then adopt a weighted pseudo-label loss that prioritizes gold answers and preserves answer-related nodes for coverage, while discarding irrelevant noise for compactness.
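As a concrete illustration of the idea (not the paper's actual implementation, whose details are not given in this abstract), the three-level stratification and weighted pseudo-label loss could be sketched as a weighted binary cross-entropy: gold and answer-related nodes receive a positive target (retain for coverage), noise nodes a zero target (discard for compactness), with a hypothetical higher weight on gold answers. The weight values, scoring setup, and level names below are assumptions.

```python
import math

# Hypothetical per-level weights: gold answers are emphasized most;
# the exact values are assumptions for illustration.
LEVEL_WEIGHTS = {"gold": 3.0, "related": 1.0, "noise": 1.0}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_pseudo_label_loss(scores, levels):
    """Weighted binary cross-entropy over raw retriever scores.

    scores: list of N raw logits, one per candidate node.
    levels: list of N strings in {"gold", "related", "noise"}.
    Gold and related nodes get target 1 (retrieve, for coverage);
    noise nodes get target 0 (filter out, for compactness).
    """
    total = 0.0
    for s, lv in zip(scores, levels):
        target = 1.0 if lv in ("gold", "related") else 0.0
        p = sigmoid(s)
        bce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
        total += LEVEL_WEIGHTS[lv] * bce
    return total / len(scores)

# Toy example: one candidate node per stratum.
scores = [2.0, 0.5, -1.5]
loss = weighted_pseudo_label_loss(scores, ["gold", "related", "noise"])
# loss ≈ 0.352
```

Under this sketch, up-weighting the gold stratum makes missed gold answers the most expensive retrieval error, while the zero target on noise nodes penalizes oversized subgraphs.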
Experimental results on two GraphQA benchmark datasets demonstrate that our approach consistently improves downstream performance.
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: Retrieval-Augmented Language Models
Contribution Types: NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 7799