REPANA: Reasoning Path Navigated Program Induction for Universally Reasoning over Heterogeneous Knowledge Bases
Abstract: Program induction is a typical approach for helping Large Language Models (LLMs) answer complex, knowledge-intensive questions over knowledge bases (KBs) while alleviating hallucination. However, accurate program induction requires extensive high-quality parallel data for a specific KB, which is scarce for low-resource KBs. Moreover, the heterogeneity of questions and KB schemas limits the transferability of models trained on a single dataset. To this end, we propose REPANA, a reasoning path navigated program induction framework that enables LLMs to reason over heterogeneous KBs. We decouple program induction into perceiving the KB and mapping questions to program sketches. Accordingly, our framework consists of (1) an LLM-based navigator, which retrieves reasoning paths of the input question from the given KB; and (2) a KB-agnostic parser trained on multiple heterogeneous datasets, which takes the retrieved paths and the question as input and generates the corresponding program. Experiments show that REPANA exhibits strong generalization and transferability. It can directly perform inference on datasets not seen during training, outperforming other SoTA low-resource methods and even approaching the performance of supervised methods.
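The two-stage decoupling described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the toy KB, the function names `navigate`, `parse`, and `execute`, and the `Find`/`Relate` operations are all hypothetical, and both LLM components are replaced here by simple rule-based stand-ins (breadth-first path enumeration and keyword overlap) purely to show the data flow from question, to reasoning paths, to program.

```python
# Toy KB as (head, relation, tail) triples; purely illustrative data.
TOY_KB = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "born_in", "London"),
    ("Inception", "released_in", "2010"),
]


def navigate(topic_entity, kb, max_hops=2):
    """Stand-in for the navigator (an LLM in the paper).

    Enumerates every relation path of up to max_hops hops that starts
    at the topic entity.
    """
    paths = []
    frontier = [(topic_entity, [])]
    for _ in range(max_hops):
        next_frontier = []
        for entity, path in frontier:
            for head, rel, tail in kb:
                if head == entity:
                    new_path = path + [rel]
                    paths.append(new_path)
                    next_frontier.append((tail, new_path))
        frontier = next_frontier
    return paths


def parse(question, topic_entity, paths):
    """Stand-in for the KB-agnostic parser (a trained model in the paper).

    Picks the path whose relation words overlap most with the question
    and emits a program over hypothetical Find/Relate operations.
    """
    q_tokens = set(question.lower().replace("?", "").split())

    def score(path):
        return sum(1 for rel in path for word in rel.split("_") if word in q_tokens)

    best = max(paths, key=score)
    return [("Find", topic_entity)] + [("Relate", rel) for rel in best]


def execute(program, kb):
    """Runs the program against the KB and returns the set of answers."""
    current = set()
    for op, arg in program:
        if op == "Find":
            current = {arg}
        elif op == "Relate":
            current = {t for h, r, t in kb if h in current and r == arg}
    return current


question = "Where was the person who directed Inception born?"
paths = navigate("Inception", TOY_KB)
program = parse(question, "Inception", paths)
answer = execute(program, TOY_KB)
```

In this sketch only `navigate` and `execute` touch the KB, while `parse` sees just the question and the retrieved paths; that separation is the property the abstract attributes to the KB-agnostic parser.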
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: knowledge base QA, multihop QA, parameter-efficient-training, data augmentation, NLP in resource-constrained settings
Contribution Types: Approaches to low-resource settings
Languages Studied: English
Submission Number: 3006