Abstract: Large language models (LLMs) are susceptible to generating hallucinated information, even with the integration of retrieval-augmented generation (RAG). Parallel context extension (PCE) is a line of research that attempts to effectively integrate parallel (unordered) contexts, but it still suffers from in-context hallucination when adapted to RAG scenarios. In this paper, we propose **DePaC** (**De**hallucinating **Pa**rallel **C**ontext Extension), which alleviates the in-context hallucination problem with **context-aware negative training** and **information-calibrated aggregation**. DePaC is designed to alleviate two types of in-context hallucination: **fact fabrication** (i.e., LLMs present claims that are not supported by the contexts) and **fact omission** (i.e., LLMs fail to present claims that can be supported by the contexts). Specifically, (1) for fact fabrication, we apply context-aware negative training, which fine-tunes LLMs with negative supervision, explicitly guiding them to refuse to answer when the contexts are not related to the questions; (2) for fact omission, we propose information-calibrated aggregation, which prioritizes context windows with a higher information increment from their contexts. Experimental results on nine RAG tasks demonstrate that DePaC significantly alleviates both types of in-context hallucination and consistently achieves better performance on these tasks.
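To make the abstract's two mechanisms concrete, below is a minimal, hypothetical Python sketch of how information-calibrated aggregation and the refusal behavior instilled by negative training could interact at inference time. It is not the paper's implementation: the `Model` interface, the NLL-difference scoring of the information increment, and the refusal string are all illustrative assumptions.

```python
"""Hypothetical sketch of DePaC-style inference over parallel context windows.
All interfaces and scoring details are assumptions for illustration only."""
from typing import Optional, Protocol, Sequence


class Model(Protocol):
    def nll(self, question: str, context: Optional[str]) -> float:
        """Negative log-likelihood of the model's answer to `question`,
        optionally conditioned on a context window."""
        ...

    def answer(self, question: str, context: str) -> str:
        """Generate an answer conditioned on a single context window."""
        ...


def information_increment(model: Model, question: str, context: str) -> float:
    # How much this context window reduces the model's uncertainty about the
    # question (larger = more informative window).
    return model.nll(question, None) - model.nll(question, context)


def depac_style_answer(model: Model, question: str, windows: Sequence[str]) -> str:
    # Score each parallel (unordered) context window independently, then
    # answer from the most informative one; refuse when no window adds
    # information, mirroring the refusal behavior from negative training.
    scored = [(information_increment(model, question, w), w) for w in windows]
    best_score, best_window = max(scored, key=lambda pair: pair[0])
    if best_score <= 0.0:
        return "The given contexts do not support an answer."
    return model.answer(question, best_window)
```

A usage note on the design choice sketched here: scoring each window independently keeps the aggregation parallel (no window ordering is imposed), while the increment-based prioritization is one plausible way to realize "prioritizing context windows with higher information increment."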
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: retrieval-augmented generation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2841