Regularized Contrastive Decoding with Hard Negative Samples for Hallucination Mitigation

ACL ARR 2025 May Submission 6879 Authors

20 May 2025 (modified: 03 Jul 2025)
License: CC BY 4.0
Abstract: Large Language Models have achieved significant advances in various natural language processing tasks. However, they are susceptible to generating hallucinations (fabricated or inaccurate statements presented as factual information), which can undermine their reliability in high-stakes applications. To address this issue, we propose a new inference-stage hallucination mitigation method, Regularized Contrastive Decoding (RCD), which exploits hard negative samples to improve the robustness of contrastive decoding. Additionally, we design a new adversarial-aware regularization term to fine-tune the hallucination model so that it learns more challenging and diverse hallucination patterns from the available data under the guidance of adversarial perturbations. This enhances the contrastive decoding process, enabling more effective identification and filtering of erroneous content. We conduct experiments on four public hallucination benchmarks. Experimental results show that our method consistently achieves better hallucination mitigation performance, demonstrating the effectiveness and superiority of RCD.
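For context, below is a minimal sketch of the standard contrastive decoding step that RCD builds on: an expert LM is contrasted against a hallucination-prone (amateur) LM over the vocabulary at each step. The model inputs, the weight alpha, and the plausibility threshold beta are illustrative assumptions, not the paper's exact RCD formulation, which additionally fine-tunes the hallucination model with an adversarial-aware regularization term.

```python
# Sketch of a single contrastive decoding step (assumed baseline, not the
# authors' exact method). Tokens the expert prefers but the hallucination-prone
# model also prefers are penalized, steering generation away from hallucinations.
import torch
import torch.nn.functional as F

def contrastive_decode_step(expert_logits: torch.Tensor,
                            amateur_logits: torch.Tensor,
                            alpha: float = 1.0,
                            beta: float = 0.1) -> int:
    """Pick the next token by contrasting an expert LM with a
    hallucination-prone (amateur) LM over the vocabulary.

    expert_logits, amateur_logits: 1-D tensors of shape [vocab_size].
    alpha: weight on the amateur penalty (illustrative value).
    beta: plausibility cutoff relative to the expert's top token (illustrative).
    """
    expert_logp = F.log_softmax(expert_logits, dim=-1)
    amateur_logp = F.log_softmax(amateur_logits, dim=-1)

    # Plausibility constraint: only keep tokens the expert itself assigns
    # at least beta times the probability of its single most likely token.
    cutoff = expert_logp.max() + torch.log(torch.tensor(beta))
    plausible = expert_logp >= cutoff

    # Contrastive score: reward tokens the expert prefers and the
    # hallucination-prone model does not.
    scores = expert_logp - alpha * amateur_logp
    scores = scores.masked_fill(~plausible, float("-inf"))
    return int(scores.argmax().item())
```

In this reading, RCD's contribution lies in how the amateur model is obtained: fine-tuning it with hard negative samples and an adversarial-aware regularizer so the contrast above filters erroneous content more effectively.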
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: factuality, hardness of samples, inference methods
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 6879