Harder is Better: Hard Hallucination-Induced Contrastive Decoding for Hallucination Mitigation

ACL ARR 2024 December Submission 2339 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract:

Large Language Models (LLMs) have achieved significant advances across a wide range of natural language processing tasks. However, they are susceptible to generating hallucinations, i.e., fabricated or inaccurate statements presented as factual information, which undermines their reliability in high-stakes applications. To address this issue, we propose HiCD, a new inference-stage method for hallucination mitigation based on hard hallucination-induced contrastive decoding. HiCD injects hard-to-detect hallucinations to enhance the robustness of contrastive decoding during inference. We introduce an adversarial-aware strategy for finetuning hallucination models so that they learn more precise and diverse hallucination patterns from the available hallucination data. This strengthens the contrastive decoding process, enabling more effective identification and filtering of erroneous content. We evaluate HiCD on four hallucination benchmarks. Experimental results show consistent and significant improvements across all metrics, demonstrating the effectiveness and superiority of HiCD for hallucination mitigation.
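The abstract does not give implementation details, but the inference-stage mechanism it builds on is standard contrastive decoding: next-token scores from the base LLM are contrasted against scores from a hallucination-prone model, so tokens that the hallucination model over-prefers are penalized. The sketch below illustrates one such decoding step on toy logits; the function name and the hyperparameters `alpha` (contrast strength) and `beta` (plausibility cutoff) are illustrative assumptions, not names from the paper.

```python
import torch

def contrastive_decode_step(base_logits: torch.Tensor,
                            halluc_logits: torch.Tensor,
                            alpha: float = 1.0,
                            beta: float = 0.1) -> torch.Tensor:
    """One generic contrastive decoding step (assumed setup, not HiCD's exact method).

    base_logits   -- next-token logits from the original LLM, shape (vocab,)
    halluc_logits -- next-token logits from the hallucination-finetuned model
    alpha         -- strength of the contrastive penalty (hypothetical hyperparameter)
    beta          -- plausibility cutoff relative to the base model's top token
    """
    base_logprobs = torch.log_softmax(base_logits, dim=-1)
    halluc_logprobs = torch.log_softmax(halluc_logits, dim=-1)

    # Adaptive plausibility constraint: only keep tokens whose base-model
    # probability is within a factor `beta` of the most likely token.
    cutoff = base_logprobs.max() + torch.log(torch.tensor(beta))
    plausible = base_logprobs >= cutoff

    # Contrast: reward the base model, penalize what the hallucination model prefers.
    scores = base_logprobs - alpha * halluc_logprobs
    return scores.masked_fill(~plausible, float("-inf"))

# Toy example over a 5-token vocabulary.
base = torch.tensor([2.0, 1.5, 0.2, -1.0, -3.0])
halluc = torch.tensor([0.5, 2.5, 0.1, -1.0, -3.0])   # over-prefers token 1
next_token = contrastive_decode_step(base, halluc).argmax().item()
print(next_token)  # token 0: plausible under the base model, not hallucination-favored
```

HiCD's contribution, per the abstract, lies in making the hallucination model harder, i.e., finetuning it adversarially to produce hard-to-detect hallucination patterns so that the contrast above filters erroneous content more effectively.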

Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: security and privacy, hardness of samples, inference methods
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2339