Large Language Models (LLMs) have achieved significant advances across a wide range of natural language processing tasks. However, they are susceptible to generating hallucinations (fabricated or inaccurate statements presented as factual information), which can undermine their reliability in high-stakes applications. To address this issue, we propose HiCD, a new inference-stage method for hallucination mitigation that injects hard-to-detect hallucinations to strengthen contrastive decoding during inference. An adversarial-aware strategy is introduced to finetune the hallucination model so that it learns more precise and diverse hallucination patterns from the available hallucination data. This enhances the contrastive decoding process, enabling more effective identification and filtering of erroneous content. We evaluate HiCD on four hallucination benchmarks. Experimental results show consistent and significant improvements across all metrics, demonstrating the effectiveness and superiority of HiCD for hallucination mitigation.
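To make the contrastive decoding component concrete, the sketch below shows a generic single-step formulation in PyTorch: the hallucination-injected model's logits are subtracted from the base model's logits under an adaptive plausibility constraint. This is a minimal illustration of standard contrastive decoding, not the paper's exact formulation; the function name, `alpha`, and `beta` are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def contrastive_decode_step(base_logits, halluc_logits, alpha=0.1, beta=1.0):
    """One decoding step of generic contrastive decoding (illustrative sketch).

    base_logits:   next-token logits from the base LLM, shape (vocab_size,)
    halluc_logits: next-token logits from the hallucination-injected model
    alpha:         adaptive plausibility threshold (hypothetical default)
    beta:          strength of the contrastive penalty (hypothetical default)
    """
    base_probs = F.softmax(base_logits, dim=-1)
    # Adaptive plausibility constraint: restrict to tokens the base model
    # itself assigns reasonably high probability.
    mask = base_probs >= alpha * base_probs.max()

    # Contrast: favor tokens the base model prefers but the
    # hallucination-prone model does not.
    scores = base_logits - beta * halluc_logits
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.argmax(scores, dim=-1)  # greedy pick; sampling also works

# Toy usage with random logits over a small vocabulary.
vocab_size = 8
next_token = contrastive_decode_step(torch.randn(vocab_size), torch.randn(vocab_size))
print(next_token.item())
```

In this framing, the adversarial-aware finetuning described above would sharpen `halluc_logits` so that the subtraction more reliably penalizes hallucinated continuations.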