Correcting Hallucinations in News Summaries: Exploration of Self-Correcting LLM Methods with External Knowledge
Abstract: While large language models (LLMs) have shown remarkable capabilities in generating coherent text, they suffer from hallucinations -- factual inaccuracies. Self-correcting systems are especially promising for tackling hallucinations: they leverage the multi-turn nature of LLMs to iteratively generate verification questions that request additional evidence, answer them with internal or external knowledge, and use the answers to refine the original response. These methods have been explored for encyclopedic generation, but less so for domains such as news summarization. In this work, we investigate two state-of-the-art self-correcting systems, applying them to hallucinated news summaries with three search engines as external knowledge sources and evaluating the corrected outputs. We analyze the results and provide qualitative insights into the systems' performance, with practical findings on G-Eval and human evaluation and on the benefits of search snippets and few-shot prompts.
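To make the verify-and-revise loop described in the abstract concrete, here is a minimal Python sketch of a generic self-correcting pipeline with external knowledge. It is an illustration only, not the paper's actual implementation; the callables llm and search, the prompt wording, and the single-round default are all assumptions.

from typing import Callable, List


def self_correct(summary: str,
                 llm: Callable[[str], str],
                 search: Callable[[str], str],
                 rounds: int = 1) -> str:
    """Iteratively verify a summary against external evidence and revise it.

    `llm` is a hypothetical stand-in for a chat-model call; `search` is a
    hypothetical stand-in for a search-engine query returning snippets.
    """
    for _ in range(rounds):
        # 1. Ask the model for verification questions about claims in the summary.
        questions = llm(
            "List verification questions, one per line, for the factual "
            f"claims in this summary:\n{summary}"
        ).splitlines()

        # 2. Answer each question using external evidence (e.g., search snippets).
        evidence: List[str] = []
        for q in questions:
            snippets = search(q)
            answer = llm(f"Question: {q}\nEvidence:\n{snippets}\nAnswer concisely:")
            evidence.append(f"Q: {q}\nA: {answer}")

        # 3. Revise the summary so it is consistent with the collected evidence.
        summary = llm(
            "Revise the summary so it is consistent with the evidence.\n"
            f"Summary:\n{summary}\n\nEvidence:\n" + "\n".join(evidence)
        )
    return summary

In practice, the llm and search callables would wrap a specific model API and one of the three search engines studied; the loop structure itself is what the self-correcting systems share.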
Paper Type: Short
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking, automatic evaluation of datasets, evaluation, metrics
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3952