A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Actively Validating Low-Confidence Generation


16 Dec 2023 | ACL ARR 2023 December Blind Submission | Readers: Everyone
TL;DR: We propose an approach that actively detects and mitigates LLM hallucinations during the generation process.
Abstract: Recently developed large language models (LLMs) have achieved remarkable success in generating fluent and coherent text. However, these models often 'hallucinate', which critically hampers their reliability. In this work, we address this problem and propose an approach that actively detects and mitigates hallucinations during the generation process. Specifically, we first identify candidates for potential hallucination using the model's logit output values, check their correctness through a validation procedure, mitigate the detected hallucinations via prompting, and then continue with the generation process. This active intervention also helps prevent hallucinations from propagating through the LLM's output. Through extensive experiments with GPT-3.5 (text-davinci-003) on the 'article generation task', we first show that the proposed approach reduces hallucinations from 47.5% to 14.5%. We then demonstrate the effectiveness and wide applicability of our approach through additional experiments with different types of questions (multi-hop and false premise) and with an LLM from a different model family (Vicuna). In summary, our work contributes to improving the reliability and trustworthiness of LLMs, a crucial step toward enabling their widespread adoption.
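The detect, validate, mitigate, and continue loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the use of the minimum token probability as the confidence score, and the stubbed generator, validator, and repair callbacks are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type aliases for the three pluggable steps (assumptions, not from the paper).
Draft = Tuple[str, Dict[str, List[float]]]   # (sentence, concept -> token probabilities)
Generator = Callable[[List[str]], Draft]     # produces the next sentence given accepted context
Validator = Callable[[str, str], bool]       # True if the concept in the sentence checks out
Repairer = Callable[[str, List[str]], str]   # rewrites the sentence to fix failed concepts


def low_confidence_concepts(
    concept_token_probs: Dict[str, List[float]], threshold: float = 0.5
) -> List[str]:
    """Flag concepts whose least-confident token is below the threshold.

    Scoring by the minimum token probability is an assumed choice here.
    """
    return [
        concept
        for concept, probs in concept_token_probs.items()
        if probs and min(probs) < threshold
    ]


def generate_with_validation(
    generate_sentence: Generator,
    validate: Validator,
    repair: Repairer,
    max_sentences: int = 3,
    threshold: float = 0.5,
) -> List[str]:
    """Generate sentence by sentence, repairing each one before continuing."""
    accepted: List[str] = []
    for _ in range(max_sentences):
        sentence, concept_probs = generate_sentence(accepted)
        # Step 1: pick hallucination candidates from token-level confidence.
        flagged = low_confidence_concepts(concept_probs, threshold)
        # Step 2: validate only the flagged candidates.
        failed = [c for c in flagged if not validate(sentence, c)]
        # Step 3: mitigate before the sentence becomes context for later ones.
        if failed:
            sentence = repair(sentence, failed)
        accepted.append(sentence)
    return accepted


# Toy stand-ins for the LLM, the validation procedure, and the repair prompt.
def toy_generate(context: List[str]) -> Draft:
    drafts = [
        ("Ada Lovelace was born in 1815.", {"Ada Lovelace": [0.9, 0.95], "1815": [0.8]}),
        ("She was born in Paris.", {"Paris": [0.2]}),
    ]
    return drafts[len(context)]


def toy_validate(sentence: str, concept: str) -> bool:
    return concept != "Paris"  # pretend an external check rejects "Paris"


def toy_repair(sentence: str, failed: List[str]) -> str:
    return sentence.replace("Paris", "London")


result = generate_with_validation(toy_generate, toy_validate, toy_repair, max_sentences=2)
print(result)
```

Because each sentence is repaired before it is appended to the accepted context, a corrected sentence rather than the hallucinated one conditions all later generation, which is the propagation-prevention effect the abstract mentions.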
Paper Type: long
Research Area: Generation
Contribution Types: NLP engineering experiment
Languages Studied: English
