Light-Weight Hallucination Detection using Contrastive Learning for Conditional Text Generation

Published: 22 Jun 2025 (Last Modified: 22 Jun 2025) · ACL-SRW 2025 Poster · CC BY 4.0
Keywords: LLM, hallucination, safety, hallucination detection
Abstract: We propose a simple, light-weight, yet effective hallucination detection method for conditional text generation. Hallucinated outputs contain information that is absent from, or difficult to infer from, the input context. Leveraging this property, we add a contrastive learning objective to the hallucination detection classifier that pulls faithful outputs and their input contexts together while pushing hallucinated outputs apart. Experimental results confirm that our method, built on top of RoBERTa, improves binary hallucination detection performance, outperforming prompting with the much larger GPT-4o. Remarkably, our method performs better on outputs where hallucinated spans are sparse.
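The contrastive objective described above can be sketched as an InfoNCE-style loss over sentence embeddings: the similarity between a context and its faithful output is treated as the positive pair, and hallucinated outputs serve as negatives. The sketch below is illustrative only; the paper's actual formulation, temperature, and embedding choice (e.g. RoBERTa representations) may differ, and the toy vectors here merely stand in for encoder outputs.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def contrastive_loss(context, faithful, hallucinated, temperature=0.1):
    """InfoNCE-style loss: pull (context, faithful) together,
    push (context, hallucinated_i) apart. `temperature` is a
    hypothetical hyperparameter, not taken from the paper."""
    pos = np.exp(cosine_sim(context, faithful) / temperature)
    negs = sum(np.exp(cosine_sim(context, h) / temperature)
               for h in hallucinated)
    return -np.log(pos / (pos + negs))

# Toy embeddings standing in for encoder (e.g. RoBERTa) vectors.
rng = np.random.default_rng(0)
ctx = rng.normal(size=8)
faithful = ctx + 0.1 * rng.normal(size=8)        # near the context
halluc = [rng.normal(size=8) for _ in range(3)]  # unrelated directions

print(contrastive_loss(ctx, faithful, halluc))
```

Minimizing this loss drives faithful outputs toward their contexts in embedding space, so a downstream classifier can flag outputs that remain distant as likely hallucinations.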
Archival Status: Archival
Paper Length: Short Paper (up to 4 pages of content)
Submission Number: 155