Keywords: llm, generative models, language modeling, nlg, uncertainty estimation, aleatoric uncertainty, semantic entropy, importance sampling
TL;DR: We introduce SDLG, an efficient technique for accurately estimating aleatoric semantic uncertainty to detect LLM hallucinations.
Abstract: Large language models (LLMs) suffer from hallucination, where they generate text that is not factual. Hallucinations impede many societal and industrial applications of LLMs because they make LLMs untrustworthy. It has been suggested that hallucinations result from predictive uncertainty: if an LLM is uncertain about the semantic meaning it should generate next, it is likely to start hallucinating. We introduce Semantic-Diverse Language Generation (SDLG) to quantify the predictive uncertainty of LLMs. Our method detects whether a generated text is hallucinated by providing a precise measure of aleatoric semantic uncertainty. Experiments demonstrate that SDLG consistently outperforms existing methods while being the most computationally efficient, setting a new standard for uncertainty estimation in NLG.
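To make the notion of semantic uncertainty concrete, the sketch below estimates entropy over *meanings* rather than surface strings: sampled generations are grouped into semantic-equivalence clusters, and entropy is computed over the aggregated cluster probabilities. This is a minimal, hedged illustration of the general semantic-entropy idea the abstract builds on, not the paper's SDLG algorithm; the function name and the assumption that cluster assignments and sample probabilities are given as input are illustrative only.

```python
import math
from collections import defaultdict

def semantic_entropy(samples):
    """Estimate entropy over semantic clusters of sampled generations.

    `samples` is a list of (cluster_id, probability) pairs, where
    cluster_id groups generations that share the same meaning (e.g. as
    judged by an entailment model) and probability is the model's
    likelihood for that sample. Both inputs are assumed precomputed.
    """
    cluster_mass = defaultdict(float)
    for cluster_id, prob in samples:
        cluster_mass[cluster_id] += prob
    total = sum(cluster_mass.values())
    # Entropy over semantic clusters rather than surface strings:
    # paraphrases of the same answer contribute to one cluster.
    return -sum(
        (m / total) * math.log(m / total) for m in cluster_mass.values()
    )
```

For example, two paraphrases of the same answer collapse into one cluster and yield zero entropy, whereas probability mass split across semantically distinct answers yields high entropy, which is the signal used to flag likely hallucinations.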
Submission Number: 32