Investigating the Link Between Factual Prevalence and Hallucinations in LLMs via Academic Author Prediction Tasks

ACL ARR 2025 May Submission6049 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) like GPT-4 are widely used for question answering but are prone to hallucinations. Fact-conflicting hallucinations, which contradict established knowledge, are especially concerning in domains like scientific research. While hallucination detection has been studied, the causes, particularly the role of factual prevalence, remain underexplored. In this work, we hypothesize that hallucinations are more likely for less prevalent topics. Using citation count as a proxy for prevalence, we curated a Q&A dataset of 4,000 papers across four disciplines and prompted GPT-4-turbo to predict authorship. Responses were evaluated using self-assessment under two definitions of hallucination. Our analysis shows a general inverse correlation between hallucination rate and citation count, with the strongest trend observed under a narrow definition of hallucination for most disciplines.
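To make the described pipeline concrete, the following is a minimal sketch of how one might prompt GPT-4-turbo for author prediction and correlate a per-paper hallucination flag with citation count. It is not the authors' code: the prompt wording, the `is_hallucinated` surname check, and the paper record fields (`title`, `authors`, `citations`) are illustrative assumptions, and the submission's self-assessment evaluation and two hallucination definitions are not reproduced here.

```python
# Illustrative sketch only (assumed prompt and evaluation, not from the submission).
from openai import OpenAI
from scipy.stats import spearmanr

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def predict_authors(title: str) -> str:
    """Ask GPT-4-turbo to name the authors of a paper, given only its title."""
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{
            "role": "user",
            "content": f"Who are the authors of the paper titled '{title}'? "
                       "List the author names only.",
        }],
        temperature=0,
    )
    return resp.choices[0].message.content


def is_hallucinated(predicted: str, true_authors: list[str]) -> bool:
    """Crude stand-in for the paper's evaluation: flag a hallucination if no
    true author surname appears in the model's answer."""
    return not any(a.split()[-1].lower() in predicted.lower() for a in true_authors)


def hallucination_vs_citations(papers: list[dict]) -> float:
    """Spearman correlation between citation count and a 0/1 hallucination flag;
    the paper's hypothesis predicts a negative coefficient."""
    citations, flags = [], []
    for p in papers:
        answer = predict_authors(p["title"])
        citations.append(p["citations"])
        flags.append(int(is_hallucinated(answer, p["authors"])))
    rho, _pval = spearmanr(citations, flags)
    return rho
```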
Paper Type: Short
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: large language models, hallucination
Contribution Types: Model analysis & interpretability
Languages Studied: English, language-agnostic
Submission Number: 6049