Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

Published: 01 Mar 2026, Last Modified: 01 Mar 2026
Venue: AI4Peace
License: CC BY 4.0
Track: previously published paper
Keywords: historical revisionism, llm evaluation
Abstract: Large language models (LLMs) are increasingly consulted for historical information by citizens, journalists, and institutions, raising concerns about their tendency to reproduce or amplify historical revisionism: the distortion, omission, or reframing of established facts. We introduce HistoricalMisinfo, a curated dataset of 500 contested events from 45 countries, each paired with factual and revisionist narratives. To approximate real-world dissemination, we design 11 prompt scenarios per event, capturing diverse ways historical content is elicited and framed. Using this benchmark, we evaluate multiple medium-sized LLMs and find systematic vulnerabilities: the prevalence of revisionist outputs varies across models, countries, and prompt types. HistoricalMisinfo provides a practical foundation for auditing the reliability of generative systems and for developing safeguards against the spread of revisionist narratives.
Submission Number: 7