Faltering on the Long-Tail: LLM Knowledge Stability Disparities and the Roles of Encoding Redundancy and Associative Memory

ACL ARR 2025 May Submission 5470 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) exhibit significant disparities in the stability of their factual knowledge, struggling in particular with Long-Tail (LT) topics compared to dominant (DT) ones. This study introduces poison pills, a novel localized perturbation technique, to precisely quantify this differential stability. Our experiments consistently demonstrate that LT knowledge is substantially more susceptible to corruption than DT knowledge. We propose and experimentally validate two primary underlying mechanisms: encoding redundancy, whereby the reduced redundancy of smaller or compressed models markedly heightens LT susceptibility; and associative memory, whereby the propagation of induced changes along conceptual links ("contamination contagion") corroborates this mechanism and reveals a distinct susceptibility pattern in DT knowledge when associatively linked entities are jointly perturbed. These neuro-inspired findings offer crucial insights into how LLMs encode knowledge, revealing intrinsic, type-specific vulnerabilities. Practically, our work uncovers critical robustness-efficiency trade-offs in model compression and informs pathways toward more broadly reliable LLMs.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: Interpretability and Analysis of Models for NLP, Language Modeling
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 5470