Insist when Know, Caution when Not Know? Unveiling LLMs' Fact-Checking Behavior Amidst Knowledge Conflicts
Keywords: Fact-checking, Knowledge Conflicts, LLMs, Faithfulness
Abstract: By augmenting large language models (LLMs) with external evidence, tool augmentation has become a prominent approach to addressing the limitations of their static parametric knowledge in fact-checking. However, the extent to which LLMs accept external evidence when it conflicts with their internal knowledge remains unclear. Moreover, do LLMs behave consistently when they parametrically \textsc{Know} versus \textsc{Not-Know} a specific claim? In this work, we introduce the first fine-grained evaluation framework to systematically probe LLMs’ fact-checking behavior under knowledge conflicts. Our experiments reveal that most LLMs resist conflicting external evidence when confident (\textsc{Know}) but are more receptive to it when uncertain (\textsc{Not-Know}). We further demonstrate that some models (e.g., GPT-4o-mini and Llama3-8B) achieve a better balance between openness to correct information and resistance to inaccurate evidence, whereas others (e.g., DeepSeek-V3 and Gemini-2.5) tend to be either overcautious or overly credulous.
To address the challenge of balancing parametric and external knowledge, we propose a test-time algorithm that explicitly computes the Jensen-Shannon divergence over sampled prediction probabilities, enabling faithful arbitration between external evidence and parametric knowledge. Our method performs competitively against eight baselines on our constructed FactConf datasets, improving LLM-based factuality systems under knowledge conflicts.
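For readers unfamiliar with the quantity involved, the following is a minimal sketch of how a Jensen-Shannon divergence between two sampled verdict distributions could be computed. It is not the authors' algorithm: the label set, the function names, and the idea of comparing a closed-book run against an evidence-augmented run are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical verdict labels; the abstract does not specify the label set.
LABELS = ("SUPPORTED", "REFUTED", "NOT_ENOUGH_INFO")


def empirical_distribution(samples, labels=LABELS):
    """Turn a list of sampled verdicts into a probability vector over labels."""
    if not samples:
        raise ValueError("need at least one sampled verdict")
    counts = Counter(samples)
    return [counts[label] / len(samples) for label in labels]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Illustrative samples only (not data from the paper): verdicts drawn
    # without evidence versus with a conflicting piece of evidence in context.
    closed_book = ["SUPPORTED"] * 9 + ["REFUTED"]
    with_evidence = ["REFUTED"] * 7 + ["SUPPORTED"] * 3
    p = empirical_distribution(closed_book)
    q = empirical_distribution(with_evidence)
    print(f"JSD = {js_divergence(p, q):.3f} bits")
```

A value near 0 means the evidence barely shifted the sampled verdicts, while a value near 1 means the two runs almost never agree. How such a score would be turned into an arbitration decision is left open here, since the abstract does not describe that step.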
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 5026