When Do LLMs Listen? Confidence-Guided Knowledge Acceptance in LLMs

ICLR 2026 Conference Submission 20763 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Large Language Models (LLMs), Knowledge Graphs (KGs), In-Context Learning (ICL), Model Confidence, Knowledge-Augmented Reasoning
TL;DR: We study how LLMs respond to Knowledge Graph injections during in-context learning. High-confidence predictions resist rival or noisy knowledge, while lower-confidence ones are more influenced, offering insights into LLM robustness and design.
Abstract: Large Language Models (LLMs) exhibit remarkable performance across a wide range of natural language tasks but face limitations in accessing dynamic or domain-specific knowledge not encountered during pre-training. To address this, recent research has explored integrating structured external knowledge from Knowledge Graphs (KGs) into LLMs via in-context learning (ICL). Although KG-augmented in-context reasoning has shown strong performance on commonsense tasks such as multiple-choice question answering (MCQA), the extent of LLMs’ dependence on external knowledge remains poorly understood. Prior work has primarily examined which knowledge to extract from KGs and how to represent it to optimize prompts and task accuracy. In this study, we shift the focus to whether and when LLMs accept or resist injected knowledge. We introduce a confidence-guided framework that stratifies model predictions into high-, moderate-, and low-certainty bands: high means the model assigns dominant probability to a single choice, moderate reflects a few competing options with similar weight, and low corresponds to a diffuse distribution with no clear preference. We then examine how knowledge interventions reshape probability distributions over candidate answers. Interventions include (i) supportive knowledge reinforcing the model’s initial choice, (ii) rival knowledge aligned with alternative answers, and (iii) noisy off-topic statements. Our analysis reveals systematic patterns: highly confident predictions largely resist rival or noisy evidence, whereas moderate- and low-confidence predictions are more susceptible to shifts when exposed to rival information. The model is most likely to switch between answer choices assigned similar mid-level confidence, while low-confidence options may gain probability mass, but rarely enough to overturn the final decision. Noisy knowledge, in contrast, induces only minor changes in the confidence distribution across choices. By moving beyond accuracy to behavioral response, this work provides a principled view of LLM robustness to knowledge augmentation and highlights design considerations for effective KG-enhanced question answering.
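The sketch below illustrates one plausible reading of the confidence-guided analysis described in the abstract: bucketing an MCQA answer distribution into high/moderate/low bands and measuring how a knowledge injection shifts it. The function names (`confidence_band`, `total_variation_shift`, `answer_flipped`), the thresholds (0.7, 0.4), and the choice of total-variation distance are illustrative assumptions, not the paper's reported setup.

```python
# Illustrative sketch (not the authors' code): stratify an MCQA prediction by
# confidence and quantify how an injected-knowledge prompt reshapes the
# distribution over answer options. Thresholds and metrics are assumptions.

from typing import Dict


def confidence_band(option_probs: Dict[str, float],
                    high_thresh: float = 0.7,
                    low_thresh: float = 0.4) -> str:
    """Bucket a normalized distribution over answer options into a band."""
    top = max(option_probs.values())
    if top >= high_thresh:
        return "high"       # one option holds dominant probability
    if top >= low_thresh:
        return "moderate"   # a few options compete with similar weight
    return "low"            # diffuse distribution, no clear preference


def total_variation_shift(before: Dict[str, float],
                          after: Dict[str, float]) -> float:
    """Total-variation distance between pre- and post-injection distributions."""
    options = set(before) | set(after)
    return 0.5 * sum(abs(before.get(o, 0.0) - after.get(o, 0.0)) for o in options)


def answer_flipped(before: Dict[str, float], after: Dict[str, float]) -> bool:
    """Did the argmax answer change after the knowledge injection?"""
    return max(before, key=before.get) != max(after, key=after.get)


if __name__ == "__main__":
    # Hypothetical distributions for one item, before and after injecting
    # rival knowledge aligned with option "B".
    before = {"A": 0.48, "B": 0.40, "C": 0.08, "D": 0.04}
    after = {"A": 0.35, "B": 0.52, "C": 0.09, "D": 0.04}

    print(confidence_band(before))                     # "moderate"
    print(round(total_variation_shift(before, after), 3))  # 0.13
    print(answer_flipped(before, after))               # True: mid-confidence switch
```

Under this toy setup, a moderate-confidence prediction flips to the rival-supported option, consistent with the abstract's claim that switches concentrate among choices with similar mid-level confidence, while a high-confidence case (top probability near 1) would typically keep its argmax.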
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 20763