Not All Answers Are Contextually Persuadable: Inference Dynamics in Large Language Models under Contextual Influence
Abstract: At the core of modern prompting techniques is contextual sensitivity, the ability of large language models to adapt their predictions based on inference-time context. Despite its central role, inference behavior under strong contextual influence remains poorly understood, particularly at the level of internal inference dynamics. To bridge this gap, we introduce a theoretical framework for analyzing contextual influence through inference dynamics, enabling quantitative characterization of inference behavior beyond output-level answer changes. Our analysis shows that inference dynamics do not exhibit unbounded drift under repeated contextual assertions. Instead, predictive representations converge to stable, query-dependent regimes that fundamentally constrain whether contextual signals can alter a model’s prediction. This leads to a surprising finding: Repeated contextual assertions do not act as accumulating evidence during inference and may therefore fail to alter a model’s prediction even under unbounded repetition, while in other cases a prediction change becomes inevitable. We empirically validate our theoretical predictions across diverse models and tasks, demonstrating strong alignment between theory and observed inference behavior. These contributions offer a principled pathway toward characterizing the limits of contextual influence during inference, and provide practical implications for designing and evaluating repetition-based prompting methods.
Lay Summary: Large language models can change their answers when a prompt repeats the same claim many times. This matters because repetition can be used to guide a model, but it can also make a model follow misleading information. A common assumption is that every repetition acts like another piece of evidence, so repeating a claim enough times should eventually force the model to accept it. We show that this assumption is wrong. We study a controlled setting where a question is followed by the same answer-like statement repeated many times, and we examine both the final answer and the hidden signals the model uses to form that answer. Our analysis shows that these hidden signals do not drift endlessly as repetition increases; instead, they settle into a stable limit that depends on the question and the model. This explains why some answers flip immediately, some flip only after many repetitions, and others do not flip even with unlimited repetition. We validate these predictions across multiple language models and question-answering benchmarks. Our results clarify when prompt repetition can influence model behavior and can help build more reliable evaluations and safer prompting strategies.
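The saturation described above can be illustrated with a toy calculation. The sketch below is not the paper's framework: it assumes a single softmax attention head with no positional effects, in which n identical repeated statements compete against a fixed amount of attention mass on everything else. The names `answer_margin`, `first_flip`, `base_margin`, `context_pull`, and `context_score` are hypothetical and exist only for this illustration. Because the attention weight on the repeated statements is bounded by 1, the margin between the original and the asserted answer converges to `base_margin + context_pull` instead of drifting without bound.

```python
import math


def answer_margin(n, base_margin, context_pull, context_score):
    """Toy margin (original answer minus asserted answer) after n repetitions.

    Assumption: one softmax attention head, n identical context tokens with
    attention score `context_score`, and all other tokens contributing a
    fixed unnormalized attention mass of 1.
    """
    mass = n * math.exp(context_score)
    weight = mass / (mass + 1.0)  # softmax weight on the repeated tokens, always < 1
    return base_margin + weight * context_pull


def first_flip(max_n, base_margin, context_pull, context_score):
    """Smallest n in 1..max_n at which the margin turns negative, else None."""
    for n in range(1, max_n + 1):
        if answer_margin(n, base_margin, context_pull, context_score) < 0:
            return n
    return None


# Three toy regimes, matching the qualitative cases in the summary.
immediate = first_flip(10_000, base_margin=1.0, context_pull=-3.0, context_score=0.0)
delayed = first_flip(10_000, base_margin=1.0, context_pull=-1.1, context_score=-2.0)
never = first_flip(10_000, base_margin=1.0, context_pull=-0.5, context_score=0.0)
print(immediate, delayed, never)
```

In this toy setting the sign of the limit `base_margin + context_pull` decides whether a flip is possible at all, and `context_score` only controls how many repetitions it takes. A real model has many layers, heads, and position-dependent effects, so this is only an intuition for why repetition need not accumulate like independent evidence.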
Primary Area: Deep Learning->Attention Mechanisms
Keywords: Large Language Models, Contextual Repetition, Convergence Analysis
Submission Number: 5846