Mitigating Conversational Inertia in Multi-Turn Agents through Context Bias Calibration

Submitted to ICLR 2026 on 18 Sept 2025 (modified: 11 Feb 2026). License: CC BY 4.0.
Keywords: LLM agent; few-shot learning; long context; multi-turn; reward-free
TL;DR: We identify conversational inertia as a key limitation of multi-turn agents and introduce the Context Bias Calibration framework to mitigate imitation bias and improve agentic performance.
Abstract: Large language models excel as few-shot learners when provided with appropriate demonstrations, yet this strength becomes problematic in multi-turn agent scenarios, where excessive mimicry of previous interactions undermines agentic exploration. We identify the root cause as conversational inertia: a phenomenon in which models exhibit strong diagonal attention to previous assistant responses, creating an imitation bias that constrains exploration. The effect already manifests at moderate context lengths (e.g., 4K tokens) and worsens as conversations grow, explaining why agent performance degrades well before the model's context limit is reached. Through attention analysis, we find that models increasingly focus on previous responses while attention to task instructions changes only marginally, disrupting the exploration-exploitation balance for agents. We propose Context Bias Calibration as a unified framework to mitigate this inertia. Our approach operates through two complementary mechanisms: a clip-context mechanism that periodically clears interaction history, and Context Preference Learning, which calibrates model preferences to favor responses generated with shorter contexts over those generated with longer contexts, using the model's own outputs and no environment rewards. Experimental results across eight diverse environments demonstrate that the Context Bias Calibration framework reduces conversational inertia and yields consistent performance improvements.
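The clip-context mechanism described in the abstract can be illustrated with a minimal sketch. The function below is hypothetical (the paper does not specify an interface or a clipping interval): it periodically drops accumulated user/assistant turns while preserving the task instructions, so the agent is not anchored to its own earlier responses.

```python
# Hypothetical sketch of the "clip context" idea: every `clip_every`
# turns, clear the interaction history but keep the task instructions.
# Function name, message schema, and interval are illustrative assumptions.

def clip_context(messages, turn_idx, clip_every=8):
    """Return the message list with history cleared every `clip_every`
    turns. The first message is assumed to hold the task instructions."""
    if turn_idx > 0 and turn_idx % clip_every == 0:
        # Keep only the task instructions; drop prior turns.
        return messages[:1]
    return messages


# Toy usage: simulate 16 agent turns with clipping every 8 turns.
history = [{"role": "system", "content": "Solve the task step by step."}]
for turn in range(1, 17):
    history = clip_context(history, turn, clip_every=8)
    # An agent would generate a real response here; we append a placeholder.
    history.append({"role": "assistant", "content": f"step {turn}"})

print(len(history))  # history was cleared at turns 8 and 16 → 2
```

In practice the clipping trigger could also be token-based rather than turn-based, since the paper observes inertia emerging around moderate context lengths (e.g., 4K tokens).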
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 10426