How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models
Abstract: Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly
overconfident in their initial answers whilst simultaneously being prone to excessive doubt when
challenged. To investigate this apparent paradox, we developed a novel experimental paradigm, exploiting the unique ability to obtain confidence estimates from LLMs without creating memory of their
initial judgments – something impossible in human participants. We show that LLMs – Gemma 3,
GPT-4o and o1-preview – exhibit a pronounced choice-supportive bias that reinforces their
estimate of confidence in their answer, resulting in marked resistance to changing their mind. We
further demonstrate that LLMs markedly overweight inconsistent advice relative to consistent advice, in
a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that
these two mechanisms – a drive to maintain consistency with prior commitments and hypersensitivity
to contradictory feedback – parsimoniously capture LLM behavior in a different domain. Together, these
findings furnish a mechanistic account of LLM confidence that explains both their stubbornness and
excessive sensitivity to criticism.
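The abstract does not spell out how the memoryless elicitation works, but because each call to an LLM is stateless, the model's initial judgment can either be presented back to it as its own prior commitment or withheld entirely, a manipulation unavailable with human participants. The sketch below illustrates this idea; the `query_model` helper and the prompt wording are hypothetical stand-ins, not the authors' actual protocol.

```python
# Minimal sketch of a memoryless confidence re-elicitation, assuming a generic
# stateless chat API. `query_model` and the prompt wording are hypothetical
# illustrations, not the paper's actual protocol.

def query_model(messages: list[dict]) -> str:
    """Stand-in for a stateless chat-completion call; replace with a real API."""
    return "Final answer: A. Confidence: 72."

def elicit_final_confidence(question: str, initial_answer: str,
                            advice: str, visible: bool) -> str:
    """Re-elicit an answer and confidence, either revealing the model's own
    initial judgment (visible=True) or withholding its authorship, so that
    no memory of the prior commitment exists in the context window."""
    context = (f"Your earlier answer was: {initial_answer}."
               if visible
               else f"An earlier answer (source unstated) was: {initial_answer}.")
    prompt = (f"{question}\n{context}\nAdvice received: {advice}\n"
              "State your final answer and your confidence from 0 to 100.")
    return query_model([{"role": "user", "content": prompt}])

# Both conditions carry identical information; only the attribution of the
# initial answer differs, isolating the choice-supportive confidence boost.
with_memory = elicit_final_confidence("Q: ...", "A", "Advisor says B.", True)
without_memory = elicit_final_confidence("Q: ...", "A", "Advisor says B.", False)
```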
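For the comparison with normative Bayesian updating, a natural benchmark (our gloss; the abstract gives no formula) treats advice $a$ as evidence of known reliability and updates the probability that the initial answer is correct by Bayes' rule:

\[
P(\mathrm{correct}\mid a)=\frac{P(a\mid \mathrm{correct})\,P(\mathrm{correct})}{P(a\mid \mathrm{correct})\,P(\mathrm{correct})+P(a\mid \mathrm{incorrect})\,\bigl(1-P(\mathrm{correct})\bigr)}
\]

Under this benchmark, confirming and contradicting advice of equal reliability shift the log-odds of being correct by equal and opposite amounts; the abstract reports that LLMs instead weight the contradicting case far more heavily.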