Keywords: LLM Calibration, Decision making, Overconfidence, In-context learning, LLM Agents, LLM self-knowledge, AI Safety
TL;DR: We find that LLMs are overconfident in predicting their success on tasks, but some learn from in-context experiences to make more risk-averse decisions about which tasks to attempt.
Abstract: We investigate whether large language models (LLMs) can predict whether they will succeed on a given task, and whether their predictions improve as they progress through multi-step tasks. We also investigate whether LLMs can learn from in-context experiences to make better decisions about whether to pursue a task in scenarios where failure is costly. All LLMs we tested are overconfident, but most have somewhat better-than-random discriminatory power at distinguishing tasks they can and cannot accomplish. On multi-step agentic tasks, the overconfidence of several frontier LLMs worsens as they progress through the tasks. With in-context experiences of failure, most LLMs only slightly reduce their overconfidence, though in a resource acquisition scenario several LLMs (Claude Sonnet models and GPT-4.5) improve their performance by increasing their risk aversion. These results suggest that current LLM agents are hindered by their lack of awareness of their own capabilities.
Submission Number: 144