Goal-Directedness is in the Eye of the Beholder

11 May 2025 (modified: 29 Oct 2025) | Submitted to NeurIPS 2025 Position Paper Track | Readers: Everyone | CC BY 4.0
Keywords: goal-directedness, intentionality, causal modeling, mechanistic interpretability
TL;DR: This paper argues that goal-directedness cannot be measured objectively and outlines new directions for modeling goal-directed behavior as emerging from dynamic interaction rather than from explicit goal representation.
Abstract: Our ability to predict the behavior of complex agents turns on the attribution of goals. Probing for goal-directed behavior comes in two flavors: behavioral and mechanistic. The former proposes that goal-directedness can be estimated through behavioral observation, whereas the latter attempts to probe for goals in internal model states. We work through the assumptions behind both approaches, identifying technical and conceptual problems that arise from formalizing goals in agent systems. We arrive at the perhaps surprising position that goal-directedness cannot be measured objectively. We outline new directions for modeling goal-directedness as an emergent property of dynamic, multi-agent systems.
Submission Number: 192