Goal-Directed Behaviour and its Implications for Superintelligence

Published: 01 Mar 2026, Last Modified: 03 Mar 2026, P-AGI, CC BY 4.0
Track: Track 2: Socio-Economical and Future Visions
Keywords: superintelligence, goal-directed behaviour
Abstract: Goal-directed behaviour is increasingly central to debates about advanced AI and prospective superintelligence: systems that robustly encode preferred states and pursue them across contexts can become difficult to predict, intervene on, and govern. This paper develops an informational-functional framework for analysing goal-directedness across a continuum from simple dynamical systems to biological organisms and contemporary AI (including large language models and embodied agents). We propose operational criteria and intervention-based tests for distinguishing mere appearance of goals from mechanisms that genuinely store and stabilise goal information, and we argue that these distinctions are directly relevant to safety: they clarify when agentive descriptions are warranted, what it would take to build trustworthy agents, and where responsibility for outcomes should lie in sociotechnical systems. We conclude with a research programme aimed at measuring, locating, and controlling goal representations in increasingly capable AI systems.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 41