Evaluating Self-Orienting in Language and Reasoning Models

Published: 10 Jun 2025, Last Modified: 14 Jul 2025 · ICML 2025 World Models Workshop · CC BY 4.0
Keywords: Self-Representation, Large Language Models, Computational Cognitive Science
Abstract: We present a novel evaluation approach, grounded in cognitive science research, that studies an agent's ability to self-orient, i.e., to identify which problem it is solving and which agent it is in the environment. Our task involves a grid-world in which the agent must navigate to a goal but has no prior knowledge of the world, including which entity it controls. Humans solve this task in two steps: first figuring out which agent they control (in other words, self-orienting), and then navigating to the goal. We ask whether LLMs can do the same. We find that a state-of-the-art LLM (GPT-4o) can self-orient efficiently with near-optimal performance, but that this ability disappears with in-context reasoning (OpenAI o4-mini). However, it reemerges in reasoning models trained with more advanced methods, such as backtracking (o3).
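The task described in the abstract is straightforward to picture as code. Below is a minimal, hypothetical sketch of such an environment, written as an illustration of the setup rather than the authors' actual benchmark; the class name `SelfOrientGridWorld`, its methods, and all parameters are our assumptions. The key property is that the agent observes several candidate entities but receives no label for the one it controls, so self-orienting reduces to issuing a probe move and observing which entity responds.

```python
import random

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

class SelfOrientGridWorld:
    """Illustrative grid-world: one of several entities is secretly
    controlled by the agent, which must self-orient before navigating
    to the goal. A sketch under assumed details, not the paper's code."""

    def __init__(self, size=5, n_entities=3, seed=0):
        rng = random.Random(seed)
        self.size = size
        cells = [(x, y) for x in range(size) for y in range(size)]
        picks = rng.sample(cells, n_entities + 1)
        self.goal = picks[0]
        self.entities = picks[1:]                    # candidate positions
        self.controlled = rng.randrange(n_entities)  # hidden from the agent

    def observe(self):
        # The agent sees all entity positions and the goal, but no
        # indication of which entity it controls.
        return {"entities": list(self.entities), "goal": self.goal}

    def step(self, move):
        # Only the controlled entity responds; a move that would leave
        # the grid is a no-op (itself informative for self-orienting).
        dx, dy = MOVES[move]
        x, y = self.entities[self.controlled]
        nx, ny = x + dx, y + dy
        if 0 <= nx < self.size and 0 <= ny < self.size:
            self.entities[self.controlled] = (nx, ny)
        done = self.entities[self.controlled] == self.goal
        return self.observe(), done
```

In this sketch, the two-step human strategy the abstract describes corresponds to a probe followed by navigation:

```python
env = SelfOrientGridWorld()
before = env.observe()["entities"]
after, _ = env.step("right")  # probe move to self-orient
moved = [i for i, (a, b) in enumerate(zip(before, after["entities"])) if a != b]
# A non-empty `moved` reveals which entity the agent controls;
# navigation to the goal can then proceed, e.g., greedily.
```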
Submission Number: 30