HSRL: Hierarchical Spatial Reasoning with Large Language Model

06 Sept 2025 (modified: 05 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Large Language Model, Spatial Reasoning, Hierarchical Reasoning
Abstract: Large language models (LLMs) have shown remarkable proficiency in general language understanding and reasoning. However, they consistently underperform in spatial reasoning, a crucial cognitive skill; this deficiency severely limits their application, particularly in embodied intelligence. Inspired by the success of hierarchical learning in reinforcement learning, this paper introduces a novel method for hierarchical task decomposition in LLM spatial reasoning. Our approach leverages LLMs to break complex spatial tasks down at both the state and environment levels into more manageable sub-tasks. Specifically, we guide the LLM to identify a few key intermediate states, which are then used to generate simplified sub-environments between these key intermediate states. However, we observed that because LLMs are not pre-trained for spatial reasoning, they struggle to make optimal decisions during this decomposition process. To address this limitation and enhance planning capability, we propose a novel algorithm: MCTS-Guided Group Relative Policy Optimization (M-GRPO). This algorithm integrates an MCTS-inspired exploration process with a modified, more fine-grained advantage function, enabling the model to learn optimal path planning. Experimental results demonstrate that our method substantially improves LLM performance on spatial tasks, including navigation, planning, and strategic games, achieving state-of-the-art results. This work paves the way for LLMs in real-world applications.
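The abstract's M-GRPO builds on the group-relative advantage idea from GRPO. As a point of reference only, the sketch below shows the standard group-relative normalization that such a method would start from; the MCTS-guided exploration and the paper's fine-grained advantage modification are not specified here, and the function name and signature are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a standard group-relative advantage computation
# (GRPO-style baseline). NOTE: this is NOT the paper's M-GRPO; M-GRPO adds
# MCTS-guided exploration and a finer-grained advantage whose details are
# not given in the abstract.
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each sampled rollout's reward against its own group:
    advantage_i = (r_i - mean(group)) / (std(group) + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the sampled group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: rewards from a group of sampled plans for one spatial task.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Normalizing within the group means no learned value function is needed: plans scoring above the group mean receive positive advantage, and those below receive negative advantage.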
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 2527