Evaluating LLM Planning in Partially Observable Environments via Observation Representations and Action Sequences

Published: 23 Sept 2025, Last Modified: 22 Nov 2025 · LAW · CC BY 4.0
Keywords: Large Language Models, Sequential Decision Making, LLM-as-agents
Abstract: Recent evaluation of large language models (LLMs) has increasingly shifted from static, single-turn benchmarks to interactive environments that demand sequential decision-making, long-term planning, and adaptation. LLMs acting as agents show strong potential in these settings, leveraging broad pretraining for generalizable planning and offering more interpretability than traditional reinforcement learning methods. However, their core reasoning abilities remain contested, with evidence of limitations in logical consistency and a tendency toward pattern matching over causal inference. To probe these challenges, we study LLM planning in partially observable environments that require reasoning under uncertainty. We propose two strategies to assess and enhance their capabilities: (i) evaluating three observation representations (natural language, structured symbolic, and a hybrid format that combines both); and (ii) prompting LLMs to generate extended action sequences per decision step to exploit their long-horizon planning capacity. These approaches aim to clarify the extent to which LLMs can reason, plan, and act effectively under partial observability. Our code is available at: https://anonymous.4open.science/r/llm-planning-po-ED74/
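
To make the two strategies concrete, the minimal Python sketch below (all helper names and the toy grid environment are hypothetical illustrations, not code from the paper's benchmark) renders one partially observed state in the three representation formats and builds a prompt that requests an extended action sequence instead of a single next action:

    # Minimal sketch (hypothetical names): one partially observed state
    # rendered in the three formats the paper compares, plus a prompt
    # that asks for several actions per decision step.
    import json

    def to_natural_language(obs: dict) -> str:
        """Describe the visible state in plain English."""
        seen = ", ".join(obs["visible_objects"]) or "nothing"
        frontier = ", ".join(obs["frontier"])
        return (f"You are at {obs['agent_pos']}. You can see: {seen}. "
                f"Unexplored cells remain to the {frontier}.")

    def to_symbolic(obs: dict) -> str:
        """Serialize the same state as a structured symbolic record."""
        return json.dumps(obs, sort_keys=True)

    def to_hybrid(obs: dict) -> str:
        """Hybrid format: symbolic record plus a natural-language gloss."""
        return to_symbolic(obs) + "\n# " + to_natural_language(obs)

    def build_prompt(obs_text: str, horizon: int = 5) -> str:
        """Request `horizon` actions per step to exploit long-horizon
        planning, rather than querying one action at a time."""
        return ("Observation (partial; unseen cells may hide objects):\n"
                f"{obs_text}\n"
                f"Plan the next {horizon} actions as a comma-separated "
                "list from {up, down, left, right, pick}.")

    obs = {"agent_pos": [2, 3],
           "visible_objects": ["key at (2,5)"],
           "frontier": ["north", "east"]}
    print(build_prompt(to_hybrid(obs), horizon=5))

Here the hybrid format simply concatenates the symbolic record with its natural-language gloss; the paper's actual formats and action space may differ.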
Submission Type: Research Paper (4-9 Pages)
Submission Number: 33