Unifying Inference-Time Planning Language Generation

ACL ARR 2026 January Submission 8061 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, Readers: Everyone, License: CC BY 4.0

Keywords: applications, pddl, planning, program synthesis, LLM, Agents, neurosymbolic approaches, scaling, robustness

Abstract: A line of work in planning uses LLMs not to generate a plan directly, but to generate a formal representation in a planning language, which is then passed to a symbolic solver that deterministically finds a plan. While this approach has shown improved trust and promising performance, dozens of recent publications have proposed scattered methods on a variety of benchmarks under different experimental settings. We attempt to unify the inference-time LLM-as-formalizer methodology for classical planning by proposing a framework based on intermediate representations. Within this framework, we systematically evaluate more than a dozen pipelines that subsume most existing work, and we propose novel ones that involve syntactically similar but high-resource intermediate languages (such as a Python wrapper of PDDL). We provide recipes for planning language generation pipelines, draw a series of conclusions about the efficacy of their various components, and provide evidence of their robustness to problem complexity.
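
To make the idea of a high-resource intermediate language concrete, the sketch below shows one way a Python wrapper of PDDL might look. It is an illustration only and is not taken from the submission: the `Problem` class, the `atom` and `to_pddl` helpers, and the blocksworld example are all hypothetical names chosen here. The assumption is that an LLM would emit the Python object, which is then rendered deterministically into PDDL text for a symbolic solver.

```python
from dataclasses import dataclass, field


@dataclass
class Problem:
    """Hypothetical Python-side intermediate representation of a PDDL problem."""
    name: str
    domain: str
    objects: list[str] = field(default_factory=list)
    init: list[tuple[str, ...]] = field(default_factory=list)
    goal: list[tuple[str, ...]] = field(default_factory=list)


def atom(fact: tuple[str, ...]) -> str:
    """Render a fact such as ("on", "a", "b") as the PDDL atom "(on a b)"."""
    return "(" + " ".join(fact) + ")"


def to_pddl(p: Problem) -> str:
    """Deterministically translate the intermediate representation to PDDL text."""
    lines = [
        f"(define (problem {p.name})",
        f"  (:domain {p.domain})",
        f"  (:objects {' '.join(p.objects)})",
        f"  (:init {' '.join(atom(f) for f in p.init)})",
        f"  (:goal (and {' '.join(atom(f) for f in p.goal)}))",
        ")",
    ]
    return "\n".join(lines)


# Illustrative instance (assumed, not from the paper): stack block a on block b.
problem = Problem(
    name="stack-two",
    domain="blocksworld",
    objects=["a", "b"],
    init=[("ontable", "a"), ("ontable", "b"), ("clear", "a"), ("clear", "b"), ("handempty",)],
    goal=[("on", "a", "b")],
)
pddl_text = to_pddl(problem)
```

In a sketch like this, the model only has to produce well-formed Python, while the translation to PDDL is rule-based; the resulting `pddl_text` would then be handed, together with a domain file, to an off-the-shelf planner.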

Paper Type: Long

Research Area: Low-resource Methods for NLP

Research Area Keywords: NLP Applications, Mathematical, Symbolic, Neurosymbolic, and Logical Reasoning, Low-resource Methods for NLP, Generalizability and Transfer

Contribution Types: Approaches to low-resource settings, Data resources

Languages Studied: English, PDDL

Submission Number: 8061
