A Requirements Engineering-Driven Methodology for Planning Domain Generation via LLMs with Invariant-Based Refinement

Angelo Casciani; Giuseppe De Giacomo; Andrea Marrella; Christoph Weinhuber

Published: 03 Sept 2025, Last Modified: 03 Sept 2025. Venue: LM4Plan. License: CC BY 4.0

Keywords: Automated Planning, PDDL, Large Language Model, Requirements Engineering, State Invariant

TL;DR: This paper presents a structured LLM-driven methodology for generating common-sense-aware PDDL planning domains from natural language scenarios and user stories, grounded in Requirements Engineering principles.

Availability: I can only present remotely

Abstract: The creation of planning models remains a labor-intensive and error-prone bottleneck in the deployment of Automated Planning solutions in real-world environments. Recent approaches have explored the use of Large Language Models (LLMs) as modelers to generate PDDL specifications from natural language. However, most rely on single-shot generation or limited feedback loops, and often fail to enforce semantic correctness and common-sense coherence. In this paper, we present a structured, multi-stage methodology for generating PDDL models, inspired by principles from Requirements Engineering. Our approach leverages scenarios and user stories to elicit the domain and the initial state, which are then incrementally refined via syntax validation, state progression through plan execution, and invariant analysis to detect and correct modeling flaws. The method's novelty lies in integrating automated invariant extraction with LLM-guided model refinement, which significantly improves the semantic robustness and realism of the generated domains. We assume a human agent in the environment, which lets us enforce real-world constraints such as limited hand capacity, and we evaluate our methodology on a synthetic dataset of room environments, using both expert judgment and an LLM-as-a-judge approach. Results show that the generated planning domains are syntactically correct, semantically sound, and actionable, enabling real-world deployment on physical agents.
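To make the idea of checking an invariant during state progression concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a STRIPS-like action encoding in plain Python; the names `Action`, `progress`, `first_violation`, `flawed_pick`, and `fixed_pick` are hypothetical, and the hand-capacity invariant is written by hand here, whereas the abstract describes automated invariant extraction and LLM-guided refinement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical STRIPS-like ground action (illustration only)."""
    name: str
    pre: frozenset     # facts that must hold before execution
    add: frozenset     # facts made true by the action
    delete: frozenset  # facts made false by the action


def progress(state, action):
    """Apply one action to a state, failing if a precondition is missing."""
    if not action.pre <= state:
        raise ValueError(f"{action.name}: preconditions not satisfied")
    return (state - action.delete) | action.add


def holds_at_most(state, limit):
    """Hand-capacity invariant: at most `limit` objects are held."""
    return sum(1 for fact in state if fact[0] == "holding") <= limit


def first_violation(initial, plan, limit):
    """Execute the plan; return the index of the first step whose
    resulting state breaks the invariant, or None if it always holds."""
    state = frozenset(initial)
    for step, action in enumerate(plan):
        state = progress(state, action)
        if not holds_at_most(state, limit):
            return step
    return None


def flawed_pick(obj):
    # Modeling flaw: nothing prevents picking while the hand is full.
    return Action(f"pick-{obj}",
                  pre=frozenset({("on-table", obj)}),
                  add=frozenset({("holding", obj)}),
                  delete=frozenset({("on-table", obj)}))


def fixed_pick(obj):
    # Refined model: picking requires and consumes an empty hand.
    return Action(f"pick-{obj}",
                  pre=frozenset({("on-table", obj), ("hand-empty",)}),
                  add=frozenset({("holding", obj)}),
                  delete=frozenset({("on-table", obj), ("hand-empty",)}))


initial = {("on-table", "cup"), ("on-table", "book"), ("hand-empty",)}
```

With a capacity of one, executing `flawed_pick("cup")` followed by `flawed_pick("book")` reaches a state holding two objects, so the invariant check flags the second step. With `fixed_pick`, the second pick is rejected by its precondition instead, which is the kind of correction a refinement step would aim for.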

Submission Number: 3
