Keywords: Action Model Learning, Neuro-symbolic AI, Computer Vision
TL;DR: We extract action models for planning domains from visual traces via probabilistic neuro-symbolic learning.
Abstract: Model-based planners rely on action models to describe available actions in terms of their preconditions and effects. Nonetheless, manually encoding such models is challenging, especially in complex domains. Numerous methods have been proposed to learn action models from examples of plan execution traces. However, high-level information, such as state labels within traces, is often unavailable and needs to be inferred indirectly from raw observations. In this paper, we aim to learn lifted action models from visual traces --- sequences of image-action pairs depicting discrete successive trace steps. We present ROSAME, a differentiable neu$\textbf{RO}$-$\textbf{S}$ymbolic $\textbf{A}$ction $\textbf{M}$odel l$\textbf{E}$arner that infers action models from traces consisting of probabilistic state predictions and actions. By combining ROSAME with a deep learning computer vision model, we create an end-to-end framework that jointly learns state predictions from images and infers symbolic action models. Experimental results demonstrate that our method succeeds in both tasks, using different visual state representations, with the learned action models often matching or even surpassing those created by humans.
Primary Keywords: Learning
Category: Long
Student: Graduate
Supplementary Material: pdf
Submission Number: 41