Structured Exploration in Reinforcement Learning by Hypothesizing Linear Temporal Logic Formulas

Published: 24 Oct 2024 · Last Modified: 27 Nov 2024 · LEAP 2024 Poster · CC BY 4.0
Keywords: LTL, Reinforcement Learning
TL;DR: We tackle the problem of long-horizon reinforcement learning through structured, hierarchical exploration in the linear temporal logic task space.
Abstract: Exploration in vast domains is a core challenge in reinforcement learning (RL). Existing methods commonly explore by adding noise to the learning process, but they do not scale to complex, long-horizon problems. Goal-based exploration is a promising alternative, but it requires useful goals. We propose an approach that structures an agent's exploration by constraining the goal space to tasks that can be expressed using a particular formal language: linear temporal logic (LTL). Our agent proposes LTL expressions that it conjectures to be achievable and desirable for maximizing its learning progress in the environment. Upon proposing an LTL expression, the agent uses a combination of planning and goal-conditioned RL to solve the task described by that expression. The result is a structured exploration process that learns about the environment by hypothesizing various logical and sequential compositions of atomic goals. We demonstrate the performance of our algorithm in two challenging sparse-reward problems.
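
The abstract outlines an exploration loop: conjecture LTL formulas scored by expected learning progress, then solve each one with planning plus goal-conditioned RL. Below is a minimal, hypothetical sketch of such a loop in Python, not the authors' actual method or API: the atomic goals, the learning-progress score, and all names (`Hypothesis`, `propose_formula`, `solve_subgoal`) are illustrative assumptions, the hypothesis space is simplified to sequential "eventually" compositions such as F(a & F(b)), and the goal-conditioned policy is stubbed out.

```python
# Hypothetical sketch of a structured LTL-exploration loop.
# All names and scores are illustrative assumptions, not the paper's API.
import random
from dataclasses import dataclass

ATOMIC_GOALS = ["reach_key", "open_door", "reach_chest"]  # assumed atoms


@dataclass
class Hypothesis:
    subgoals: list        # ordered atoms; stands in for F(g1 & F(g2 & ...))
    attempts: int = 0
    successes: int = 0

    def learning_progress(self):
        # Crude stand-in for an achievability/desirability score:
        # prefer formulas that are sometimes, but not always, solved.
        if self.attempts == 0:
            return 1.0                       # optimistic for novel formulas
        rate = self.successes / self.attempts
        return rate * (1.0 - rate) + 0.05    # peaks at intermediate success


def propose_formula(max_len=3):
    """Conjecture a sequential LTL task over the atomic goals."""
    k = random.randint(1, max_len)
    return Hypothesis(subgoals=random.sample(ATOMIC_GOALS, k))


def solve_subgoal(goal):
    """Placeholder for the goal-conditioned RL policy; here a coin flip."""
    return random.random() < 0.5


def explore(num_rounds=20):
    hypotheses = [propose_formula() for _ in range(5)]
    for _ in range(num_rounds):
        # Plan: pick the formula with the highest learning-progress score.
        h = max(hypotheses, key=Hypothesis.learning_progress)
        h.attempts += 1
        # Execute: satisfy the formula by achieving subgoals in order.
        if all(solve_subgoal(g) for g in h.subgoals):
            h.successes += 1
        hypotheses.append(propose_formula())  # grow the hypothesis space


if __name__ == "__main__":
    explore()
```

The inverted-U learning-progress score is one common heuristic for automatic curricula; the paper's actual criterion for which conjectured formulas are "achievable and desirable" may differ.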
Submission Number: 55