Offline Reinforcement Learning with Bayesian Flow Networks

TMLR Paper3003 Authors

14 Jul 2024 (modified: 17 Sept 2024) · Withdrawn by Authors · CC BY 4.0
Abstract: This paper presents a novel approach to reinforcement learning (RL) that uses Bayesian flow networks for sequence generation, enabling effective planning in both discrete and continuous domains by conditioning on returns and current states. We explore two conditioning strategies: state inpainting and a classifier-free method. Experimental results demonstrate the robustness of our method across various environments: in discrete settings it reliably navigates gridworld environments, while matching current state-of-the-art performance on continuous tasks. The results highlight our approach's ability to capture spatial and temporal dependencies through a specialized neural network architecture that combines 2D convolutions with a temporal U-Net.
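The two conditioning strategies named in the abstract are standard techniques from conditional sequence generation. A minimal sketch of both, assuming the common formulations (the paper's exact equations and function names are not given here, so everything below is illustrative):

```python
import numpy as np

def classifier_free_guidance(pred_cond, pred_uncond, guidance_weight=2.0):
    """Classifier-free conditioning: extrapolate the conditional model
    output away from the unconditional one by `guidance_weight`.
    A weight of 0 ignores the condition; larger weights strengthen it.
    """
    return pred_uncond + guidance_weight * (pred_cond - pred_uncond)

def inpaint_state(trajectory, current_state):
    """State inpainting: clamp the first timestep of a sampled
    trajectory to the observed current state, so every generated
    plan starts from where the agent actually is.
    """
    trajectory = trajectory.copy()
    trajectory[0] = current_state
    return trajectory

# Toy usage: a 4-step trajectory of 2-D states.
traj = inpaint_state(np.zeros((4, 2)), np.array([1.0, -1.0]))
guided = classifier_free_guidance(np.ones((4, 2)), np.zeros((4, 2)))
```

In a planner of this kind, the guided (or inpainted) trajectory would be fed back into the iterative sampling loop, and the first action of the final sampled plan is executed in the environment.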
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Andrew_Kyle_Lampinen1
Submission Number: 3003