Planning with Consistency Models for Model-Based Offline Reinforcement Learning

Guanquan Wang; Takuya Hiraoka; Yoshimasa Tsuruoka

Published: 01 Jun 2024, Last Modified: 07 Aug 2024, Venue: Deployable RL @ RLC 2024, License: CC BY 4.0

Keywords: consistency models, trajectory generation, diffusion model

Abstract: This paper introduces consistency models to the problem of sequential decision-making. Previous work that applies diffusion models to planning within a model-based reinforcement learning framework often incurs high computational cost at inference time, because it relies on an iterative reverse diffusion process. Consistency models, known for their computational efficiency, have already shown promise in reinforcement learning within actor-critic algorithms. We therefore combine guided consistency distillation with a continuous-time diffusion model in the framework of Decision Diffuser. Our approach, named Consistency Planning, combines the robust planning capabilities of diffusion models with the speed of consistency models. We validate our method on Gym tasks in the D4RL framework, demonstrating that, compared with its diffusion model counterparts, it achieves a more than 12-fold increase in speed without any loss in performance.
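To illustrate why iterative reverse diffusion is costly at inference time, the following toy sketch counts network evaluations for a multi-step sampler versus a single-step, consistency-style sampler. It is not the paper's implementation: `CountingNet`, `iterative_sample`, `one_step_sample`, the toy update rule, and the step count of 20 are all hypothetical choices made only for illustration.

```python
import random


class CountingNet:
    """Hypothetical stand-in for a trained network; counts its evaluations."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, t):
        self.calls += 1
        # Toy update (an assumption, not a real denoiser): shrink the sample.
        return [xi / (1.0 + t) for xi in x]


def iterative_sample(net, x, num_steps):
    """Diffusion-style sampling: one network evaluation per reverse step."""
    for i in range(num_steps, 0, -1):
        t = i / num_steps
        x = net(x, t)
    return x


def one_step_sample(net, x, t_max=1.0):
    """Consistency-style sampling: map noise to a sample in one evaluation."""
    return net(x, t_max)


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(8)]  # toy "trajectory" of length 8

diffusion_net = CountingNet()
consistency_net = CountingNet()

plan_diffusion = iterative_sample(diffusion_net, noise, num_steps=20)
plan_consistency = one_step_sample(consistency_net, noise)

print("diffusion-style evaluations:", diffusion_net.calls)
print("consistency-style evaluations:", consistency_net.calls)
```

In this sketch the inference cost scales with the number of network evaluations, which is the quantity a consistency model reduces. The speedup reported in the abstract comes from the authors' experiments, not from this toy comparison.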

Submission Number: 5
