Planning with Consistency Models for Model-Based Offline Reinforcement Learning

TMLR Paper3315 Authors

09 Sept 2024 (modified: 06 Nov 2024) · Decision pending for TMLR · CC BY 4.0
Abstract: This paper introduces consistency models to the problem of sequential decision-making. Previous work applying diffusion models to planning within a model-based reinforcement learning framework often struggles with high computational cost at inference time, due to its reliance on iterative reverse diffusion. Consistency models, known for their computational efficiency, have already shown promise in reinforcement learning within actor-critic algorithms. We therefore combine guided consistency distillation with a continuous-time diffusion model in the Decision Diffuser framework. Our approach, named Consistency Planning, combines the robust planning capabilities of diffusion models with the speed of consistency models. We validate our method on gym tasks in the D4RL benchmark, demonstrating that, compared with its diffusion-model counterparts, it achieves a more than 12-fold speedup without any loss in performance.
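The speedup claimed above stems from replacing many reverse-diffusion steps with a single consistency-model evaluation. The toy sketch below only illustrates that difference in the number of model evaluations; the function names (`denoise_step`, `consistency_plan`) and the placeholder denoising rule are hypothetical and not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
horizon, dim = 32, 4  # a planned trajectory: horizon x state dimension

def denoise_step(x, t):
    """One reverse-diffusion step (placeholder: linear shrink toward zero)."""
    return x * (1.0 - 1.0 / t)

def diffusion_plan(x_T, n_steps=100):
    """Iterative sampling: one network evaluation per step."""
    x, calls = x_T, 0
    for t in range(n_steps, 0, -1):
        x = denoise_step(x, t)
        calls += 1
    return x, calls

def consistency_plan(x_T):
    """Consistency sampling: a single evaluation maps noise to a plan."""
    return np.zeros_like(x_T), 1  # distilled one-step map; placeholder output

x_T = rng.normal(size=(horizon, dim))   # start from Gaussian noise
_, diff_calls = diffusion_plan(x_T, n_steps=100)
_, cons_calls = consistency_plan(x_T)
print(diff_calls, cons_calls)  # 100 evaluations vs. 1
```

With 100 diffusion steps the iterative planner performs 100 model evaluations per plan, while the distilled consistency sampler performs one, which is the source of the order-of-magnitude inference speedup the abstract reports.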
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Manuel_Haussmann1
Submission Number: 3315