Planning with Consistency Models for Model-Based Offline Reinforcement Learning

TMLR Paper3315 Authors

09 Sept 2024 (modified: 06 Nov 2024) · Decision pending for TMLR · CC BY 4.0
Abstract: This paper introduces consistency models to the problem of sequential decision-making. Previous work applying diffusion models to planning within a model-based reinforcement learning framework often struggles with high computational cost at inference time, due to its reliance on iterative reverse diffusion. Consistency models, known for their computational efficiency, have already shown promise in reinforcement learning within actor-critic algorithms. We therefore combine guided consistency distillation with a continuous-time diffusion model in the Decision Diffuser framework. Our approach, named Consistency Planning, combines the robust planning capabilities of diffusion models with the speed of consistency models. We validate our method on gym tasks in the D4RL benchmark, demonstrating that, compared with its diffusion-model counterparts, it achieves a more than 12-fold speedup without any loss in performance.
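The speedup claimed above stems from replacing many reverse-diffusion steps with a single consistency-model evaluation. The toy sketch below only illustrates that difference in the number of model evaluations; the function names (`denoise_step`, `consistency_plan`) and the placeholder denoising rule are hypothetical and not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
horizon, dim = 32, 4  # a planned trajectory: horizon x state dimension

def denoise_step(x, t):
    """One reverse-diffusion step (placeholder: linear shrink toward zero)."""
    return x * (1.0 - 1.0 / t)

def diffusion_plan(x_T, n_steps=100):
    """Iterative sampling: one network evaluation per step."""
    x, calls = x_T, 0
    for t in range(n_steps, 0, -1):
        x = denoise_step(x, t)
        calls += 1
    return x, calls

def consistency_plan(x_T):
    """Consistency sampling: a single evaluation maps noise to a plan."""
    return np.zeros_like(x_T), 1  # distilled one-step map; placeholder output

x_T = rng.normal(size=(horizon, dim))   # start from Gaussian noise
_, diff_calls = diffusion_plan(x_T, n_steps=100)
_, cons_calls = consistency_plan(x_T)
print(diff_calls, cons_calls)  # 100 evaluations vs. 1
```

With 100 diffusion steps the iterative planner performs 100 model evaluations per plan, while the distilled consistency sampler performs one, which is the source of the order-of-magnitude inference speedup the abstract reports.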
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Manuel_Haussmann1
Submission Number: 3315