Keywords: Diffusion, MCTS, Long-term Planning, Offline RL, Goal-conditioned RL, Stitching sub-plans
TL;DR: C-MCTD enables diffusion planners to generate plans 10× longer than training examples by systematically stitching together shorter plans through tree search.
Abstract: Monte Carlo Tree Diffusion (MCTD) integrates diffusion models with structured tree search to enable effective trajectory exploration through stepwise reasoning. However, MCTD remains fundamentally limited by training trajectory lengths. While periodic replanning allows plan concatenation for longer plan generation, the planning process remains locally confined, as MCTD searches within individual trajectories without access to global context. We propose Compositional Monte Carlo Tree Diffusion (C-MCTD), a framework that elevates planning from individual trajectory optimization to reasoning over complete plan compositions. C-MCTD introduces three complementary components: (1) Online Composer, which performs globally-aware planning by searching across entire plan compositions; (2) Distributed Composer, which reduces search complexity through parallel exploration from multiple starting points; and (3) Preplan Composer, which accelerates inference by leveraging cached plan graphs.
Supplementary Material: zip
Primary Area: Reinforcement learning (e.g., decision and control, planning, hierarchical RL, robotics)
Submission Number: 19341
Loading