Abstract: Diffusion models have recently gained significant
attention in robotics due to their ability to generate multimodal distributions of system states and behaviors. However,
a key challenge remains: ensuring precise control over the
generated outcomes without compromising realism. This is
crucial for applications such as motion planning or trajectory
forecasting, where adherence to physical constraints and task-specific objectives is essential. We propose a novel framework
that enhances controllability in diffusion models by leveraging
multimodal prior distributions and enforcing strong modal
coupling. This allows us to initiate the denoising process
directly from distinct prior modes that correspond to different
possible system behaviors, ensuring that samples remain aligned with the
training distribution. We evaluate our approach on motion
prediction using the Waymo dataset and multi-task control
in Maze2D environments. Experimental results show that our
framework outperforms both guidance-based techniques and
conditioned models with unimodal priors, achieving superior
fidelity, diversity, and controllability, even in the absence of
explicit conditioning. Overall, our approach provides a more
reliable and scalable solution for controllable motion generation
in robotics.
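
As a rough illustration of the core idea (not the paper's actual implementation), the sketch below initializes the reverse diffusion process from a chosen mode of a Gaussian-mixture prior rather than from a standard normal, so that the chosen mode steers which behavior is generated. All names here (TinyDenoiser, denoise_from_mode, the schedule values, and the mode parameters) are hypothetical placeholders.

```python
import torch

# Hypothetical sketch: start DDPM-style denoising from one mode of a
# Gaussian-mixture prior instead of an isotropic standard normal.
# The network, noise schedule, and mode parameters are illustrative
# stand-ins, not the authors' actual method.

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class TinyDenoiser(torch.nn.Module):
    """Toy noise-prediction network (stand-in for the real model)."""
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim + 1, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, dim),
        )

    def forward(self, x, t):
        t_feat = t.float().unsqueeze(-1) / T  # normalized timestep feature
        return self.net(torch.cat([x, t_feat], dim=-1))

@torch.no_grad()
def denoise_from_mode(model, mode_mean, mode_std, n, dim):
    """Draw x_T from a chosen prior mode, then run the reverse process."""
    x = mode_mean + mode_std * torch.randn(n, dim)  # mode-specific init
    for t in reversed(range(T)):
        t_batch = torch.full((n,), t, dtype=torch.long)
        eps = model(x, t_batch)
        # Standard DDPM posterior mean for x_{t-1}
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x

# Usage: pick the prior mode associated with a desired behavior,
# e.g. a "turn left" trajectory mode in a 2D toy state space.
model = TinyDenoiser(dim=2)
left_mode = torch.tensor([-2.0, 0.0])
samples = denoise_from_mode(model, left_mode, mode_std=1.0, n=16, dim=2)
```

Under this reading, controllability comes from selecting the prior mode at sampling time, while realism is preserved because each mode is coupled to the corresponding region of the training distribution during training.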