Motion-R1: Latent-Intent Motion Generation with Physical Consistency

ICLR 2026 Conference Submission 12756 Authors

18 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Robot Motion Primitive Generation
TL;DR: Motion-R1 uses RL to generate human-aligned motions with improved reasoning, leveraging a new dataset and JS-divergence optimization.
Abstract: Human motion synthesis is a foundational component of computer graphics, embodied AI, and robotics. Although recent progress has elevated motion quality and physical plausibility, prevailing methods remain constrained by their reliance on explicit, hand-crafted control cues. More importantly, they rarely exhibit the capacity to infer users' implicit intentions, posing a major barrier to human-aligned motion generation. Inspired by DeepSeek-R1's success in eliciting reasoning abilities through rule-based reinforcement learning (RL), we propose Motion-R1 as the first attempt to explore the R1 paradigm for physically consistent latent-intent motion generation. However, naïvely adopting Group Relative Policy Optimization (GRPO) for motion synthesis encounters two limitations: (1) the scarcity of motion-reasoning datasets, and (2) a lack of motion reasoning abilities. To address these issues, we first construct a newly curated Motion2Motion benchmark dataset comprising text-to-motion dialogues for RL training. Further, our proposed Motion-R1 integrates JS-divergence-constrained policy optimization, achieving improved reasoning capabilities on both motion generation and mathematical computation benchmarks. In addition, we employ a low-level RL-based optimization strategy to enforce strict adherence to kinematic constraints. Experimental results show that Motion-R1 delivers contextually appropriate, lifelike motions and surpasses strong baselines in both accuracy and interpretability.
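The abstract names a JS-divergence-constrained policy optimization but does not give the objective. A minimal sketch of the general idea is below, assuming a GRPO-style importance-weighted surrogate penalized by the Jensen-Shannon divergence between the current and a reference policy; the penalty form, its placement, and the `beta` weight are illustrative assumptions, not the paper's actual loss.

```python
import math

def kl(p, q):
    # KL(p || q) for discrete distributions on a shared support.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # Jensen-Shannon divergence: symmetric and bounded by log 2,
    # which makes it a gentler constraint than raw KL.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def grpo_js_loss(logp_new, logp_old, advantage, p_new, p_ref, beta=0.1):
    # Hypothetical GRPO-style surrogate: importance-weighted group-relative
    # advantage, minus a JS penalty tying the policy to a reference.
    ratio = math.exp(logp_new - logp_old)
    return -(ratio * advantage - beta * js_divergence(p_new, p_ref))

# Identical policies incur zero penalty; disjoint ones incur the maximum log 2.
print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # → 0.0
print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # → 0.6931... (log 2)
```

The bounded range of the JS divergence is the usual motivation for preferring it over a one-sided KL term as a trust-region-style regularizer.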
Primary Area: reinforcement learning
Submission Number: 12756