Keywords: Text-to-motion generation, Humanoid whole-body control, Reinforcement learning fine-tuning of LLMs
Abstract: This paper addresses a critical challenge in robotics: translating human motions generated from text descriptions into executable actions for humanoid robots.
While existing text-to-motion generation methods achieve semantic alignment between language and motion, they often produce kinematically or physically infeasible motions unsuitable for real-world deployment.
To bridge the gap between motion generation and humanoid execution, we propose \textbf{Reinforcement Learning from Physical Feedback (RLPF)}, a novel framework that integrates physics-aware motion evaluation with text-conditioned motion generation.
RLPF employs a motion tracking policy to assess the physical feasibility of generated motions in a simulator, producing rewards that are used to fine-tune the motion generator.
Furthermore, RLPF introduces an alignment verification module to preserve semantic alignment with the text instructions.
This joint optimization ensures both physical feasibility and instruction alignment.
Extensive experiments show that RLPF substantially outperforms baseline methods in generating physically feasible motions while maintaining semantic alignment with text instructions, enabling successful deployment on real humanoid robots.
More visualizations are available at \href{https://anonymous.4open.science/r/RLPF/}{https://anonymous.4open.science/r/RLPF/}.
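To make the described training loop concrete, below is a minimal, hypothetical sketch of an RLPF-style update, assuming a REINFORCE-style policy gradient over sampled motions. All component names and methods (\texttt{generator.sample}, \texttt{tracker.tracking\_reward}, \texttt{verifier.alignment\_score}) and the reward weights are illustrative assumptions, not the paper's actual API or algorithm.

```python
import torch  # quantities below are assumed to be PyTorch tensors

# Hypothetical components (names are illustrative):
#   generator: text-conditioned motion generator being fine-tuned
#   tracker:   pretrained motion tracking policy run in a physics simulator
#   verifier:  alignment module scoring text-motion semantic consistency

def rlpf_step(generator, tracker, verifier, texts, optimizer,
              w_phys=1.0, w_align=0.5):
    """One fine-tuning update: reward combines physical feasibility
    (from the tracking policy) and semantic alignment (from the verifier)."""
    # Sample motions for the batch of instructions, with log-likelihoods.
    motions, log_probs = generator.sample(texts)

    # Physical feasibility: how well the tracking policy reproduces each
    # motion in simulation (e.g., a tracking-error-based reward), shape (B,).
    r_phys = tracker.tracking_reward(motions)

    # Semantic alignment: similarity between instruction and motion, shape (B,).
    r_align = verifier.alignment_score(texts, motions)

    reward = w_phys * r_phys + w_align * r_align
    advantage = reward - reward.mean()  # batch-mean baseline for variance reduction

    # REINFORCE-style loss; detach the advantage so gradients flow only
    # through the generator's log-probabilities.
    loss = -(advantage.detach() * log_probs).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The detached, mean-centered advantage is one simple design choice for stabilizing the policy-gradient update; the joint reward reflects the abstract's two objectives, physical feasibility and instruction alignment.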
Primary Area: applications to robotics, autonomy, planning
Submission Number: 6932