Keywords: Human Motion Synthesis, Keyframe-Guided Generation, Trajectory Conditioning, Robustness, Diffusion Models
TL;DR: We present MoFA, a diffusion-based motion factorization framework that jointly leverages keyframes and trajectories through LMRS, TAMI, and QADT to generate realistic, consistent, and controllable human motions.
Abstract: Human motion synthesis has recently benefited from diffusion models, achieving unprecedented realism and diversity. Yet precise and controllable generation remains challenging: text, audio, and 2D cues are often ambiguous, while existing trajectory-keyframe approaches suffer from limited generalization, naive feature fusion, and poor robustness to unpaired control signals. We identify this bottleneck as the entanglement between keyframe and trajectory signals, which are inherently coupled in training but frequently mismatched at inference. To address this, we propose MoFA, a diffusion-based Motion Factorization framework that decomposes synthesis into two complementary sub-tasks: (i) Local Motion Completion, which focuses on keyframe dynamics, and (ii) Trajectory Adaptation, which ensures global spatial consistency. MoFA integrates the Local Motion Refinement Stack (LMRS) and Trajectory-Aware Motion Integration (TAMI) to jointly refine local poses and adapt them to trajectories. In addition, we introduce a Quality-Aware Dual Training (QADT) strategy that leverages imperfect or low-quality data as auxiliary supervision, substantially expanding the effective training set and improving generalization. Extensive experiments demonstrate that MoFA achieves more stable, controllable, and robust motion synthesis than state-of-the-art baselines.
Supplementary Material: zip
Primary Area: generative models
Submission Number: 11051