Keywords: Motion Prior; Masked Motion Autoencoder; Motion Editing
Abstract: We present Latent Motion Prior (LaMP), a novel framework for learning a generalizable human motion prior that enables efficient optimization for a wide range of motion-related tasks, including text-to-motion generation, motion editing, motion blending, motion refinement, and environment-aware collision avoidance. LaMP employs a body part-based encoder to learn a disentangled latent representation of human motion, together with a masked training strategy that encourages the model to capture the most informative structural and dynamic aspects of motion. As a result, LaMP produces a robust and expressive latent space that serves as a motion prior across diverse downstream tasks. We evaluate the learned representation on a wide range of optimization-based downstream tasks. Experimental results show that current families of text-to-motion models are generally ill-suited to serve as motion priors, while LaMP consistently outperforms state-of-the-art methods across all optimization tasks. The code is available at: https://github.com/lvsean/LaMP.
Supplementary Material: zip
Submission Number: 344