Keywords: Motion Generation, Motion Tracking & Transfer
TL;DR: A method to animate humanoid meshes from a text prompt by transferring motion generated by video diffusion models to the mesh.
Abstract: Animation of humanoid characters is essential in various graphics applications, but creating realistic animations requires significant time and cost. We propose an approach to synthesize 4D animated sequences from input static 3D humanoid meshes, leveraging the strong, generalized motion priors of generative video models, which capture a wide variety of human motions. Given an input static 3D humanoid mesh and a text prompt describing the desired animation, we synthesize a corresponding video conditioned on a rendered image of the 3D mesh. We then employ an underlying SMPL representation to animate the 3D mesh according to the video-generated motion, based on our motion optimization. This provides a cost-effective and accessible solution for synthesizing diverse and realistic 4D animations.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 25066