DynamicPoseNet: Advanced Human Motion Generation with Dual-Pathway CNNs and LoRA-Enhanced LLaMA

Published: 19 Mar 2024 · Last Modified: 19 Mar 2024 · Tiny Papers @ ICLR 2024 · CC BY 4.0
Keywords: Motion Generation, Human 3D, Contrastive Learning
TL;DR: DynamicPoseNet efficiently synthesizes realistic human motions, controlled by multiple inputs including textual descriptions and keyframe poses.
Abstract: This study introduces DynamicPoseNet, a novel convolutional neural network architecture that leverages depthwise separable convolutions and dual-pathway feature extraction for advanced human motion generation. Trained on large-scale datasets such as HumanML3D and KIT-ML, DynamicPoseNet efficiently synthesizes realistic human motions, controlled by multiple inputs including textual descriptions and keyframe poses. The model is fine-tuned from a pre-trained 13B LLaMA using LoRA and a contrastive learning objective, and it outperforms state-of-the-art methods in both the quality and the diversity of generated motions while requiring significantly less training time and compute. Our results indicate a promising direction for future research in diverse and realistic motion generation using advanced deep learning techniques.
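The page does not include implementation details, so the following is a minimal PyTorch sketch of what a depthwise-separable, dual-pathway convolutional block over pose sequences might look like. The kernel sizes, the concatenation-based fusion, the residual connection, and the 263-dimensional HumanML3D-style pose representation are illustrative assumptions, not details taken from the submission.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv1d(nn.Module):
    """Depthwise convolution over time followed by a pointwise (1x1) projection."""
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv1d(
            in_channels, in_channels, kernel_size,
            padding=kernel_size // 2, groups=in_channels)
        self.pointwise = nn.Conv1d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class DualPathwayBlock(nn.Module):
    """Two parallel depthwise-separable pathways with different temporal
    receptive fields, fused by concatenation and a 1x1 projection.
    (Kernel sizes and fusion scheme are assumptions for illustration.)"""
    def __init__(self, channels):
        super().__init__()
        self.local_path = DepthwiseSeparableConv1d(channels, channels, kernel_size=3)
        self.global_path = DepthwiseSeparableConv1d(channels, channels, kernel_size=9)
        self.fuse = nn.Conv1d(2 * channels, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x):  # x: (batch, pose_dim, frames)
        local_feat = self.act(self.local_path(x))
        global_feat = self.act(self.global_path(x))
        # Residual fusion of the two pathways
        return x + self.fuse(torch.cat([local_feat, global_feat], dim=1))


# Example: a batch of motion sequences, 196 frames with 263-dim pose features
motion = torch.randn(2, 263, 196)
block = DualPathwayBlock(channels=263)
print(block(motion).shape)  # torch.Size([2, 263, 196])
```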
Submission Number: 212