DynamicPoseNet: Advanced Human Motion Generation with Dual-Pathway CNNs and LoRA-Enhanced LLaMA

Published: 19 Mar 2024 · Last Modified: 19 Mar 2024 · Tiny Papers @ ICLR 2024 · CC BY 4.0
Keywords: Motion Generation, Human 3D, Contrastive Learning
TL;DR: DynamicPoseNet efficiently synthesizes realistic human motions, controlled by multiple inputs including textual descriptions and keyframe poses.
Abstract: This study introduces DynamicPoseNet, a novel convolutional neural network architecture that leverages depthwise separable convolutions and dual-pathway feature extraction for advanced human motion generation. Trained on large-scale datasets such as HumanML3D and KIT-ML, DynamicPoseNet efficiently synthesizes realistic human motions, controlled by multiple inputs including textual descriptions and keyframe poses. The model is fine-tuned from a pre-trained 13B LLaMA using LoRA and a contrastive learning objective, and it outperforms state-of-the-art methods in both the quality and the diversity of generated motions while requiring significantly less training time and compute. Our results indicate a promising direction for future research in diverse and realistic motion generation using advanced deep learning techniques.
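The page does not include implementation details, so the following is a minimal PyTorch sketch of what a depthwise-separable, dual-pathway convolutional block over pose sequences might look like. The kernel sizes, the concatenation-based fusion, the residual connection, and the 263-dimensional HumanML3D-style pose representation are illustrative assumptions, not details taken from the submission.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv1d(nn.Module):
    """Depthwise convolution over time followed by a pointwise (1x1) projection."""
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv1d(
            in_channels, in_channels, kernel_size,
            padding=kernel_size // 2, groups=in_channels)
        self.pointwise = nn.Conv1d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class DualPathwayBlock(nn.Module):
    """Two parallel depthwise-separable pathways with different temporal
    receptive fields, fused by concatenation and a 1x1 projection.
    (Kernel sizes and fusion scheme are assumptions for illustration.)"""
    def __init__(self, channels):
        super().__init__()
        self.local_path = DepthwiseSeparableConv1d(channels, channels, kernel_size=3)
        self.global_path = DepthwiseSeparableConv1d(channels, channels, kernel_size=9)
        self.fuse = nn.Conv1d(2 * channels, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x):  # x: (batch, pose_dim, frames)
        local_feat = self.act(self.local_path(x))
        global_feat = self.act(self.global_path(x))
        # Residual fusion of the two pathways
        return x + self.fuse(torch.cat([local_feat, global_feat], dim=1))


# Example: a batch of motion sequences, 196 frames with 263-dim pose features
motion = torch.randn(2, 263, 196)
block = DualPathwayBlock(channels=263)
print(block(motion).shape)  # torch.Size([2, 263, 196])
```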
Submission Number: 212