Visible to everyone since 13 Oct 2023
In this study, we introduce a learning-based method for generating high-quality human motion sequences from text descriptions (e.g., "A person walks forward"). Existing techniques struggle to produce diverse motions and smooth transitions, both because text-to-motion datasets are limited and because they rely on full-body skeletal pose representations. To address this, we develop an encoder network that converts motion sequences into periodic signals, capturing the local periodicity of motions in time and space. We also propose a conditional diffusion model that predicts periodic motion parameters from a text description and the starting pose. Our approach outperforms current methods, generating a broader variety of high-quality motions with natural transitions, especially in longer sequences.
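To make the idea of "periodic motion parameters" concrete, the sketch below describes a single motion channel (for example, one joint coordinate over time) by four numbers: amplitude, frequency, offset, and phase. This is an illustrative assumption, not the encoder described above: the abstract's encoder is learned, whereas this stand-in fits the dominant sinusoid with a plain discrete Fourier transform. The function names `periodic_params` and `reconstruct`, and the four-parameter form itself, are hypothetical.

```python
import cmath
import math


def periodic_params(signal, fps):
    """Fit the dominant sinusoid of a 1-D signal sampled at `fps` frames per second.

    Returns (amplitude, frequency_hz, offset, phase) such that
    offset + amplitude * cos(2*pi*frequency_hz*t + phase) approximates the signal.
    Hypothetical helper: a DFT-based stand-in for a learned encoder.
    """
    n = len(signal)
    offset = sum(signal) / n
    centered = [x - offset for x in signal]

    # Naive DFT over the positive-frequency bins; the DC bin is skipped
    # because the mean has already been removed.
    best_k, best_coeff = 1, 0j
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(coeff) > abs(best_coeff):
            best_k, best_coeff = k, coeff

    amplitude = 2.0 * abs(best_coeff) / n   # exact for non-Nyquist bins
    frequency_hz = best_k * fps / n
    phase = cmath.phase(best_coeff)
    return amplitude, frequency_hz, offset, phase


def reconstruct(amplitude, frequency_hz, offset, phase, n, fps):
    """Rebuild n samples of the signal from its four periodic parameters."""
    return [
        offset + amplitude * math.cos(2 * math.pi * frequency_hz * (i / fps) + phase)
        for i in range(n)
    ]


# Usage: a synthetic 2 Hz oscillation sampled at 30 fps for 2 seconds.
fps, n = 30, 60
samples = [1.0 + 0.5 * math.cos(2 * math.pi * 2.0 * (i / fps) + 0.3) for i in range(n)]
params = periodic_params(samples, fps)
rebuilt = reconstruct(*params, n=n, fps=fps)
```

Under this assumed parameterization, a generative model would only need to predict a handful of such parameters per channel and time window rather than every pose in the sequence, which is one plausible reading of why a periodic representation helps with long, smoothly connected motions.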