AdaptControl: Adaptive Human Motion Control and Generation via User Prompt and Spatial Trajectory Guidance

Published: 01 Jan 2024 · Last Modified: 14 May 2025 · HCMA@MM 2024 · CC BY-SA 4.0
Abstract: In the field of generative multimedia and interactive experiences, human motion generation guided by natural language and spatial signals has emerged as a critical research direction. However, generating human motions that precisely follow spatial trajectories while remaining natural is still a significant challenge. Existing methods often struggle to balance spatial control precision against motion realism. Moreover, the spatial guidance signals and the relative rotations of human limbs reside in different representation spaces, complicating precise control of human motions along specified paths. To address these issues, we propose AdaptControl, a motion generation model capable of precise spatial trajectory control while ensuring that the generated motion adheres to both text descriptions and spatial guidance. AdaptControl employs a Natural Guidance module to optimize the generated motion noise so that it closely follows the spatial constraints while maintaining realistic dynamics. Simultaneously, the Control Dominance Adjustor adaptively fuses features from the text and spatial signals, allowing the model to weigh information from both sources. Finally, the Fusion Diffusion module integrates the guided noise from the Natural Guidance module with the fused features from the Control Dominance Adjustor to generate the final output motion. Extensive experiments demonstrate that AdaptControl outperforms state-of-the-art methods in trajectory adherence, motion realism, and semantic consistency, opening up new possibilities for intuitive, interactive human motion generation and facilitating the development of user-centered design and interactive applications.
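The abstract does not specify how the Control Dominance Adjustor fuses the two feature streams. One common way to realize such adaptive fusion is a learned sigmoid gate that produces a per-dimension convex combination of the text and spatial features; the sketch below illustrates that idea only. All names (`adaptive_fusion`, `w_gate`, `b_gate`) and the gating mechanism itself are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def adaptive_fusion(text_feat, spatial_feat, w_gate, b_gate):
    """Hypothetical sketch of adaptive text/spatial feature fusion.

    A gate conditioned on both streams (linear layer + sigmoid) decides,
    per dimension, how much the spatial-control features dominate over
    the text features. This is NOT the paper's mechanism, only a common
    way such a "dominance adjustor" could be implemented.
    """
    gate_input = np.concatenate([text_feat, spatial_feat], axis=-1)
    gate = 1.0 / (1.0 + np.exp(-(gate_input @ w_gate + b_gate)))  # sigmoid
    # Convex combination: gate -> 1 favors spatial control, gate -> 0 favors text.
    return gate * spatial_feat + (1.0 - gate) * text_feat

rng = np.random.default_rng(0)
d = 8  # assumed feature dimension, for illustration
text_feat = rng.standard_normal(d)
spatial_feat = rng.standard_normal(d)
w_gate = rng.standard_normal((2 * d, d)) * 0.1
b_gate = np.zeros(d)

fused = adaptive_fusion(text_feat, spatial_feat, w_gate, b_gate)
# Element-wise, the fused features lie between the two input streams.
lo = np.minimum(text_feat, spatial_feat)
hi = np.maximum(text_feat, spatial_feat)
assert np.all(fused >= lo - 1e-9) and np.all(fused <= hi + 1e-9)
```

Because the gate output is in (0, 1), the fused vector is always an element-wise interpolation of the two inputs, so neither modality can be entirely discarded; a real implementation would learn the gate parameters jointly with the diffusion model.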