Music-to-Dance Poses: Learning to Retrieve Dance Poses from Music

Published: 01 Jan 2024, Last Modified: 27 Feb 2025, Venue: ICASSP 2024, License: CC BY-SA 4.0
Abstract: Choreography is an artful blend of technique and creativity, requiring the meticulous design of movement sequences in harmony with music. To support choreographers in this intricate task, this work proposes a "music-to-dance pose retrieval" system that takes a music snippet as a query, predicts 3D human poses and shapes from it, and retrieves dance poses by matching within the 3D pose and shape space. Central to our method is the EDSA adapter, a Self-Attention adapter that uses an Encoder-Decoder transformation, allowing a large-scale pre-trained music model to be fine-tuned effectively and efficiently to learn the projection from music snippets to 3D human poses and shapes. Experimental results demonstrate that our EDSA adapter outperforms existing techniques for fine-tuning a large-scale pre-trained model on the cross-modal music-to-dance pose retrieval task.
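The matching step described in the abstract can be illustrated with a minimal sketch. It assumes the music model has already produced a predicted pose-and-shape vector for a snippet, and that each candidate dance pose in the database is stored as a vector of the same length. The function names (`l2_distance`, `retrieve_poses`), the flat-vector representation, and the choice of Euclidean distance are illustrative assumptions; the abstract does not specify the metric or the parameterization used.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two pose-and-shape vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_poses(
    query: Sequence[float],
    database: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the top_k database poses nearest to query.

    `query` stands in for the pose-and-shape vector predicted from a music
    snippet; `database` stands in for the candidate dance poses.
    """
    scored = [(i, l2_distance(query, pose)) for i, pose in enumerate(database)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-D vectors purely for illustration; real pose and shape
    # parameters would have many more dimensions.
    toy_database = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
    toy_query = [0.9, 1.0]
    for index, distance in retrieve_poses(toy_query, toy_database, top_k=2):
        print(index, round(distance, 3))
```

An exhaustive scan like this is adequate for small pose collections; a larger database would likely call for an approximate nearest-neighbor index instead.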