Tokenizing Nonverbal Communication in Salsa Dance

ICML 2025 Workshop TokShop Submission 43 Authors

Published: 10 Jun 2025, Last Modified: 11 Jun 2025, TokShop, CC BY 4.0
Archiving Submission: No (non-archival)
Keywords: motion capture, animation, salsa dance, dataset, motion generation, smpl-x, llm
Abstract: Partner dance offers a compelling testbed for studying tokenization in multimodal, bidirectional communication. In salsa, a lifted hand may signal a turn; musical accents may shape both dancers' motion. These interactions are continuous and improvisational, and hinge on discrete, interpretable cues—gestures, beats, and movement segments—that can be modeled as tokens. In this paper, we introduce a language model and tokenization framework for social dance, treating salsa as a form of embodied dialogue grounded in motion, music, and role-based interaction. To support this, we present CoMPAS3D, a large-scale motion capture dataset of improvised salsa dancing, capturing over 3 hours of leader-follower interaction across three skill levels. The dataset includes frame-level annotations of moves, styling, and execution errors, created through over 120 hours of expert effort. We use tokens as a foundation for generative and classification tasks, including follower motion prediction and move recognition, demonstrating the utility of token-based models for interactive, expressive virtual agents.
Submission Number: 43