Contrastive disentanglement for self-supervised motion style transfer

Published: 01 Jan 2024, Last Modified: 11 Nov 2024, Multimedia Tools and Applications 2024, CC BY-SA 4.0
Abstract: Motion style transfer, which aims to transfer the style of a source motion to a target motion while preserving the target's content, has recently attracted considerable attention. Existing works have shown promising results but typically require labeled data for supervised training, which limits their applicability. In this paper, we present a novel self-supervised learning method for motion style transfer. Specifically, we cast the problem into a contrastive learning framework that disentangles the human motion representation into a content code and a style code; the result is then generated by compositing the style code of the source motion with the content code of the target motion. To encourage better code disentanglement and composition, we investigate an InfoNCE loss and a triplet loss in a self-supervised manner. This framework aims at generating plausible motions while guaranteeing the disentanglement of the latent codes. Comprehensive experiments on benchmark datasets demonstrate that our method outperforms state-of-the-art approaches.
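To make the content/style disentanglement and the contrastive objectives concrete, the following is a minimal sketch, not the authors' implementation: the encoder/decoder architectures, dimensions, and loss wiring are all assumptions chosen only to illustrate how a content code and a style code can be composited and trained with InfoNCE and triplet losses.

```python
# Hypothetical sketch of content/style disentanglement with InfoNCE and triplet losses.
# All module names, dimensions, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContentStyleModel(nn.Module):
    def __init__(self, motion_dim=63, code_dim=128):
        super().__init__()
        # Separate encoders produce a content code and a style code.
        self.content_enc = nn.Sequential(
            nn.Linear(motion_dim, 256), nn.ReLU(), nn.Linear(256, code_dim))
        self.style_enc = nn.Sequential(
            nn.Linear(motion_dim, 256), nn.ReLU(), nn.Linear(256, code_dim))
        # The decoder composites a content code with a style code.
        self.decoder = nn.Sequential(
            nn.Linear(2 * code_dim, 256), nn.ReLU(), nn.Linear(256, motion_dim))

    def forward(self, target_motion, source_motion):
        c = self.content_enc(target_motion)   # content code of the target motion
        s = self.style_enc(source_motion)     # style code of the source motion
        return self.decoder(torch.cat([c, s], dim=-1))


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE: pull the query toward its positive, push it from negatives.

    query, positive: (B, D); negatives: (B, K, D).
    """
    q = F.normalize(query, dim=-1)
    pos = F.normalize(positive, dim=-1)
    neg = F.normalize(negatives, dim=-1)
    l_pos = (q * pos).sum(dim=-1, keepdim=True)        # (B, 1)
    l_neg = torch.einsum('bd,bkd->bk', q, neg)          # (B, K)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)


# Triplet loss on style codes: anchor and positive share a style, negative differs.
triplet_loss = nn.TripletMarginLoss(margin=1.0)
```

In this sketch, the InfoNCE term would contrast codes from self-supervised positive/negative pairs (e.g., codes from the same clip versus other clips), while the triplet term keeps style codes of like-styled motions close; how such pairs are actually constructed is specific to the paper and not reproduced here.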