ShuffleAugment: A Data Augmentation Method Using Time Shuffling

Yoshinao Sato, Narumitsu Ikeda, Hirokazu Takahashi

Published: 2023, Last Modified: 06 Mar 2025. Venue: ICASSP 2023. License: CC BY-SA 4.0

Abstract: We present ShuffleAugment, a data augmentation method for speech processing that randomly shuffles data in the time direction. Every speech processing task has a characteristic time scale that depends on the phenomenon it addresses. The proposed method randomizes the time order of an input sequence on the irrelevant time scales and thereby obtains many variants without sacrificing the essential information on the relevant time scale. The shuffling process can be implemented as a neural network layer and applied to low- and high-level features at an arbitrary depth. We evaluate the effectiveness of the proposed method by applying it to two tasks: speaker recognition and speech emotion recognition. Our experiments demonstrate that long-term and short-term shuffles improve the performance of speaker recognition and speech emotion recognition, respectively. These results indicate that ShuffleAugment is an effective data augmentation method.
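
The abstract does not specify how the shuffle is carried out, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the input is a sequence of frames (for example, feature vectors) and that a hypothetical parameter `segment_len` sets the time scale. `shuffle_segments` permutes whole segments while preserving the order inside each one, which randomizes order on scales longer than a segment. `shuffle_within_segments` permutes frames inside each segment while preserving the segment order, which randomizes order on scales shorter than a segment. Both function names, and any mapping of these two variants to the "long-term" and "short-term" shuffles of the abstract, are assumptions.

```python
import random


def shuffle_segments(frames, segment_len, rng):
    """Permute consecutive segments; frame order inside each segment is kept.

    Assumption: this destroys ordering on time scales longer than
    `segment_len` frames and keeps local structure intact.
    """
    segments = [frames[i:i + segment_len]
                for i in range(0, len(frames), segment_len)]
    rng.shuffle(segments)  # in-place permutation of the segment list
    return [frame for segment in segments for frame in segment]


def shuffle_within_segments(frames, segment_len, rng):
    """Permute frames inside each segment; segment order is kept.

    Assumption: this destroys ordering on time scales shorter than
    `segment_len` frames and keeps the coarse temporal layout intact.
    """
    shuffled = []
    for i in range(0, len(frames), segment_len):
        segment = list(frames[i:i + segment_len])
        rng.shuffle(segment)  # in-place permutation of one segment
        shuffled.extend(segment)
    return shuffled


if __name__ == "__main__":
    rng = random.Random(0)        # seeded only to make the demo repeatable
    frames = list(range(12))      # stand-in for 12 feature frames
    print(shuffle_segments(frames, 4, rng))
    print(shuffle_within_segments(frames, 4, rng))
```

Both functions return a permutation of the input, so no frame is dropped or duplicated; only the temporal order changes. In a training pipeline a fresh permutation would typically be drawn for every example, and the same index permutation could in principle be applied to intermediate features inside a network rather than to the raw input.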