Data augmentation and debiasing for signers in signer-independent sign language translation

Published: 01 Jan 2025, Last Modified: 31 Jul 2025. Venue: J. Supercomput. 2025. License: CC BY-SA 4.0
Abstract: Sign language translation (SLT) aims to convert sign language videos into corresponding text, bridging the communication gap between the deaf and hearing communities. A practical SLT system must generalize well to unseen signers, that is, it must maintain good performance in a signer-independent setting, where the model encounters signers who were not present in the training data. However, existing sign language datasets often lack signer diversity and exhibit biases toward particular signers, which leads to suboptimal generalization in current SLT models. Moreover, current signer-independent SLT methods rely heavily on signer identity labels in the dataset. To address these issues, we propose a method that does not require signer identity labels: DADS (data augmentation and debiasing for signers) for signer-independent sign language translation. First, we propose signer-oriented data augmentation, which consists of two components: data augmentation based on adversarial training (DAAT) and data augmentation based on a diffusion model (DADM). DAAT uses model gradients to create adversarial examples, while DADM employs diverse prompts to produce high-quality, signer-diverse sign language data. By combining these two data augmentation methods, we alleviate the scarcity of sign language data and significantly enhance the diversity of signers in the dataset. Second, we introduce signer-oriented debiasing (SD). Specifically, we use a signer-biased classifier to capture the biases present in the dataset and a signer-debiased classifier to learn debiased features. This process helps to comprehensively eliminate the model's bias toward signers, thereby improving its generalization and robustness in signer-independent scenarios. Our experimental results demonstrate that DADS enhances model performance in the signer-independent setting and also performs well under the more challenging common-corruption setting.
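
As an illustration of what "using model gradients to create adversarial examples" can look like, the following is a minimal sketch of a generic sign-gradient (FGSM-style) perturbation. It is not the paper's DAAT procedure, which the abstract does not specify. The logistic model, the feature vector `x`, the step size `epsilon`, and the function names `loss_and_input_grad` and `fgsm_augment` are all assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_input_grad(x, y, w, b):
    """Binary cross-entropy of a toy logistic model and its gradient w.r.t. the input x.

    Stand-in for a real SLT model (assumption): any differentiable model
    that exposes d(loss)/d(input) could be used in its place.
    """
    p = sigmoid(x @ w + b)
    tiny = 1e-12  # avoids log(0)
    loss = -(y * np.log(p + tiny) + (1.0 - y) * np.log(1.0 - p + tiny))
    grad_x = (p - y) * w  # analytic input gradient for logistic regression
    return loss, grad_x

def fgsm_augment(x, y, w, b, epsilon=0.1):
    """Return an adversarial copy of x: one step of size epsilon along sign(gradient)."""
    _, grad_x = loss_and_input_grad(x, y, w, b)
    return x + epsilon * np.sign(grad_x)

# Toy data (hypothetical): an 8-dimensional feature vector with label 1.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.0
x = rng.normal(size=8)
y = 1.0

x_adv = fgsm_augment(x, y, w, b, epsilon=0.1)
```

The perturbed sample `x_adv` stays within `epsilon` of `x` in every coordinate while increasing the model's loss, so adding it to the training set exposes the model to inputs it currently handles poorly. In an actual SLT pipeline the gradient would presumably come from automatic differentiation through the translation model rather than from a closed-form expression.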