Studying and Mitigating Biases in Sign Language Understanding Models

ACL ARR 2024 June Submission4037 Authors

16 Jun 2024 (modified: 06 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Crowdsourced sign datasets collected with the involvement of deaf communities, such as the ASL Citizen dataset, represent an important step towards improved accessibility and documentation of signed languages. However, it is important to ensure that these resources benefit people in an equitable manner. Thus, there is a need to understand the potential biases that may result from models trained on sign language datasets. In this work, we utilize the rich information about participant demographics and lexical features present in the ASL Citizen dataset to study and document the biases that may result from models trained on crowdsourced sign datasets. Further, we apply several bias mitigation techniques during model training, and discuss the results and relative success of these techniques. In addition to our analyses and machine learning experiments, with the publication of this work we release the demographic information about the participants in the ASL Citizen dataset to encourage future work in this space.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation, model bias/unfairness mitigation, multimodality, language resources, evaluation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data analysis
Languages Studied: American Sign Language
Submission Number: 4037