Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation


16 Nov 2021 (modified: 05 May 2023) | ACL ARR 2021 November Blind Submission | Readers: Everyone
Abstract: Gender bias is largely recognized as a problematic phenomenon affecting language technologies, with recent studies underscoring that it might surface differently across languages. However, most evaluation practices adopt a word-level focus on a narrow set of occupational nouns under synthetic conditions. Such protocols overlook key features of grammatical gender languages, which are characterized by morphosyntactic chains of gender agreement, marked on a variety of lexical items and parts of speech (POS). To overcome this limitation, we enrich the natural, gender-sensitive MuST-SHE corpus with two new annotation layers: POS and agreement chains. On this basis, we conduct multifaceted automatic and manual evaluations for three speech translation models, trained on varying amounts of data and with different word segmentation techniques. Our work sheds light on model behaviours, gender bias, and its detection at several levels of granularity for English-French/Italian/Spanish.
