Evaluating Gender Bias in the Translation of Gender-Neutral Educational Professions from English to Gendered Languages
This study evaluates the translation of gender-neutral English words into the gendered languages German, French, and Italian, using five machine translation (MT) models: GPT-3.5 Turbo, LLaMA 2, AWS, SYS, and Google. Focusing on educational professions, we categorized each model's output into one of four gender classifications: unknown (UNK), female (f), male (m), and neutral (n). Error rates were determined through human validation, involving manual review of randomly sampled records. Our findings reveal significant gender bias across all tested MT systems, with a notable overrepresentation of male gender classifications.
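The per-model tallying of gender classifications described above can be sketched as follows. This is an illustrative reconstruction, not the authors' actual pipeline: the records, labels, and the `gender_distribution` helper are hypothetical, and the data shown is placeholder input rather than results from the study.

```python
from collections import Counter

# Hypothetical (model, gender label) pairs using the paper's four classes:
# "UNK", "f", "m", "n". The records below are illustrative placeholders only.
labels = [
    ("GPT-3.5 Turbo", "m"), ("GPT-3.5 Turbo", "m"), ("GPT-3.5 Turbo", "f"),
    ("Google", "m"), ("Google", "n"), ("Google", "m"),
]

def gender_distribution(records):
    """Return, per model, the proportion of outputs in each gender class."""
    by_model = {}
    for model, label in records:
        by_model.setdefault(model, Counter())[label] += 1
    return {
        model: {g: count / sum(counts.values()) for g, count in counts.items()}
        for model, counts in by_model.items()
    }

dist = gender_distribution(labels)
# A male share well above the female/neutral shares across models would
# indicate the kind of overrepresentation the study reports.
print(dist)
```

Comparing each model's male share against its female and neutral shares gives a simple quantitative view of the bias the abstract describes; human validation of sampled records then estimates how often the automatic labels themselves are wrong.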