Abstract: Machine learning, increasingly present in everyday life, is subject to bias. These biases not only reflect social inequalities but also reinforce them. The present study seeks to mitigate gender bias in Google Translate, the most widely used translation system in the world. To this end, we created a translation model with high gender accuracy and performed a linguistic analysis with the spaCy tool and entity identification with RoBERTa. Constrained Beam Search is then used to preserve the sentence structure produced by the commercial model while substituting the correct gender indicated by the created model. The final sentence is obtained from an alignment computed with the SimAlign tool. In addition, the study provides an algorithm so that English sentences without gender indication are translated with feminine, masculine, and neutral inflections. Our approach yields a BLEU score of 48.39. Compared with Google Translate, the model increased gender accuracy from 68.75 to 70.09, improved by 15.7% the score that measures the difference in accuracy between male and female entities, and improved stereotyped translations by 43%.
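
The abstract reports a gender accuracy and a score based on the difference in accuracy between male and female entities, but does not spell out how either is computed. The sketch below shows one plausible reading, assuming each evaluated entity has a gold gender label and a gender label read off the translation. The function names, the record format, and the toy data are illustrative assumptions, not the study's evaluation code or results.

```python
from collections import defaultdict


def gender_accuracy(records):
    """Return (overall accuracy, per-gender accuracy).

    `records` is an iterable of (gold_gender, predicted_gender) pairs.
    This record format is an assumption made for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold, predicted in records:
        total[gold] += 1
        correct[gold] += int(gold == predicted)
    if not total:
        raise ValueError("no records to evaluate")
    overall = sum(correct.values()) / sum(total.values())
    per_gender = {gender: correct[gender] / total[gender] for gender in total}
    return overall, per_gender


def gender_gap(per_gender, a="male", b="female"):
    """Absolute difference in accuracy between two gender groups."""
    return abs(per_gender[a] - per_gender[b])


# Made-up toy data, not taken from the study.
toy_records = [
    ("male", "male"), ("male", "male"), ("male", "female"), ("male", "male"),
    ("female", "female"), ("female", "male"), ("female", "male"), ("female", "female"),
]

overall, per_gender = gender_accuracy(toy_records)
gap = gender_gap(per_gender)
print(overall, per_gender, gap)
```

Under this reading, a smaller gap means male and female entities are translated with more similar accuracy, which is the direction of improvement the abstract describes.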