SOS: Systematic Offensive Stereotyping Bias in Word Embeddings

Anonymous

17 Sept 2021 (modified: 05 May 2023) · ACL ARR 2021 September Blind Submission
Abstract: Hate speech detection models aim to provide a safe online environment in which marginalised social groups can express themselves. Bias in these models, however, risks silencing the very groups they are meant to protect. In this paper, we introduce the systematic offensive stereotyping (SOS) bias metric. We propose a method to measure SOS bias in different word embeddings and investigate its influence on the downstream task of hate speech detection. Our results show that SOS bias against various groups exists in widely used word embeddings and that, in most cases, our SOS bias metric correlates positively with the statistics of published surveys on online abuse and hate. However, we find it difficult to establish that bias in word embeddings influences downstream task performance. Finally, we show that when the inspected word embeddings are used for sexism and racism detection, our SOS bias metric is a better indicator of their encoded sexism and racism than stereotypical social bias metrics.
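The abstract does not spell out how the SOS bias metric is computed. As a rough illustration of the general idea, measuring how strongly an embedding associates group identity terms with offensive language, the Python sketch below uses mean cosine similarity over toy vectors; the sos_score function, the term lists, and the random vectors are all hypothetical and should not be read as the paper's actual metric.

    import numpy as np

    def cosine(u: np.ndarray, v: np.ndarray) -> float:
        # Cosine similarity between two vectors.
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    def sos_score(embeddings: dict, identity_terms: list, offensive_terms: list) -> float:
        # Hypothetical association score: mean cosine similarity between a
        # group's identity terms and a lexicon of offensive terms
        # (higher = stronger offensive association in the embedding space).
        sims = [cosine(embeddings[i], embeddings[o])
                for i in identity_terms if i in embeddings
                for o in offensive_terms if o in embeddings]
        return float(np.mean(sims)) if sims else float("nan")

    # Toy 3-d embeddings for illustration only; a real study would load
    # pretrained vectors (e.g. word2vec, GloVe, fastText).
    rng = np.random.default_rng(0)
    emb = {w: rng.normal(size=3) for w in ["woman", "man", "slur_a", "slur_b"]}

    print(sos_score(emb, ["woman"], ["slur_a", "slur_b"]))
    print(sos_score(emb, ["man"], ["slur_a", "slur_b"]))

In practice, scores computed this way would be compared across marginalised and non-marginalised groups, and, as the paper does, validated against external statistics such as surveys on online abuse and hate.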