Paper Link: https://openreview.net/forum?id=cSYOakZR4m-
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: Hate speech detection models aim to provide a safe environment for marginalised social groups to express themselves. However, bias in these models could lead to silencing those very groups. In this paper, we introduce the systematic offensive stereotyping (SOS) bias. We propose a method to measure the SOS bias in different word embeddings and also investigate its influence on the downstream task of hate speech detection. Our results show that SOS bias against various groups exists in widely used word embeddings and that our SOS bias metric correlates positively with the statistics of published surveys on online abuse and extremism. However, we found that it is not easy to prove that bias in word embeddings influences downstream task performance. Finally, we show that, when the inspected word embeddings are used for sexism and racism detection, SOS bias is more indicative of sexism and racism than social biases are.
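The abstract does not define the SOS metric itself, but one common way to quantify an association bias in an embedding space is the mean cosine similarity between identity terms for a group and a set of offensive terms. The sketch below is only an illustration of that general idea, not the paper's actual metric; the toy vectors and word lists (`women`, `muslims`, `slur_a`, `slur_b`) are hypothetical placeholders.

```python
import math

# Toy word vectors standing in for pretrained embeddings.
# These values are hypothetical; a real study would load GloVe,
# word2vec, or similar, and use curated term lists.
emb = {
    "women":   [0.9, 0.1, 0.3],
    "muslims": [0.8, 0.2, 0.4],
    "slur_a":  [0.85, 0.15, 0.35],
    "slur_b":  [0.1, 0.9, 0.2],
}

def cos(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def association_score(group_words, offensive_words, emb):
    """Mean cosine similarity between a group's identity terms and a
    set of offensive terms -- a simple proxy for how strongly an
    embedding space associates the group with offensive language."""
    sims = [cos(emb[g], emb[o])
            for g in group_words
            for o in offensive_words]
    return sum(sims) / len(sims)

score = association_score(["women", "muslims"], ["slur_a", "slur_b"], emb)
print(round(score, 3))
```

Comparing such scores across groups (or correlating them with external survey statistics, as the paper does with its own metric) is what turns a raw similarity into a bias measurement.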