Abstract: Wikipedia is an invaluable resource, but it has been criticized for being biased in many ways. In particular, studies of Wikipedia's biographies have found that women are under-represented and portrayed in biased ways. The content of Wikipedia is produced by a socio-technical system whose rules and protocols can be seen as a complex algorithm. Our overarching research problem is to evaluate whether the biases in Wikipedia are a faithful depiction of our biased society, or whether this socio-technical system creates its own biases, in a phenomenon akin to algorithmic bias. In this paper, we revisit two concepts of gender bias in Wikipedia, namely the under-representation and the biased representation of women, which correspond to distinct concepts of algorithmic bias. To quantify systematic differences in the representation of women, we use classifiers applied to neural representations of the text. We show that a large part of the measurable difference comes from men and women being notable for reasons that conform to traditional gender roles, rather than from biased representations introduced in Wikipedia.
External IDs: dblp:conf/ai/DamadiD24