Towards Community-Driven NLP: Measuring Geographic Performance Disparities of Offensive Language Classifiers

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: Text classifiers are applied at scale in the form of one-size-fits-all solutions. Nevertheless, many studies show that classifiers are biased with respect to different languages and dialects. Both language style and content change depending on the location where text is posted. For example, states that border Mexico may be more likely to discuss issues regarding immigration from Latin America. However, several questions remain, such as ``Do changes in the style and content of text across geographic regions impact model performance?''. To address this question, we introduce a novel dataset called GeoOLID with more than 13 thousand examples across 15 geographically and demographically diverse cities. Furthermore, we perform a comprehensive analysis of geographical content and stylistic differences, and of their interaction in causing performance disparities of offensive language detection models. Overall, we find that current models do not generalize across locations. Likewise, we show that understanding broad dialects (e.g., African American English) is not the only predictive factor of model performance when applied to cities with large minority populations. Hence, community-specific evaluation is vital for real-world applications. Warning: This paper contains offensive language.