Mitigating Biases to Embrace Diversity: A Comprehensive Annotation Benchmark for Toxic Language

ACL ARR 2024 June Submission733 Authors

12 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: This study proposes a prescriptive annotation benchmark grounded in humanities research to enable consistent and reliable labeling of offensive language data while mitigating biases against language minorities. We contribute two newly annotated datasets based on the proposed benchmark, which yield higher inter-annotator agreement between human and large language model (LLM) annotations than the original annotations produced under descriptive instructions. Experiments show that LLMs can serve as an alternative when professional annotators are unavailable. Moreover, smaller models fine-tuned on a multi-source LLM-annotated dataset outperform models trained on a single, larger human-annotated dataset. The findings demonstrate the effectiveness of structured guidelines in controlling subjective variability while maintaining performance with limited data size and heterogeneous language types, thus embracing language diversity. Content Warning: This article analyzes offensive language solely for academic purposes. Discretion is advised.
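As a brief illustration of the agreement measure referenced in the abstract, the sketch below computes Cohen's kappa between hypothetical human and LLM toxicity labels. This is not the paper's code; the label values and the scikit-learn dependency are assumptions for demonstration only.

```python
# Illustrative sketch (not from the paper): agreement between human and
# LLM offensive-language labels, measured with Cohen's kappa.
from sklearn.metrics import cohen_kappa_score

# Hypothetical binary labels: 1 = offensive, 0 = not offensive
human_labels = [1, 0, 1, 1, 0, 0, 1, 0]
llm_labels   = [1, 0, 1, 0, 0, 0, 1, 1]

kappa = cohen_kappa_score(human_labels, llm_labels)
print(f"Cohen's kappa (human vs. LLM): {kappa:.2f}")
```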
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: Ethics, Bias, and Fairness; Resources and Evaluation; Semantics: Lexical and Sentence-Level
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings (efficiency), Data resources, Data analysis, Surveys
Languages Studied: English
Submission Number: 733