Unraveling the Complexities of Offensive Language: A Detailed Analytical Framework for Understanding Offensive Communication Dynamics

Anonymous

16 Dec 2023 | ACL ARR 2023 December Blind Submission
TL;DR: This paper presents a comprehensive analysis of the criteria used to identify and categorize offensive language, examining the dynamics of toxicity in communication and in dataset construction.
Abstract: Offensive online content can marginalize and harm groups and individuals. Preventing this harm while protecting speech rights requires fair and accurate detection. However, current models and datasets struggle to distinguish offensive language from acceptable, non-toxic linguistic variation tied to culture or subjective interpretation. This study presents a comprehensive toxicity assessment built on two annotated datasets that target the nuances of human interpretation under an objective evaluation protocol. The significant improvement in inter-annotator agreement suggests that, without structured guidelines, uncontrolled subjectivity and research bias can arise. We further explore in-context learning with few-shot examples to improve toxicity detection by large language models (LLMs), specifically GPT models, and find that explicit assessment criteria significantly improve agreement between automated and human evaluations of offensive content. The feasibility of criteria-based auto-annotation is evidenced by the stronger performance of smaller models fine-tuned on ten times less auto-annotated data with multi-language variations. These findings demonstrate the efficiency of combining the contextual understanding of LLMs with criterion-guided learning. Content Warning: This article analyzes offensive language solely for academic purposes. Reader discretion is advised.
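
Below is a minimal illustrative sketch (not the authors' released code) of the criteria-guided few-shot setup described in the abstract: an explicit set of assessment criteria and a handful of labeled examples are supplied in-context to a GPT-style chat model, which then labels new text. The criteria, few-shot examples, model name, and the use of the OpenAI Python client are assumptions made for illustration, not details drawn from the paper.

# Sketch of criteria-guided few-shot toxicity classification with a chat LLM.
# The criteria, examples, and model name are hypothetical placeholders.
from openai import OpenAI  # assumes the openai>=1.0 Python client is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical explicit criteria, standing in for the paper's structured guidelines.
CRITERIA = """Label the text as TOXIC or NON-TOXIC using these criteria:
1. Targets a person or group with insults, slurs, or threats -> TOXIC.
2. Profanity or dialectal/cultural variation without a target -> NON-TOXIC.
3. If ambiguous, judge by the most plausible reading in context.
Answer with a single word: TOXIC or NON-TOXIC."""

# Hypothetical few-shot (text, label) examples for in-context learning.
FEW_SHOT = [
    ("You people are a disease and should disappear.", "TOXIC"),
    ("This traffic is driving me crazy, ugh.", "NON-TOXIC"),
]

def classify(text: str, model: str = "gpt-4o-mini") -> str:
    """Return the model's TOXIC/NON-TOXIC label for `text` under the criteria."""
    messages = [{"role": "system", "content": CRITERIA}]
    for example, label in FEW_SHOT:
        messages.append({"role": "user", "content": f"Text: {example}"})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": f"Text: {text}"})
    response = client.chat.completions.create(
        model=model, messages=messages, temperature=0
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    print(classify("Nobody asked for your worthless opinion."))

The same message structure can be reused for auto-annotation: batch-label unlabeled examples and use the resulting criteria-consistent labels to fine-tune a smaller model, as the abstract describes.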
Paper Type: long
Research Area: Computational Social Science and Cultural Analytics
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: English