Avoiding Dilution but Preserving Context: Fine-Tuning an Incivility Classifier with Sentence-Level Auxiliary Signals on Data from War-Related Subreddits
Keywords: Fine-tuning classifiers, Incivility, LLM efficiency, Reddit
Abstract: This paper addresses shortcomings in the existing literature by detecting content-focused incivility, understood as speech that targets people, while comparing two classification schemes across three Transformer-based models: a pure-comment classification method vis-à-vis a joint sentence-comment classification method. The former trains a supervised objective on comment labels alone; the latter trains a joint supervised objective on both comment and sentence labels. The comparison is carried out on a relatively large, human-annotated, stratified dataset ($N=7,941$) collected from a likely multicultural online setting, namely two war-related subreddits (r/IsraelPalestine and r/UkrainianConflict). The findings show small performance gains when the training objective is supervised on both comment and sentence labels, and this gain is consistent across seeds and architectures (BERT, RoBERTa, and BERTweet).
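The joint sentence-comment objective described in the abstract can be sketched as a weighted sum of a comment-level loss and the mean of the sentence-level losses. This is a minimal, framework-free illustration: the `cross_entropy` helper, the mixing weight `alpha`, and the equal-weight averaging over sentences are assumptions for exposition, not details taken from the paper.

```python
import math

def cross_entropy(logits, label):
    # Softmax cross-entropy for one example (hypothetical helper),
    # computed with the log-sum-exp trick for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[label]

def joint_loss(comment_logits, comment_label,
               sentence_logits, sentence_labels, alpha=0.5):
    # Joint supervised objective: comment-level loss mixed with the
    # mean sentence-level loss. `alpha` is a hypothetical weight;
    # alpha=1.0 recovers the pure-comment baseline objective.
    l_comment = cross_entropy(comment_logits, comment_label)
    l_sentence = sum(
        cross_entropy(lg, lb)
        for lg, lb in zip(sentence_logits, sentence_labels)
    ) / len(sentence_labels)
    return alpha * l_comment + (1.0 - alpha) * l_sentence
```

With `alpha=1.0` the sentence term vanishes, so the same code expresses both of the compared training schemes.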
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: hate-speech detection, NLP tools for social analysis, quantitative analyses of news and/or social media
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 10276