Beyond Consensus: Use of Demographics for Datasets that Reflect Annotator Disagreement

Published: 26 Jul 2025, Last Modified: 06 Oct 2025 · NLPOR 2025 · CC BY 4.0
Keywords: disagreement, demographics, attention, sexism detection, irony detection
TL;DR: Subjective Classification through Demographic-Aware Learning from Annotator Disagreement
Submission Type: Archival
Abstract: Annotator disagreement in subjective NLP tasks often reflects meaningful differences in perspective tied to demographic identity. To model this variation, we propose the Annotation-Wise Attention Network (AWAN), a demographic-aware model that learns to predict individual annotations using annotator meta-information. AWAN conditions token-level attention on demographic bundles to generate perspective-specific representations. We evaluate AWAN on two datasets, EXIST (sexism detection) and EPICorpus (irony detection), showing consistent improvements over single- and multi-task baselines. We further explore how different combinations of demographic features affect performance, finding that simple, well-represented features (in EPICorpus: employment, nationality) yield strong results, while imbalanced features (in EXIST: study level) can reduce model effectiveness. Our results show the promise of incorporating demographic context to model subjective variation in annotation.
Submission Number: 24