Submission Type: Short Paper
Keywords: Theory of Mind, Sexism Detection, Affective Theory of Mind, Cognitive Theory of Mind, Annotation
TL;DR: Annotators agree that content is sexist but disagree about why and what kind, and this gap maps onto the cognitive–affective Theory of Mind distinction
Abstract: Social perception, i.e., how people form impressions from language, forms the basis for subjective NLP tasks like sexism detection. Yet the mechanisms driving annotator disagreement remain unexplained. We propose that Theory of Mind (ToM) provides this mechanism: annotators perform cognitive ToM (recognising norm-relevant content and inferring speaker intent) and affective ToM (estimating target impact), producing disagreement when these inferences diverge. Analysing the EXIST 2025 dataset (7,958 tweets, 4,044 memes), we find a detection-interpretation dissociation: annotators who agree that content is sexist disagree more about speaker intent and sexism type than about detection itself. This gap replicates across both modalities, tweets and memes. Perceiver gender does not affect detection for text but does for memes (where even detection is ToM-demanding), shows negligible effects on intent attribution, and selectively shapes harm-related categorisation (misogyny in text, objectification in memes) while leaving abstract categories unaffected. Gender structuring thus increases with affective ToM demand. These patterns replicate established social psychology findings at computational scale and demonstrate that social perception operates through dissociable ToM processes that current NLP systems collapse.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 13