Abstract: As our understanding of autism and ableism continues to increase, so does our understanding of ableist language towards autistic people. Such language poses a significant challenge in NLP research due to its subtle and context-dependent nature. Yet, detecting anti-autistic ableist language remains underexplored, with existing NLP tools often failing to capture its nuanced expressions. We present AUTALIC, the first dataset dedicated to the detection of anti-autistic ableist language in context, addressing a significant gap in the field. AUTALIC comprises 2,400 autism-related sentences collected from Reddit, accompanied by surrounding context, and annotated by trained experts with backgrounds in neurodiversity. Our comprehensive evaluation reveals that current language models, including state-of-the-art LLMs, struggle both to reliably identify anti-autistic ableism and to align with human judgments, underscoring their limitations in this domain. We publicly release AUTALIC along with the individual annotations. This dataset serves as a crucial step towards developing more inclusive and context-aware NLP systems that better reflect diverse perspectives.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: hate-speech detection, ableism detection, autism, language/cultural bias analysis, data ethics, model bias/fairness evaluation, human-centered NLP, participatory/community-based NLP, corpus creation, NLP datasets, evaluation methodologies
Contribution Types: Data resources
Languages Studied: English
Submission Number: 4773