Accelerating NLP for Health Equity: Fine-Tuning Binary and Multi-Class Stigma Classifiers in 48 Hours
Keywords: Natural Language Processing, Stigma Detection, Mental Health, Clinical NLP, Transformer Models, Social Bias in Language, Health Equity
Abstract: Stigmatizing language in mental health discourse contributes to social exclusion, reduced help-seeking, and poorer health outcomes. Yet detecting such language remains challenging because it is subtle, context-dependent, and often spans overlapping categories. To address this, prior work introduced an expert-annotated corpus of 4,141 text snippets and established strong transformer-based baselines for stigma classification.
Building on this foundation, we make three key advances:
(1) we fine-tune multiple models and apply explainable AI (XAI) methods to enable transparent interpretation of model behavior;
(2) we adopt a rigorous evaluation framework with stratified cross-validation and detailed performance metrics, including macro F1 and bootstrap-based confidence intervals; and
(3) we release a fully reproducible notebook designed for replication by both human researchers and AI agents. Using our agent-based system, we completed both binary (2-class) and multi-class (8-class) stigma classification tasks in under 48 hours, with XAI applied throughout. These contributions go beyond benchmark replication, advancing toward interpretable, trustworthy, and deployable stigma detection systems for clinical, public health, and digital moderation settings. By demonstrating the effectiveness of large language models in identifying nuanced forms of stigma, this work lays the foundation for socially responsible NLP systems that support bias-aware communication across health-related domains. To support community adoption and reproducibility, we have released our full pipeline at:
\href{https://anonymous.4open.science/r/end-stigma/}{https://anonymous.4open.science/r/end-stigma/}.
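The evaluation framework described in point (2) — stratified cross-validation with macro F1 and bootstrap-based confidence intervals — can be sketched as follows. This is a minimal illustration, not the released pipeline: it substitutes synthetic data and a logistic-regression classifier (both hypothetical stand-ins) for the paper's stigma corpus and fine-tuned transformers, and assumes the bootstrap is taken over pooled out-of-fold predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

# Hypothetical stand-in data; the actual corpus is 4,141 annotated snippets.
X, y = make_classification(n_samples=400, n_classes=2, weights=[0.7, 0.3],
                           n_informative=5, random_state=0)

# Stratified k-fold keeps the class balance in every train/test split.
fold_f1, all_true, all_pred = [], [], []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True,
                                           random_state=0).split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    fold_f1.append(f1_score(y[test_idx], pred, average="macro"))
    all_true.extend(y[test_idx])
    all_pred.extend(pred)

all_true, all_pred = np.array(all_true), np.array(all_pred)

# Percentile bootstrap: resample pooled predictions with replacement and
# recompute macro F1 to estimate a 95% confidence interval.
rng = np.random.default_rng(0)
boot = []
for _ in range(1000):
    idx = rng.integers(0, len(all_true), len(all_true))
    boot.append(f1_score(all_true[idx], all_pred[idx], average="macro"))
ci_lo, ci_hi = np.percentile(boot, [2.5, 97.5])

print(f"macro F1 = {np.mean(fold_f1):.3f}, 95% CI [{ci_lo:.3f}, {ci_hi:.3f}]")
```

The same scaffold extends to the 8-class setting unchanged, since `average="macro"` weights every class equally regardless of class count.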
Submission Number: 237