Abstract: The increasing prevalence of online sexual harassment reports highlights the need for effective automated tools to analyze these personal accounts. In this study, we evaluate a range of models, from neural networks to small and large language models, on the SafeCity dataset to classify incidents of sexual harassment, including commenting, ogling, and groping. We find that different model architectures perform best for different types of harassment, underscoring the need for targeted model selection: CNN-RNN models are the most effective at detecting "ogling", BERT-FT excels at identifying "commenting", and the fine-tuned DeepSeek7B LLM (DeepSeek7B-FT) performs best on "groping"-related cases. To integrate these complementary strengths, we introduce AD-ASH, an adaptive ensemble framework that automatically selects the highest-performing model for each category of harassment. By dynamically matching models to task types, AD-ASH achieves state-of-the-art accuracy ranging from 84% to 88% across classes, demonstrating improved performance over single-model baselines and offering a robust solution for the nuanced task of harassment classification. Our findings highlight the importance of model specialization and ensemble learning in sensitive, real-world applications. Supplementary analyses, including word clustering and LIME-based interpretation of model predictions, are provided in the appendix to offer further insight into the language cues that drive classification.
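The abstract describes AD-ASH as routing each harassment category to the model that performs best on it. The following is a minimal sketch of that per-category selection idea; the validation scores, stub predictors, and function names here are hypothetical placeholders, not the paper's actual models or results.

```python
# Hypothetical sketch of adaptive per-category model selection (AD-ASH-style routing).
# All scores and predictors below are illustrative stand-ins, not the paper's artifacts.
from typing import Callable, Dict

# Assumed per-category validation scores for three candidate models (made-up numbers).
validation_scores: Dict[str, Dict[str, float]] = {
    "commenting": {"CNN-RNN": 0.79, "BERT-FT": 0.84, "DeepSeek7B-FT": 0.82},
    "ogling":     {"CNN-RNN": 0.86, "BERT-FT": 0.83, "DeepSeek7B-FT": 0.84},
    "groping":    {"CNN-RNN": 0.82, "BERT-FT": 0.85, "DeepSeek7B-FT": 0.88},
}

# Stub binary classifiers; in practice these would wrap the trained models.
def make_stub(name: str) -> Callable[[str], int]:
    return lambda text: int(name.lower() in text.lower())  # placeholder logic only

models: Dict[str, Callable[[str], int]] = {
    name: make_stub(name) for name in ("CNN-RNN", "BERT-FT", "DeepSeek7B-FT")
}

# Adaptive selection: for each category, keep the model with the best validation score.
selected = {
    category: max(scores, key=scores.get)
    for category, scores in validation_scores.items()
}

def predict(report_text: str) -> Dict[str, int]:
    """Multi-label prediction: route each category to its selected model."""
    return {category: models[name](report_text) for category, name in selected.items()}

if __name__ == "__main__":
    print(selected)                    # which model handles which category
    print(predict("example report"))   # per-category binary labels
```

Under this reading, the "adaptive" step reduces to a per-class argmax over validation performance, after which inference simply dispatches each label to its chosen specialist model.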
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: hate-speech detection, NLP tools for social analysis, participatory/community-based NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 5516