EASY: Enhanced Analysis Approach for Implicit Hate Speech Yield – Bridging Human Insight and Algorithmic Precision in Social Media Discourse

Abstract: The spread of abusive speech on social media, shaped by gender, religion, and context, is a persistent challenge for hate speech detection. Previous research has focused on model-centric approaches and has often overlooked the differences in how models and humans interpret offensive data. We propose a different approach that takes this discrepancy between model predictions and human judgments into account to detect hate speech more accurately. We advocate excluding from the training dataset those sentences that models easily classify as hate speech but that humans find challenging to judge. Our experiments on various datasets confirm that considering human agreement levels during data preprocessing improves model generalization. This result underlines the unique challenges of the hate speech domain and emphasizes the importance of datasets that reflect both model interpretations and human consensus. The analysis highlights the significance of a balanced approach to dataset preparation for enhancing the effectiveness and reliability of hate speech detection.
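
The filtering idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Example` fields, the function name `filter_training_set`, and both thresholds are hypothetical and chosen only for illustration. It assumes each sentence comes with a baseline model's probability of "hate speech" and the fraction of annotators who agreed on its label.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    text: str
    model_hate_prob: float   # assumed: baseline model's probability that the text is hate speech
    human_agreement: float   # assumed: fraction of annotators who agreed on the label (0 to 1)


def filter_training_set(
    examples: List[Example],
    model_easy: float = 0.9,   # hypothetical threshold: "easy for the model"
    human_hard: float = 0.67,  # hypothetical threshold: "challenging for humans"
) -> List[Example]:
    """Drop sentences the model confidently flags as hate speech
    but on which human annotators show low agreement."""
    kept = []
    for ex in examples:
        easy_for_model = ex.model_hate_prob >= model_easy
        hard_for_humans = ex.human_agreement < human_hard
        if easy_for_model and hard_for_humans:
            continue  # excluded from training
        kept.append(ex)
    return kept


# Toy data, invented purely to exercise the function.
sample = [
    Example("a", model_hate_prob=0.97, human_agreement=0.40),  # model-easy, human-hard
    Example("b", model_hate_prob=0.95, human_agreement=1.00),  # model-easy, human-easy
    Example("c", model_hate_prob=0.55, human_agreement=0.40),  # model-unsure, human-hard
    Example("d", model_hate_prob=0.10, human_agreement=1.00),  # clearly not hate speech
]
kept_texts = [ex.text for ex in filter_training_set(sample)]
```

With these assumed thresholds, only the sentence that is easy for the model and hard for humans is removed; in practice the thresholds and the agreement measure would depend on the dataset and annotation scheme.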
