- Keywords: explainability, hate speech, xai, text classification, interpretability
- TL;DR: Combines XAI with text classification: term-level global feature importance is used to penalize model predictions when a term's local feature importance diverges from its global importance.
- Abstract: As social distancing, self-quarantines, and travel restrictions have shifted much of the pandemic conversation to social media, so has the spread of hate speech. While recent machine learning solutions can automatically identify hate and offensive speech on Twitter, their interpretability remains an issue. We propose a novel use of learned feature importance that improves upon the performance of prior state-of-the-art text classification techniques while producing more easily interpretable decisions. We also discuss the technical and practical challenges that remain for this task.
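The penalty described in the TL;DR can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a bag-of-words logistic model, uses weight-times-input as the local term attribution, and takes the global importance vector as given; all function names (`local_importance`, `penalized_loss`) and the hyperparameter `lam` are hypothetical.

```python
import numpy as np

def local_importance(w, x):
    # Local term attribution for a linear model: each term's
    # contribution to the logit is its weight times its input value.
    return w * x

def penalized_loss(w, x, y, global_imp, lam=0.1):
    # Standard binary cross-entropy for a logistic model.
    z = float(np.dot(w, x))
    p = 1.0 / (1.0 + np.exp(-z))
    eps = 1e-12
    ce = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    # Regularizer: squared gap between each term's local attribution
    # and its global importance, counted only for terms present in
    # the document (x != 0).
    gap = (local_importance(w, x) - global_imp) * (x != 0)
    return ce + lam * float(np.sum(gap ** 2))

# Toy example: two terms; the document contains only the first term.
w = np.array([1.0, -1.0])
x = np.array([1.0, 0.0])
aligned = penalized_loss(w, x, y=1, global_imp=np.array([1.0, 0.0]))
diverged = penalized_loss(w, x, y=1, global_imp=np.array([0.0, 0.0]))
```

When the local attribution matches the global importance the penalty vanishes and only the classification loss remains; as they diverge, the total loss grows, nudging the model toward globally consistent (and hence more interpretable) term usage.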