Reproducibility Report: Contextualizing Hate Speech Classifiers with Post-hoc Explanation

Jan 31, 2021 (edited Apr 01, 2021) | ML Reproducibility Challenge 2020 Blind Submission | Readers: Everyone
  • Keywords: Hate speech, SOC, Post-hoc explanation, regularization
  • Abstract: Scope of Reproducibility: For the GHC dataset, the most important difference between BERT+WR and BERT+SOC is the increase in recall. For the Stormfront dataset, there are similar improvements on both in-domain data and the NYT dataset. To verify these claims further, we also ran the same experiment on a new dataset. Methodology: We re-implemented the authors’ code and attempted to verify the claims made in the original paper. We ran our experiments on an NVIDIA Tesla GPU, which was less efficient than the original authors’ hardware (NVIDIA GeForce RTX 2080 Ti). Results: We were able to reproduce the claims listed as points 2 and 3 in Section 2 (Scope of Reproducibility). However, our results do not agree with the authors’ for the experiments listed as points 1 and 4 in the same section. What was easy: The original authors provide code for most of the experiments presented in the paper. The code was easy to run and allowed us to verify the correctness of our re-implementation. The explanations in the code made our work considerably easier. What was difficult: Training the models was very time-consuming; each training run took hours, and the hardware used by the original authors is not readily available everywhere. Communication with original authors: We were in contact with the second author via e-mail; he was responsive and shared details that were not explicitly mentioned in the paper.
  • Paper Url:
  • Supplementary Material: zip