[Re] Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification

Published: 02 Aug 2023, Last Modified: 02 Aug 2023, MLRC 2022
Keywords: Machine Learning, Bias, BERT, Word2Vec, DWMW17, FDCL18, Blodgett et al., Transformer, TensorFlow
TL;DR: We validate the paper's claim that hate speech, offensive speech, and abusive speech classification rates drop considerably when AAE tweets are censored for "swear" words.
Abstract: Reproducibility Summary

Scope of Reproducibility — We aim to reproduce a result from the paper "Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification" [1]. Our study is restricted specifically to the claim that the use of swear words affects hate speech classification of AAE text. We were able to broadly validate the claim; however, the magnitude of the effect depended on the word replacement strategy, which was somewhat ambiguous in the original paper.

Methodology — The authors did not publish source code, so we reproduce the experiments by following the methodology described in the paper. We train BERT models from TensorFlow Hub [2] to classify hate speech using the DWMW17 [3] and FDCL18 [4] Twitter datasets. We then compile a dictionary of swear words and replacement words with comparable meaning, and use it to create "censored" versions of samples in Blodgett et al.'s [5] AAE Twitter dataset. Using the BERT models, we evaluate the hate speech classification of both the original and the censored data. Our experiments are conducted on Chameleon [6], an open-access research testbed, and we make available both our code and instructions for reproducing the result on the shared facility.

Results — Our results are consistent with the claim that censored text (without swear words) is less often classified as hate speech, offensive, or abusive than the same text with swear words. However, we find the classification is very sensitive to the word replacement dictionary used.

What was easy — The authors used three well-known datasets that were easy to obtain. They also used well-known, widely available models, BERT and Word2Vec.

What was difficult — Some details of model training were not fully specified. We were also unable to exactly re-create a comparable dictionary of swear words and replacement terms of similar meaning by following their methodology.

Communication with original authors — We reached out to the authors but were unable to communicate with them before the submission of this report.
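The censoring step described in the methodology — swapping each swear word in a tweet for a dictionary replacement of comparable meaning — can be sketched as below. The dictionary entries here are hypothetical placeholders; the report's actual word lists are not reproduced, and the whole-word, case-insensitive matching is an assumption about how such a substitution would reasonably be implemented.

```python
import re

# Hypothetical replacement dictionary: mild stand-ins for the report's
# actual swear-word -> comparable-meaning mapping, which is not shown here.
REPLACEMENTS = {"damn": "darn", "hell": "heck"}

def censor(tweet: str) -> str:
    """Return the tweet with each dictionary word replaced by its
    counterpart, matching whole words case-insensitively."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: REPLACEMENTS[m.group(0).lower()], tweet)

print(censor("That was a damn good game"))  # → "That was a darn good game"
```

Applying such a function to every sample in the AAE dataset yields the "censored" copy whose classification rates are then compared against the originals.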
Paper Url: https://dl.acm.org/doi/pdf/10.1145/3531146.3533144
Paper Venue: ACM FAccT 2022
Supplementary Material: zip
Confirmation: The report follows the ReScience LaTeX style guide, as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration). The report contains the Reproducibility Summary on the first page.
Latex: zip
Journal: ReScience Volume 9 Issue 2 Article 35
Doi: https://www.doi.org/10.5281/zenodo.8173739
Code: https://archive.softwareheritage.org/swh:1:dir:e4878650f9ab13d7820684abaca84448cc13b9ff