The Duality of Hope: A Critical Examination of Controversial Annotations in HopeEDI

Published: 19 Mar 2024, Last Modified: 30 May 2024 · Tiny Papers @ ICLR 2024 · CC BY 4.0
Keywords: Hope Speech, NLP, Social Good, Social Computing, HopeEDI dataset, Annotation Errors, Reannotation
TL;DR: The study flags annotation problems in the HopeEDI dataset, including the mislabeling of negative content as hopeful. It warns against training biased models on the data, introduces a 'Potentially Controversial' class, and aims to improve balance and ethical considerations in real-world use.
Abstract: This study investigates the HopeEDI hope speech dataset, revealing a significant number of potentially controversial annotations, notably tied to the 'All Lives Matter' movement. We also identify instances where hateful, toxic, or implicitly controversial content was wrongly marked as hopeful. The implications for deploying models trained on this dataset are profound, risking biases and stigmatization. We advocate for a thorough examination of the HopeEDI dataset and caution against deploying biased models. We reannotate the text labelled as hope speech and as non-English, introducing a new class, 'Potentially Controversial', and provide reasons for each label change. This updated dataset aims to promote balance and mitigate ethical concerns in real-world applications.
Supplementary Material: zip
Submission Number: 213