Readers: Everyone (since 19 Mar 2024). License: CC BY 4.0
This study examines the HopeEDI hope speech dataset and finds a substantial number of potentially controversial annotations, notably ones tied to the 'All Lives Matter' movement. We also identify instances where hateful, toxic, or implicitly controversial content was wrongly labelled as hopeful. Deploying models trained on this dataset therefore risks propagating bias and stigmatization. We advocate a thorough re-examination of the HopeEDI dataset and caution against models that inherit its biases. We reannotate the texts labelled as hope speech and as non-English, introduce a new class, 'Potentially Controversial', and record the reason for each label change. The updated dataset aims to promote balance and mitigate ethical concerns in real-world applications.