A Large-Scale Synthetic Dataset for MULTImodaL hATE ("MULTILATE") with Text, Images, and Adversarial Samples
Abstract: One of the main problems our society struggles with today is fighting online hate. As social media is flooded with multimodal hate speech content, we require scalable multimodal hate speech detection systems. We therefore present MULTILATE, a MULTImodaL hATE dataset of 2.6 million samples for cross-modal hate speech classification, with additional explanations through 3W Question Answering. Key features of the dataset include (1) textual utterances, (2) synthetic images generated by Stable Diffusion, (3) pixel-level heat maps for explaining the image-text interaction, (4) question-answer triples addressing the "who", "what", and "why" components of each statement, and (5) adversarial examples for both text and images. MULTILATE is intended for building and evaluating interpretable multimodal hate speech classifiers.
Paper Type: long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Contribution Types: Publicly available software and/or pre-trained models, Data resources
Languages Studied: English