Towards Identification of Microaggressions in Real-Life and Scripted Conversations Using Context-Aware Machine Learning Techniques

Published: 01 Feb 2023, Last Modified: 13 Feb 2023
ICLR 2023 Conference Withdrawn Submission
Readers: Everyone
Keywords: Microaggression, social conversations, Natural Language Processing, Contextual Model, RoBERTa, Support Vector Machines.
TL;DR: Classification of microaggressions from text data (drawn from real-life conversations, social media platforms, and classic TV shows) using SVMs and RoBERTa, considering the impact of varying amounts of context on overall model performance.
Abstract: The advent and rapid proliferation of social media have brought with it an exponential growth in hate speech and overtly offensive language. One of the most subtle yet pervasive subcategories of hate speech is the Microaggression (MA): an often unintentional, hostile, derogatory, or negative prejudicial slight or insult toward a group, particularly culturally marginalized communities. A growing body of research links long-term MA exposure to serious health problems. The scarcity of studies leveraging AI techniques to identify MAs in text and in spoken conversations, coupled with the lack of investigative analysis of the impact of context on the performance of algorithms used for this task, makes this a relevant topic for the AI community. In this paper, we explore how effectively Machine Learning models detect MAs in spoken human communication across various contexts (e.g., workplace, social media, conversations). We further examine the extent to which art may imitate life by comparing the ability of these models, trained on real-life conversations, to infer MAs occurring in scripted television shows. We apply a Support Vector Machine (SVM) classifier using N-grams, and a contextual modeling representation using the Robustly Optimized BERT Pretraining Approach (RoBERTa) model, whose performance is evaluated with respect to pretraining size and ability to accurately detect hate speech, with comparative results from the BERT-base-uncased and HateBERT models respectively. Overall, the results show that contextual transformer models outperform simpler context-free approaches to classifying MAs collected from surveys and online blogs. We also found that these models trained on real-life conversations could infer MAs in scripted TV settings, though at reduced and roughly equal rates, suggesting there may be a disconnect between the contexts of MAs found in art and those from real life.
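To make the context-free baseline concrete, the following is a minimal, hypothetical sketch of an SVM classifier over word N-grams, as described in the abstract. The toy sentences and labels below are invented for illustration only; the paper's actual data comes from surveys, online blogs, and TV-show transcripts, and its exact feature configuration is not specified here.

```python
# Hypothetical sketch: linear SVM over TF-IDF word n-grams (context-free baseline).
# All example texts and labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = [
    "You speak English so well for someone like you.",
    "Where are you really from, though?",
    "Thanks for sending the meeting notes.",
    "Let's grab lunch after the standup.",
]
train_labels = ["microaggression", "microaggression", "neutral", "neutral"]

# Unigrams + bigrams stand in for the "N-grams" feature set; unlike the
# transformer models, the SVM sees no surrounding conversational context.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LinearSVC(),
)
clf.fit(train_texts, train_labels)

print(clf.predict(["Where are you really from?"])[0])
```

The contextual models (RoBERTa, BERT-base-uncased, HateBERT) would instead be fine-tuned on the same labeled utterances, optionally prepended with varying amounts of preceding dialogue, which is the contrast the paper evaluates.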
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)