We trust Language Models to annotate online sexism in political discourse, but are they any good?
Keywords: online sexism detection, political discourse, language model annotation reliability, model confidence estimation
TL;DR: We test how far prompt-engineered LLMs can be trusted to annotate online sexism in political discourse, and provide insights to guide researchers in checking the reliability of LLMs on such sensitive tasks.
Abstract: Large Language Models (LLMs) have recently gained popularity for text analysis in the social sciences due to their versatility and context-aware capabilities. Prompt-based learning in particular has broadened their use in classification tasks and in the annotation of sensitive topics such as sexism. While prior studies have used LLMs to capture online sexism, little is known about their capabilities in less-studied settings such as political discourse, or about how well the models distinguish partisan bias from gender bias. Our main contributions are: i) comparing different LLMs, through prompt engineering, on their ability to detect sexism in political discourse; and ii) proposing a new algorithm for capturing the confidence of LLM predictions in classification tasks. Experimental results clearly indicate trigger events that provoke online sexism, yet show no clear advantage of LLMs in predicting sexism. Surprisingly, the results do not improve with more instructive prompts, but our algorithm proves effective in capturing each model's confidence in its predicted labels.
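The submission does not spell out the confidence algorithm, so the following is only a minimal illustrative sketch of one common way to attach a confidence score to an LLM classification: repeated sampling and label agreement (self-consistency). The function `label_confidence` and the `classify_fn` callable are hypothetical names for illustration, not the authors' method.

```python
from collections import Counter


def label_confidence(classify_fn, text, n_samples=5):
    """Estimate confidence in a predicted label by repeated sampling.

    classify_fn : callable that queries the LLM once and returns a label
                  string (e.g., "sexist" / "not sexist"); hypothetical here.
    Returns the majority label and the fraction of samples agreeing with it.
    """
    labels = [classify_fn(text) for _ in range(n_samples)]
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    return majority_label, majority_count / n_samples


if __name__ == "__main__":
    import random

    # Toy stand-in for an LLM call, for demonstration only.
    def mock_classify(text):
        return random.choice(["sexist", "not sexist", "not sexist"])

    label, conf = label_confidence(mock_classify, "example tweet", n_samples=10)
    print(label, conf)
```

Whether the proposed algorithm uses sampling agreement, token-level probabilities, or something else entirely is not stated in this abstract; the sketch only shows the general shape of attaching a per-label confidence to an LLM classifier.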
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10219