Abstract: Identifying linguistic bias in text requires understanding both what is said and what is meant. This means going beyond what is asserted directly to determine what is presupposed. Large language models (LLMs) offer a potential automatic approach for identifying presupposed content, but it is unknown how well LLM judgments correspond to human judgments. Further, LLMs may exhibit their own biases in determining what is presupposed. To study this empirically, we prompt multiple LLMs to make presupposition judgments for texts from varying domains drawn from three human-labeled datasets. We calculate the agreement between LLMs and human raters, and find that variations in text domain, verb factivity, context window size, and the type of presupposition trigger result in changes to human-model agreement scores. We also observe discrepancies in agreement scores that indicate potential biases from LLMs. The gender of the subject appears to affect agreement, as female pronouns are associated with lower agreement than male pronouns. Across multiple dimensions, differences in political ideology appear to correspond to differences in agreement.
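As a rough illustration of the agreement computation the abstract describes (not the authors' actual pipeline), the sketch below compares binary presupposition judgments from a stand-in "model" against human labels using Cohen's kappa. The `query_model` heuristic, the toy factive-verb list, and the example items are assumptions for illustration only; in the paper, the judgments come from prompted LLMs.

```python
# Minimal sketch (not the paper's pipeline): compare model presupposition
# judgments against human labels with a chance-corrected agreement score.
from sklearn.metrics import cohen_kappa_score

# Toy list of factive-like matrix verbs, used only to make this runnable.
FACTIVE_VERBS = {"realized", "knew", "noticed", "regretted"}

def query_model(text: str, hypothesis: str) -> int:
    """Toy stand-in for an LLM call: judge the candidate content as
    presupposed (1) only when the text contains a factive-looking verb."""
    return int(any(verb in text.lower() for verb in FACTIVE_VERBS))

# Illustrative items: (text, candidate presupposition, human label)
items = [
    ("She realized the report was late.", "The report was late.", 1),
    ("He claimed the report was late.",   "The report was late.", 0),
]

human_labels = [label for _, _, label in items]
model_labels = [query_model(text, hyp) for text, hyp, _ in items]

# Cohen's kappa corrects raw percent agreement for agreement by chance.
print(cohen_kappa_score(human_labels, model_labels))
```

Slicing such agreement scores by text domain, trigger type, or subject pronoun is one way the disparities reported in the abstract could be surfaced.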
Paper Type: long
Research Area: Ethics, Bias, and Fairness
Contribution Types: Model analysis & interpretability
Languages Studied: English