Understanding and Tackling Label Errors in Individual-Level Natural Language Understanding

ACL ARR 2026 January Submission6651 Authors

05 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | CC BY 4.0

Keywords: stance detection, sentiment analysis, label errors

Abstract: Natural language understanding (NLU) aims to enable machines to understand human language. Some NLU tasks depend closely on individual subjective perspectives and are therefore termed individual-level NLU. These tasks have often been simplified to text-level NLU tasks, ignoring individual factors such as demographic information or a person's worldview over a period of time. This simplification can lead to a large number of label errors and makes inference difficult and hard to explain. To address these limitations, we propose a new NLU annotation guideline based on individual-level factors. We find that the label error rate in the sampled dataset instances exceeds 20%, with the highest reaching 48.7%. We further conduct experiments with large language models on the re-annotated datasets and find that they perform well after re-annotation. We also verify the effectiveness of individual factors through ablation studies. Our re-annotated dataset can be found at https://anonymous.4open.science/r/Individual-NLU-A0DE

Paper Type: Long

Research Area: Sentiment Analysis, Stylistic Analysis, and Argument Mining

Research Area Keywords: Sentiment Analysis, Stylistic Analysis, and Argument Mining; Computational Social Science, Cultural Analytics, and NLP for Social Good

Languages Studied: English

Submission Number: 6651