Keywords: paper contributions, re-annotation, LLM as annotator, author reliability, ARR process
Abstract: With the rapid growth of scientific publications, researchers struggle to efficiently assess the relevance of numerous papers. Identifying the types of contributions an article makes can help readers quickly grasp its significance. The ACL Rolling Review (ARR) introduced a typology requiring authors to specify their contributions to improve review quality and fairness. However, the current typology lacks clear definitions and guidance, leading to inconsistent labeling and raising concerns about its reliability.
Our extensive re-annotation campaign reveals substantial disagreement between authors and domain experts. Evaluating LLMs on paper contribution identification shows that they perform competitively with authors when judged against annotator consensus, highlighting a potential path toward more reliable annotation.
Paper Type: Short
Research Area: Resources and Evaluation
Research Area Keywords: automatic creation and evaluation of language resources, NLP datasets
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 5567