Abstract: Citations are essential building blocks in scientific knowledge production. Citation content analysis using NLP methods has been proposed to benefit tasks such as scientific paper summarization and research impact assessment. In this paper, we propose a new task, citation subject matter extraction, and augment an existing citation sentiment corpus with citation context and subject matter annotations to enable a finer-grained study of citation content. We propose a BERT-based multi-task model that jointly addresses the three classification tasks (i.e., context, subject matter, and sentiment) by enabling knowledge transfer across tasks. Our experimental results show that the joint model outperforms single-task models. We also obtain state-of-the-art results for the citation sentiment classification task and demonstrate that isolating the subject matter significantly improves performance on this task. Our error analysis suggests that improving annotation consistency and using external knowledge sources could further improve performance. We will make our code, data, and annotation guidelines publicly available upon acceptance.
Paper Type: long