Second order probabilistic models for within-document novelty detection in academic articlesOpen Website

2014 (modified: 12 Nov 2022)SIGIR 2014Readers: Everyone
Abstract: It is becoming increasingly difficult to stay aware of the state-of-the-art in any research field due to the exponential increase in the number of academic publications. This problem effects authors and reviewers of submissions to academic journals and conferences, who must be able to identify which portions of an article are novel and which are not. Therefore, having a process to automatically judge the flow of novelty though a document would assist academics in their quest for truth. In this article, we propose the concept of Within Document Novelty Location, a method of identifying locations of novelty and non-novelty within a given document. In this preliminary investigation, we examine if a second order statistical model has any benefit, in terms of accuracy and confidence, over a simpler first order model. Experiments on 928 text sequences taken from three academic articles showed that the second order model provided a significant increase in novelty location accuracy for two of the three documents. There was no significant difference in accuracy for the remaining document, which is likely to be due to the absence of context analysis.
0 Replies

Loading