Word document density and relevance scoring

SIGIR 2000 (modified: 12 Nov 2022)
Abstract: Previous work on word distribution in documents has shown that word repetitiveness is an important indicator of a word's content-bearing characteristics. In this paper we propose a simple method that uses a measure of the tendency of words to repeat within a document to separate words with similar document frequencies but different topic-discriminating characteristics. We describe the application of the new measure to query-document relevance scoring. Experiments on the TREC Ad Hoc and Spoken Document Retrieval tasks [7] show useful performance improvements.
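
The abstract does not define the repetition measure, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes one simple proxy: the mean number of occurrences of a word in the documents that contain it (collection frequency divided by document frequency). The function name `repetition_scores`, the whitespace tokenization, and the toy documents are all hypothetical.

```python
from collections import Counter


def repetition_scores(docs):
    """Map each word to (document frequency, mean occurrences per containing document).

    The second value is an assumed proxy for within-document repetition,
    not the measure defined in the paper.
    """
    df = Counter()  # number of documents containing the word
    cf = Counter()  # total occurrences across the collection
    for doc in docs:
        tokens = doc.lower().split()  # naive whitespace tokenization (assumption)
        cf.update(tokens)
        df.update(set(tokens))
    return {word: (df[word], cf[word] / df[word]) for word in df}


# Toy collection: "speech" and "however" share a document frequency of 2,
# but only "speech" repeats within the documents that contain it.
docs = [
    "speech retrieval ranks speech segments using speech transcripts however noisy",
    "speech errors hurt speech indexing however speech recognition improves",
    "the weather was mild",
]

scores = repetition_scores(docs)
print(scores["speech"])
print(scores["however"])
```

Under this proxy the two words are indistinguishable by document frequency alone, while the repetition value separates the topical word from the function-like one. How such a value would enter a relevance score is left open here, since the abstract gives no formula.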