Abstract: Document expansion has been shown to improve the effectiveness of information retrieval systems by augmenting documents' term probability estimates with those of similar documents, producing higher-quality document representations. We propose a method to further improve document models by using external collections as part of the document expansion process. Our approach is based on relevance modeling, a popular form of pseudo-relevance feedback; however, where relevance modeling is concerned with query expansion, we are concerned with document expansion. Our experiments demonstrate that the proposed model improves ad hoc document retrieval effectiveness on a variety of corpus types, with a particular benefit on more heterogeneous collections of documents.
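As an illustration of the general idea only, the following minimal sketch interpolates a document's unigram term probabilities with a similarity-weighted mixture of the term probabilities of similar documents. It is not the model proposed in the paper: the function names `term_probs` and `expand_document`, the interpolation weight `lam`, the whitespace tokenization, and the assumption that similarity scores are supplied by the caller are all hypothetical choices made for this example. The similar documents could be drawn from the target collection, from an external collection, or from both.

```python
from collections import Counter


def term_probs(text):
    """Maximum-likelihood unigram estimates for a whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def expand_document(doc_probs, neighbors, lam=0.5):
    """Interpolate a document model with a mixture of similar documents.

    doc_probs: term -> probability for the original document.
    neighbors: list of (similarity, term_probs) pairs (assumed to be given;
               how similarity is computed is outside this sketch).
    lam:       weight on the original document model (hypothetical parameter).
    """
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim == 0:
        # No usable neighbors: fall back to the unexpanded model.
        return dict(doc_probs)

    # Similarity-weighted average of the neighbors' term probabilities.
    mixture = Counter()
    for sim, probs in neighbors:
        for term, p in probs.items():
            mixture[term] += (sim / total_sim) * p

    # Linear interpolation over the union of both vocabularies.
    vocab = set(doc_probs) | set(mixture)
    return {
        t: lam * doc_probs.get(t, 0.0) + (1 - lam) * mixture.get(t, 0.0)
        for t in vocab
    }
```

Under these assumptions, terms that never occur in the original document but do occur in its neighbors receive nonzero probability in the expanded model, and the result remains a proper distribution because both components sum to one.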