Entity-based query reformulation using Wikipedia

Published: 2008, Last Modified: 09 Dec 2024, CIKM 2008, CC BY-SA 4.0
Abstract: Many real-world applications increasingly involve both structured data and text, and entity-based retrieval is an important problem in this setting. In this paper, we present an automatic query reformulation approach based on the entities detected in each query. The aim is to use the semantics associated with entities to enhance document retrieval, by expanding a query with terms and phrases related to the entities it contains. We exploit Wikipedia as a large repository of entity information. Our reformulation approach consists of three major steps: (1) detect a representative entity in a query; (2) expand the query with entity-related terms and phrases; and (3) incorporate term dependency features. We evaluate our approach on an ad hoc retrieval task using four TREC collections, including two large web collections. Experimental results show that significant improvement is possible by utilizing entity-related information.
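
The three steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not say how entities are detected, how expansion terms are selected, or which dependency features are used. The sketch assumes a tiny in-memory dictionary (`ENTITY_TERMS`, hypothetical data standing in for Wikipedia-derived entity information), longest-match entity detection, and ordered adjacent word pairs as the dependency features. All function and variable names are illustrative.

```python
from typing import Dict, List, Optional

# Hypothetical toy data standing in for entity information mined from Wikipedia.
ENTITY_TERMS: Dict[str, List[str]] = {
    "new york": ["manhattan", "brooklyn", "nyc"],
    "york": ["yorkshire", "minster"],
}


def detect_entity(query: str, kb: Dict[str, List[str]]) -> Optional[str]:
    """Step 1 (assumed strategy): return the longest known entity name
    that appears as a contiguous run of query tokens, or None."""
    tokens = query.lower().split()
    for length in range(len(tokens), 0, -1):
        for start in range(len(tokens) - length + 1):
            candidate = " ".join(tokens[start:start + length])
            if candidate in kb:
                return candidate
    return None


def reformulate(query: str, kb: Dict[str, List[str]], max_terms: int = 2) -> dict:
    """Combine the three steps into one structured query representation."""
    tokens = query.lower().split()
    entity = detect_entity(query, kb)
    # Step 2: expand with up to max_terms entity-related terms.
    expansion = kb[entity][:max_terms] if entity else []
    # Step 3 (assumed feature type): ordered pairs of adjacent query terms.
    ordered_pairs = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return {
        "entity": entity,
        "terms": tokens,
        "expansion": expansion,
        "ordered_pairs": ordered_pairs,
    }


result = reformulate("hotels in New York", ENTITY_TERMS)
```

Here `result["entity"]` is the longer match `"new york"` rather than `"york"`, so the expansion terms come from the more specific entity. A retrieval system would then weight the original terms, the expansion terms, and the ordered pairs separately when scoring documents.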