Wikipedia Empowered Natural Language Interface for Web Search

Published: 01 Jan 2024, Last Modified: 30 Jul 2025. Venue: WISE (1) 2024. License: CC BY-SA 4.0
Abstract: Understanding user queries is crucial in information retrieval (IR). Tasks such as query classification and clustering exist, but they may lack precision. Detailed goal descriptions, such as human annotations, are vital for improving query understanding and for evaluating document relevance. This paper addresses two problems: (1) how to simplify natural language questions so that keyword-based search engines can understand them, and (2) how to make this process lightweight yet effective. The first problem is addressed by using a context graph to translate questions into keyword queries. The second is addressed by using Wikipedia for entity recognition, disambiguation, and relevance feedback. Experiments on 892 TREC questions show an overall increase of 21.45% in mean reciprocal rank (MRR), and of 36.41% for “why” questions, compared to Bing, Google, and Yahoo.
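
The reported metric, MRR (mean reciprocal rank), averages 1/rank of the first relevant result over all questions, counting 0 when no relevant result is returned. A minimal sketch of that standard definition follows; the function name and the sample ranks are hypothetical illustrations and are not taken from the paper's experiments.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Compute MRR from the 1-based rank of the first relevant result per question.

    Use None for a question where no relevant result was retrieved;
    such a question contributes 0 to the average.
    """
    if not first_relevant_ranks:
        raise ValueError("at least one question is required")
    total = 0.0
    for rank in first_relevant_ranks:
        if rank is None:
            continue  # no relevant result: reciprocal rank counts as 0
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    return total / len(first_relevant_ranks)


# Hypothetical ranks for four questions (illustrative data only).
sample_ranks = [1, 2, None, 4]
sample_mrr = mean_reciprocal_rank(sample_ranks)
print(f"MRR = {sample_mrr:.4f}")
```

With these made-up ranks the reciprocal ranks are 1, 0.5, 0, and 0.25, so the average is 0.4375.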