Corpus-Steered Query Expansion with Large Language Models

Anonymous

16 Oct 2023 | ACL ARR 2023 October Blind Submission | Readers: Everyone

Abstract: Recent studies demonstrate that query expansions generated by large language models (LLMs) considerably enhance information retrieval systems: the LLM generates hypothetical documents that answer the query, and these serve as expansions. However, the expansions can be misaligned with the retrieval corpus, leading to hallucinations and outdated information because of the limited intrinsic knowledge of LLMs. Inspired by Pseudo Relevance Feedback (PRF), we introduce Corpus-Steered Query Expansion (CSQE) to promote the incorporation of authentic knowledge embedded within the corpus. CSQE uses the relevance-assessing capability of LLMs to systematically identify pivotal sentences in the initially retrieved documents. These corpus-originated texts are then used to expand the query together with expansions drawn from the LLM's own knowledge, strengthening the relevance between the query and the target documents. Extensive experiments show that CSQE achieves strong performance without requiring any training.
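The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the term-overlap retriever stands in for whatever first-stage retriever the paper uses, the two LLM steps are replaced by hypothetical stub functions (`stub_extract_pivotal`, `stub_generate_hypothetical`), and joining the pieces by plain concatenation is an assumption, since the abstract does not say how the expansions are combined.

```python
from collections import Counter
from typing import Callable, List

_PUNCT = ".,;:!?()\"'"


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(_PUNCT).lower() for t in text.split() if t.strip(_PUNCT)]


def overlap_score(query: str, doc: str) -> int:
    """Toy lexical score (assumption): occurrences of query terms in the document."""
    doc_counts = Counter(tokenize(doc))
    return sum(doc_counts[t] for t in set(tokenize(query)))


def retrieve(query: str, corpus: List[str], k: int) -> List[str]:
    """Return the k highest-scoring documents for the query."""
    ranked = sorted(corpus, key=lambda d: overlap_score(query, d), reverse=True)
    return ranked[:k]


def stub_extract_pivotal(query: str, docs: List[str]) -> List[str]:
    """Hypothetical stand-in for the LLM that picks pivotal sentences.

    Keeps any sentence that shares at least one term with the query.
    """
    q_terms = set(tokenize(query))
    kept = []
    for doc in docs:
        for sentence in doc.split(". "):
            if q_terms & set(tokenize(sentence)):
                kept.append(sentence.strip())
    return kept


def stub_generate_hypothetical(query: str) -> List[str]:
    """Hypothetical stand-in for the LLM that writes an answering passage."""
    return [f"A passage answering: {query}"]


def csqe_expand(
    query: str,
    corpus: List[str],
    extract_pivotal: Callable[[str, List[str]], List[str]],
    generate_hypothetical: Callable[[str], List[str]],
    k: int = 2,
) -> str:
    """Expand a query with corpus-originated and LLM-knowledge texts."""
    initial_docs = retrieve(query, corpus, k)                 # initial retrieval
    corpus_texts = extract_pivotal(query, initial_docs)       # corpus-steered part
    llm_texts = generate_hypothetical(query)                  # LLM-knowledge part
    # Assumption: simple concatenation of query and both expansion sources.
    return " ".join([query, *corpus_texts, *llm_texts])
```

In practice the two stubs would be replaced by prompts to an LLM, and the expanded string would be issued to the retriever as the new query; no component here is trained, which matches the training-free setting the abstract describes.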

Paper Type: short

Research Area: Information Retrieval and Text Mining

Contribution Types: NLP engineering experiment

Languages Studied: English
