Pseudo-Relevance-Driven Query Expansion Using BERT

Published: 2024, Last Modified: 21 Jan 2026, Venue: WI/IAT 2024, License: CC BY-SA 4.0
Abstract: Query Expansion (QE) techniques in Information Retrieval (IR) aim to improve retrieval performance by adding relevant terms to user queries. However, traditional QE methods often rely only on basic features, such as term frequency, and fail to capture the full meaning of a query. This paper introduces a new QE framework, named BKRoc, which combines BERT and the Rocchio model. The framework uses BERT to extract important semantic information from pseudo-relevant documents and combines it with information from the BM25 model to achieve more accurate query expansion. Experimental results demonstrate that BKRoc consistently outperforms traditional baseline models across several standard datasets, particularly in terms of MAP and P@10. Overall, we propose a query expansion framework that combines deep semantic models with traditional models, demonstrating the positive impact of BERT on QE and suggesting new directions for future BERT research in IR.
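To make the general idea concrete, the following is a minimal sketch of Rocchio-style expansion over pseudo-relevant documents. It is not the BKRoc implementation, whose details the abstract does not give. The function name `rocchio_expand`, the parameters `alpha`, `beta`, `mix`, and `top_k`, and the linear mixing rule are all assumptions made for illustration. The `semantic_scores` dictionary stands in for term scores that a BERT-based model might produce, and a normalized term-frequency centroid stands in for the statistical (BM25-side) evidence.

```python
from collections import Counter


def rocchio_expand(query_terms, feedback_docs, semantic_scores,
                   alpha=1.0, beta=0.75, mix=0.5, top_k=2):
    """Illustrative Rocchio-style expansion (hypothetical, not BKRoc itself).

    query_terms:     list of tokens in the original query
    feedback_docs:   list of token lists (top-ranked, assumed relevant)
    semantic_scores: dict term -> score in [0, 1]; a placeholder for
                     scores that a BERT-based model could supply
    """
    # Statistical evidence: mean relative term frequency over feedback docs.
    centroid = Counter()
    for doc in feedback_docs:
        for term, count in Counter(doc).items():
            centroid[term] += count / len(doc) / len(feedback_docs)
    max_stat = max(centroid.values(), default=1.0)

    # Rocchio update without a non-relevant set: q' = alpha*q + beta*centroid,
    # where the centroid weight mixes statistical and semantic evidence.
    weights = {term: alpha for term in query_terms}
    for term, stat in centroid.items():
        combined = (mix * (stat / max_stat)
                    + (1 - mix) * semantic_scores.get(term, 0.0))
        weights[term] = weights.get(term, 0.0) + beta * combined

    # Expansion terms: highest-weighted terms not already in the query.
    candidates = [t for t in weights if t not in query_terms]
    candidates.sort(key=lambda t: (-weights[t], t))
    return weights, candidates[:top_k]


# Toy usage with made-up documents and made-up semantic scores.
docs = [["neural", "ranking", "transformer", "bert"],
        ["bert", "ranking", "passage", "bert"]]
scores = {"bert": 0.9, "transformer": 0.8, "passage": 0.2}
weights, expansion = rocchio_expand(["neural", "ranking"], docs, scores)
print(expansion)
```

In this sketch, `mix` controls how much the statistical evidence counts relative to the semantic scores. A real system would replace both placeholders with actual BM25 and BERT outputs.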