Applying Large Language Model For Relevance Search In Tencent

Dezhi Ye, Jie Liu, Junwei Hu, Jiabin Fan, Bowen Tian, Haijin Liang, Jin Ma

Published: 03 Aug 2025 · Last Modified: 26 Nov 2025 · License: CC BY-SA 4.0
Abstract: Relevance plays a crucial role in commercial search engines by identifying documents related to user queries and fulfilling their search needs. Traditional approaches employ encoder-only models like BERT, which process concatenated query-document pairs to predict relevance scores. While autoregressive large language models (LLMs) have revolutionized numerous NLP domains, their direct application to web-scale search systems presents significant challenges. On one hand, the relevance modeling capabilities of LLMs have not been fully explored. On the other hand, the high computational costs and inference times of LLMs make deploying them in online search systems, which demand extremely low latency, nearly impossible. In this work, we address these challenges through two key contributions. First, we develop a comprehensive evaluation framework to systematically assess the effectiveness of LLMs in query-document relevance ranking. By evaluating LLMs from four perspectives — ranking objectives, model size, domain-specific continuous pre-training, and the integration of prior knowledge — we identify the best resource allocation strategy under a restricted budget and develop practical LLMs more efficiently. Second, we propose a novel framework that transfers the ranking capabilities of LLMs to existing BERT models, avoiding direct LLM deployment. Finally, to fully leverage the improvements in relevance ranking brought by LLMs, we successfully deploy LLMs nearline in the Tencent QQ Browser search engine using query-based on-demand computing and quantization. Experiments on real-world datasets and online A/B tests demonstrate that our approach significantly enhances search engine performance while maintaining practical operational efficiency. Our findings provide actionable insights for integrating LLMs into production search engines.
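The second contribution — transferring LLM ranking ability to a lightweight model so the LLM itself need not serve online traffic — follows the general pattern of score distillation. The abstract does not specify the training objective, so the sketch below is a hypothetical illustration only: a toy linear scorer (standing in for the BERT cross-encoder) is fit by SGD to match a teacher's (LLM's) relevance scores via mean-squared error; the feature function, learning rate, and teacher scores are all invented for the example.

```python
# Hypothetical sketch: distill an LLM teacher's relevance scores into a small
# student scorer. The linear model and term-overlap feature stand in for the
# BERT cross-encoder described in the paper; they are not the actual method.

def features(query, doc):
    """Tiny illustrative feature vector: query-term overlap plus a bias."""
    q, d = set(query.split()), set(doc.split())
    overlap = len(q & d) / max(len(q), 1)
    return [overlap, 1.0]

def student_score(w, query, doc):
    """Student relevance score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, features(query, doc)))

def distill(pairs, teacher_scores, lr=0.5, epochs=200):
    """Fit student weights to mimic the teacher's scores via MSE + SGD."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for (q, d), t in zip(pairs, teacher_scores):
            x = features(q, d)
            err = student_score(w, q, d) - t  # gradient of 0.5*err^2 is err*x
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

pairs = [("cheap flights", "cheap flights to paris"),
         ("cheap flights", "history of aviation")]
teacher = [0.9, 0.2]  # made-up LLM relevance judgments for the two pairs
w = distill(pairs, teacher)
```

After distillation the student reproduces the teacher's ordering (relevant document scored above the irrelevant one) at a fraction of the inference cost, which is the property that makes the transfer worthwhile for a latency-bound online system.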