S$^{3}$PRank: Toward Satisfaction-Oriented Learning to Rank With Semi-Supervised Pre-Training

Published: 2026 · Last Modified: 22 Jan 2026 · IEEE Trans. Knowl. Data Eng. 2026 · CC BY-SA 4.0
Abstract: Learning-to-Rank (LTR) models built on Transformers have been widely adopted to achieve commendable performance in web search. However, these models predominantly emphasize relevance, often overlooking broader aspects of user satisfaction such as quality, authority, and recency, which collectively shape the overall user experience. Addressing these multifaceted elements is essential for developing more effective and user-centric search engines. Nevertheless, training such comprehensive models remains challenging because annotated query-webpage pairs are scarce relative to the vast number of webpages available online and the billions of daily search queries. Concurrently, industry research communities have released numerous open-source LTR datasets with well-annotated samples, though these datasets feature diverse designs of LTR features and labels across heterogeneous domains. Inspired by recent advances in pre-training Transformers for enhanced performance, this work explores the pre-training of LTR models using both labeled and unlabeled samples. Specifically, we leverage well-annotated samples from heterogeneous open-source LTR datasets to bolster the pre-training process and integrate multifaceted satisfaction features during the fine-tuning stage. In this paper, we propose S$^{3}$PRank—Satisfaction-oriented Learning to Rank with Semi-supervised Pre-training.
Specifically, S$^{3}$PRank employs a three-step approach: (1) it exploits unlabeled and labeled data from the search engine to pre-train a self-attentive encoder via semi-supervised learning; (2) it incorporates multiple open-source heterogeneous LTR datasets to enhance the pre-training of the relevance tower through shared parameters in cross-domain learning; (3) it integrates a satisfaction tower with the pre-trained relevance tower to form a deep two-tower aggregation structure, and fine-tunes the combination of the pre-trained self-attentive encoder and the two-tower structure on search engine data with various learning strategies. To demonstrate the effectiveness of the proposed approach, we conduct extensive offline and online evaluations using real-world web traffic from Baidu Search. Comparisons against a number of advanced baselines confirm the advantages of S$^{3}$PRank in producing high-performance ranking models for web-scale search.
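The two-tower aggregation in step (3) can be illustrated with a minimal sketch. This is an illustrative toy in plain Python, not the paper's implementation: all layer sizes, feature names, and the simple sum aggregation are assumptions; the actual model uses a pre-trained self-attentive encoder and learned aggregation over relevance and satisfaction features.

```python
# Hypothetical sketch of a deep two-tower aggregation: a relevance tower
# (pre-trained, e.g., via cross-domain learning on open-source LTR data)
# and a satisfaction tower (quality/authority/recency-style features),
# whose outputs are aggregated into one ranking score per query-webpage pair.
import random

random.seed(0)

def linear(in_dim, out_dim):
    """Toy dense layer: (weights, bias) with small random initialization."""
    w = [[random.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [0.0] * out_dim
    return w, b

def forward(layer, x):
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(v):
    return [max(0.0, x) for x in v]

class TwoTowerRanker:
    """Relevance tower + satisfaction tower, scores summed (one simple
    aggregation choice; the paper's aggregation structure is deeper)."""
    def __init__(self, rel_dim, sat_dim, hidden=8):
        self.rel = [linear(rel_dim, hidden), linear(hidden, 1)]
        self.sat = [linear(sat_dim, hidden), linear(hidden, 1)]

    def tower(self, layers, x):
        h = relu(forward(layers[0], x))
        return forward(layers[1], h)[0]

    def score(self, rel_feats, sat_feats):
        # Aggregate the two tower outputs into a single ranking score.
        return self.tower(self.rel, rel_feats) + self.tower(self.sat, sat_feats)

ranker = TwoTowerRanker(rel_dim=4, sat_dim=3)
s = ranker.score([0.2, 0.5, 0.1, 0.9], [0.7, 0.3, 0.8])
print(type(s).__name__)  # a single scalar score per query-webpage pair
```

In fine-tuning, both towers and the shared encoder would be updated jointly on search engine data; the sketch above only shows the forward scoring path.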