Distributionally Robust Optimization for Unbiased Learning to Rank

Published at SIGIR 2025. Last modified: 15 Jan 2026. License: CC BY-SA 4.0.
Abstract: Unbiased learning to rank (ULTR), which utilizes historical click logs to train ranking models, has attracted much attention in the IR community. Previous studies on ULTR have focused on mitigating a variety of biases in click logs, such as position bias, trust bias, and presentation bias, to recover the true relevance of query-document pairs. However, they overlooked the intrinsic distribution shift between the training data and the test data. In this paper, we first validate and analyze the distribution shift problem on a real-world ULTR dataset. To solve this problem, we propose distributionally robust unbiased learning to rank (DRO-ULTR) methods. Specifically, we design two kinds of group distributionally robust optimization (group-DRO) frameworks for existing ULTR methods, one using a pointwise click prediction loss and the other using a listwise counterfactual ranking loss. Finally, we empirically verify the effectiveness of our DRO-ULTR methods by conducting extensive experiments on a real-world dataset.
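The core group-DRO idea the abstract refers to, in its generic form, is to minimize the worst-case (rather than average) loss over predefined data groups, typically via exponentiated-gradient updates on group weights. The sketch below is a minimal generic illustration of that weight update, not the paper's actual implementation; the group count, losses, and step size are all hypothetical toy values.

```python
import numpy as np

def group_dro_step(group_losses, q, eta=0.1):
    """One exponentiated-gradient step of generic group-DRO:
    upweight groups with higher loss, then renormalize so that
    q stays a probability distribution over groups."""
    q = q * np.exp(eta * np.asarray(group_losses, dtype=float))
    return q / q.sum()

# Hypothetical example: three query groups with fixed per-group losses.
q = np.ones(3) / 3.0                      # start from uniform group weights
losses = np.array([0.2, 0.9, 0.4])        # toy per-group training losses
for _ in range(50):
    q = group_dro_step(losses, q)

# The robust objective is the q-weighted loss; the weights concentrate
# on the worst-performing group, pushing the objective toward max(losses).
robust_loss = float(np.dot(q, losses))
```

In a full training loop, the per-group losses would be recomputed from the model at each step and the model parameters updated against the q-weighted loss, so the optimizer focuses capacity on the groups where the ranker currently performs worst.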