Ranking with Multiple Oracles: From Weak to Strong Stochastic Transitivity

Tao Jin; Yue Wu; Quanquan Gu; Farzad Farnoud

Ranking with Multiple Oracles: From Weak to Strong Stochastic Transitivity

Tao Jin, Yue Wu, Quanquan Gu, Farzad Farnoud

Published: 01 May 2025, Last Modified: 23 Jul 2025ICML 2025 posterEveryoneRevisionsBibTeXCC BY-NC-SA 4.0

Abstract: We study the problem of efficiently aggregating the preferences of items from multiple information sources (oracles) and infer the ranking under both the weak stochastic transitivity (WST) and the strong stochastic transitivity (SST) conditions. When the underlying preference model satisfies the WST condition, we propose an algorithm named RMO-WST, which has a bi-level design: at the higher level, it actively allocates comparison budgets to all undetermined pairs until the full ranking is recovered; at the lower level, it attempts to compare the pair of items and selects the more accurate oracles simultaneously. We prove that the sample complexity of RMO-WST is $ \tilde O( N\sum_{i=2}^{N}H_{\sigma^{-1}(i),{\sigma^{-1}(i-1)}} )$, where $N$ is the number of items to rank, $H$ is a problem-dependent hardness factor, and $\sigma^{-1}(i)$ represents the $i$-th best item. We also provide a tight lower bound that matches the upper bound of approximate ranking under the WST condition, answering a previously open problem. In addition, when the SST condition is satisfied, we propose an algorithm named RMO-SST, which can achieve an $\tilde{O}(\sum_{i=1}^{N} H_i \log(N))$ sample complexity. This outperforms the best-known sample complexity by a factor of $\log(N)$. The theoretical advantages of our algorithms are verified by empirical experiments in a simulated environment.

Lay Summary: We looked at how to accurately figure out the ranking of items (like products or choices) based on opinions from several different sources (which could be people, systems, or tools). These sources might not always be perfect, and their preferences might conflict or vary in quality. By looking at the consistency of the response from various data sources we can estimate the quality of a particular one. Thus in return we can save effort by only consulting the higher quality ones to get the ranking.

Primary Area: Theory->Online Learning and Bandits

Keywords: active ranking, stochastic transitivity, rank aggregation, dueling bandits, lower bound, upper bound

Submission Number: 14866

Loading