Hypothesis Testing for Generalized Thurstone Models

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
Abstract: In this work, we develop a hypothesis testing framework to determine whether pairwise comparison data is generated by an underlying *generalized Thurstone model* $\mathcal{T}_F$ for a given choice function $F$. While prior work has predominantly focused on parameter estimation and uncertainty quantification for such models, we address the fundamental problem of minimax hypothesis testing for $\mathcal{T}_F$ models. We formulate this testing problem by introducing a notion of separation distance between general pairwise comparison models and the class of $\mathcal{T}_F$ models. We then derive upper and lower bounds on the critical threshold for testing that depend on the topology of the observation graph. For the special case of complete observation graphs, this threshold scales as $\Theta((nk)^{-1/2})$, where $n$ is the number of agents and $k$ is the number of comparisons per pair. Furthermore, we propose a hypothesis test based on our separation distance, construct confidence intervals, establish time-uniform bounds on the probabilities of type I and II errors using reverse martingale techniques, and derive minimax lower bounds using information-theoretic methods. Finally, we validate our results through experiments on synthetic and real-world datasets.
Lay Summary: When people or systems compare two items, such as players in a game or products in a survey, it is often assumed that these choices follow a specific kind of model based on hidden “scores” or preferences. One popular family of such models is the generalized Thurstone model, in which each item has a score and the choice depends on how much higher one score is than the other. Our work tackles the following fundamental question: Given only the outcomes of several such comparisons, how can we tell whether the choices actually obey a generalized Thurstone model? To address this question, we propose a data-driven test that uses and develops tools from machine learning and statistics. We provide both rigorous mathematical guarantees and real-world experiments. For example, we show that our approach is “optimal” in a certain sense under appropriate analytical conditions. Our results can be useful to practitioners who seek either to test whether their modeling assumptions hold or to select accurate models for comparison data.
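To make the model described above concrete, the following is a minimal sketch of how pairwise comparison data could be simulated under a generalized Thurstone model. It assumes the common parametrization in which item i beats item j with probability F(w_i - w_j); the sign convention, the function names (`win_probability`, `simulate_comparisons`), and the default logistic choice function are illustrative assumptions, not taken from the paper. The sketch only generates data and does not implement the proposed hypothesis test.

```python
import math
import random


def logistic(t):
    """Logistic choice function; with this F the model is Bradley-Terry-Luce."""
    return 1.0 / (1.0 + math.exp(-t))


def win_probability(w_i, w_j, F=logistic):
    """Probability that item i beats item j, assumed here to be F(w_i - w_j)."""
    return F(w_i - w_j)


def simulate_comparisons(weights, k, F=logistic, seed=0):
    """Simulate k comparisons for every pair (complete observation graph).

    Returns a dict mapping each pair (i, j) with i < j to the number of
    times item i beat item j out of k independent comparisons.
    """
    rng = random.Random(seed)
    n = len(weights)
    wins = {}
    for i in range(n):
        for j in range(i + 1, n):
            p = win_probability(weights[i], weights[j], F)
            wins[(i, j)] = sum(rng.random() < p for _ in range(k))
    return wins


if __name__ == "__main__":
    # Hypothetical scores for n = 3 items, with k = 50 comparisons per pair.
    data = simulate_comparisons([0.0, 1.0, 2.0], k=50)
    for pair, count in sorted(data.items()):
        print(pair, count)
```

Passing the standard normal CDF as `F` instead of `logistic` would give the classical Thurstone (Case V) model. A testing procedure of the kind the abstract describes would take counts like these as input and ask whether any score vector can explain them.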
Primary Area: Theory->Learning Theory
Keywords: generalized Thurstone model, hypothesis testing, minimax risk, cycle decomposition
Submission Number: 10511