Breaking Flat: A Generalised Query Performance Prediction Evaluation Framework

Payel Santra, Partha Basuchowdhuri, Debasis Ganguly

Published: 2026, Last Modified: 29 Apr 2026. Venue: ECIR (2) 2026. License: CC BY-SA 4.0
Abstract: The traditional use-case of query performance prediction (QPP) is to identify which queries perform well and which perform poorly for a given ranking model. A more fine-grained—and arguably more challenging—extension of this task is to determine which ranking models are most effective for a given query. In this work, we generalize the QPP task and its evaluation into three settings: (i) Single-Ranker Multi-Query (SRMQ-PP), corresponding to the standard use-case; (ii) Multi-Ranker Single-Query (MRSQ-PP), which evaluates a QPP model’s ability to select the most effective ranker for a query; and (iii) Multi-Ranker Multi-Query (MRMQ-PP), which considers predictions jointly across all query–ranker pairs. Our results show that (a) the relative effectiveness of QPP models varies substantially across tasks (SRMQ-PP vs. MRSQ-PP), and (b) predicting the best ranker for a query is considerably more difficult than predicting the relative difficulty of queries for a given ranker.
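To make the three settings concrete, the sketch below scores a toy QPP predictor under each of them. The abstract does not say which correlation measure or aggregation the paper uses, so Kendall's tau (tau-a, computed by hand) and a plain mean over rankers or queries are assumptions made here for illustration. The function names (`srmq_pp`, `mrsq_pp`, `mrmq_pp`), the ranker names, and all scores are hypothetical.

```python
from itertools import combinations
from statistics import mean


def kendall_tau(xs, ys):
    """Kendall's tau-a between two equal-length score lists (tied pairs count as 0)."""
    pairs = list(combinations(range(len(xs)), 2))
    if not pairs:
        return 0.0
    total = 0
    for i, j in pairs:
        prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
        total += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return total / len(pairs)


def srmq_pp(pred, true, queries, rankers):
    # Single-Ranker Multi-Query: fix a ranker, correlate across queries,
    # then average over rankers (the averaging is an assumption).
    return mean(
        kendall_tau([pred[q, r] for q in queries], [true[q, r] for q in queries])
        for r in rankers
    )


def mrsq_pp(pred, true, queries, rankers):
    # Multi-Ranker Single-Query: fix a query, correlate across rankers,
    # then average over queries (the averaging is an assumption).
    return mean(
        kendall_tau([pred[q, r] for r in rankers], [true[q, r] for r in rankers])
        for q in queries
    )


def mrmq_pp(pred, true, queries, rankers):
    # Multi-Ranker Multi-Query: one correlation over all query-ranker pairs.
    keys = [(q, r) for q in queries for r in rankers]
    return kendall_tau([pred[k] for k in keys], [true[k] for k in keys])


# Hypothetical toy data: true effectiveness (e.g. AP) and QPP scores per (query, ranker).
queries = ["q1", "q2", "q3"]
rankers = ["bm25", "dense"]
true = {("q1", "bm25"): 0.2, ("q1", "dense"): 0.5,
        ("q2", "bm25"): 0.4, ("q2", "dense"): 0.3,
        ("q3", "bm25"): 0.6, ("q3", "dense"): 0.7}
pred = {("q1", "bm25"): 1.0, ("q1", "dense"): 0.9,
        ("q2", "bm25"): 2.0, ("q2", "dense"): 1.5,
        ("q3", "bm25"): 3.0, ("q3", "dense"): 2.5}

srmq_score = srmq_pp(pred, true, queries, rankers)
mrsq_score = mrsq_pp(pred, true, queries, rankers)
mrmq_score = mrmq_pp(pred, true, queries, rankers)
print(srmq_score, mrsq_score, mrmq_score)
```

On this toy data the predictor orders queries fairly well for each ranker but prefers the wrong ranker for two of the three queries, so its SRMQ-PP score is higher than its MRSQ-PP score. This mirrors, in miniature, how the same predictor can look different across the settings.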