Abstract: We propose a novel approach, referred to as contrastive disentangled representation for query performance prediction (CoDiR-QPP), to estimate search query performance by disentangling query content semantics from query difficulty. Our approach leverages neural disentanglement to isolate the information need expressed in a search query from the complexities that affect retrieval performance. Motivated by empirical observations that varying query formulations for the same information need can significantly impact retrieval outcomes, we hypothesize that separating content semantics from query difficulty can enhance query performance prediction. Using contrastive learning, CoDiR-QPP distinguishes between well-performing and poorly performing query variants, facilitating the estimation of a given query's performance. Extensive experiments on four standard benchmark datasets demonstrate that CoDiR-QPP outperforms state-of-the-art baselines in predicting query performance, achieving higher Kendall \(\tau\) and Spearman \(\rho\) correlations and lower scaled Mean Absolute Ranking Error (sMARE).
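The contrastive objective sketched in the abstract can be illustrated with an InfoNCE-style loss over query-variant embeddings: the anchor query's representation is pulled toward a well-performing variant of the same information need and pushed away from poorly performing variants. This is a minimal sketch under our own assumptions (cosine similarity, the `info_nce` function name, and the toy embeddings are illustrative, not the paper's exact formulation):

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss (a sketch, not CoDiR-QPP's exact loss).

    anchor:    embedding of the input query
    positive:  embedding of a well-performing variant (same information need)
    negatives: embeddings of poorly performing variants
    tau:       temperature controlling how sharply negatives are repelled
    """
    # Similarity logits: positive first, then all negatives.
    logits = [cosine(anchor, positive) / tau]
    logits += [cosine(anchor, n) / tau for n in negatives]
    # Numerically stable log-softmax: loss = -log p(positive | anchor).
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)

# Toy 2-d embeddings: loss is near zero when the anchor matches the
# well-performing variant, and large when it matches a poor one.
low  = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
high = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Minimizing such a loss encourages the learned representation to encode which formulations of the same information need retrieve well, which is the signal a performance predictor needs.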