ENSUR: Equitable and Statistically Unbiased Recommendation

Published: 01 May 2025, Last Modified: 18 Jun 2025, Venue: ICML 2025 poster, License: CC BY 4.0
Abstract: Although Recommender Systems (RS) are well developed across many application fields, they often suffer from a crisis of platform credibility with respect to RS confidence and fairness, which may drive users away and threaten the platform's long-term success. In recent years, some works have tried to address these issues; however, they lack strong statistical guarantees. There is therefore an urgent need to address both issues within a unifying framework that offers robust statistical guarantees. In this paper, we propose a novel and reliable framework called Equitable and Statistically Unbiased Recommendation (ENSUR) to dynamically generate prediction sets for users across various groups, which are guaranteed 1) to include ground-truth items with user-predefined high confidence/probability (e.g., 90\%); 2) to ensure user fairness across different groups; 3) to have the minimum, and hence most efficient, average prediction set size. We further design an efficient algorithm named Guaranteed User Fairness Algorithm (GUFA) to optimize the proposed method, and we derive upper bounds on the risk and fairness metrics to speed up the optimization process. Moreover, we provide rigorous theoretical analysis of risk and fairness control and of the minimum set size. Extensive experiments validate the effectiveness of the proposed framework and align with our theoretical analysis.
Lay Summary: Recommender systems suggest products, movies, or songs by learning from what we click on. Yet there is always a lingering question: $\textit{``Can these suggestions be trusted, and are they fair to everyone, with certainty?''}$ Despite recent advancements, these systems rarely answer either part of that question with mathematical guarantees. Our work introduces $\textbf{ENSUR}$, a simple add-on that wraps around any recommender model to deliver three simultaneous guarantees: a) for each user, the recommended set is almost certain (e.g., ≥ 90\%) to contain at least one item they will truly like; b) the system meets the same accuracy target for both advantaged and disadvantaged user groups (for instance, across genders or regions); and c) the recommended list is kept short and focused, so users are not overwhelmed by incorrect predictions. The framework does this by treating recommendation as a game of "set prediction". A new greedy search procedure, $\textbf{GUFA}$, quickly tunes the size of each user’s list until it meets the desired confidence and fairness thresholds, all backed by statistical theory. In large-scale tests on e-commerce, movie, and music data, ENSUR consistently produced more focused, fairer, and more reliable recommendation lists than existing approaches, without retraining the underlying model and with negligible runtime overhead.
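To make the "set prediction" idea concrete, the following is a minimal sketch and not the GUFA algorithm itself. It assumes each user's prediction set is every item whose model score clears a shared threshold, and it picks the largest candidate threshold (hence the shortest lists) for which every group's empirical miss rate stays at or below a target `alpha` and the gap between groups stays at or below `eta`. The function names, the threshold parameterization, the toy data, and the use of plain empirical rates (with no finite-sample correction or upper bound) are all illustrative assumptions, so this sketch carries none of the statistical guarantees described above.

```python
from typing import List, Optional, Sequence, Set


def miss_rate(scores: Sequence[Sequence[float]],
              liked: Sequence[Set[int]],
              threshold: float) -> float:
    """Fraction of users whose prediction set contains none of their liked items."""
    misses = 0
    for user_scores, user_liked in zip(scores, liked):
        # Prediction set: indices of all items scoring at or above the threshold.
        pred_set = {i for i, s in enumerate(user_scores) if s >= threshold}
        if not (pred_set & user_liked):
            misses += 1
    return misses / len(scores)


def pick_threshold(scores: Sequence[Sequence[float]],
                   liked: Sequence[Set[int]],
                   groups: Sequence[str],
                   alpha: float,
                   eta: float,
                   candidates: Sequence[float]) -> Optional[float]:
    """Largest candidate threshold meeting both the risk and the fairness target.

    Returns None if no candidate is feasible. Uses empirical rates only
    (an assumption of this sketch; no statistical correction is applied).
    """
    for t in sorted(candidates, reverse=True):  # high threshold = small sets
        rates: List[float] = []
        for g in sorted(set(groups)):
            idx = [u for u, gu in enumerate(groups) if gu == g]
            rates.append(miss_rate([scores[u] for u in idx],
                                   [liked[u] for u in idx], t))
        if max(rates) <= alpha and max(rates) - min(rates) <= eta:
            return t
    return None


# Hypothetical calibration data: 4 users, 3 items, two groups "A" and "B".
toy_scores = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.1],
    [0.6, 0.5, 0.2],
    [0.1, 0.2, 0.7],
]
toy_liked = [{0}, {0}, {1}, {2}]
toy_groups = ["A", "A", "B", "B"]
toy_candidates = [0.9, 0.7, 0.5, 0.3, 0.1]

strict = pick_threshold(toy_scores, toy_liked, toy_groups,
                        alpha=0.0, eta=0.0, candidates=toy_candidates)
loose = pick_threshold(toy_scores, toy_liked, toy_groups,
                       alpha=0.5, eta=0.0, candidates=toy_candidates)
```

In this toy setting a stricter risk target forces a lower threshold and therefore longer lists, which mirrors the trade-off between confidence and set size described in the summary. The scan tests every candidate from the top rather than stopping early, because the gap between groups need not shrink monotonically as the threshold drops.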
Primary Area: General Machine Learning->Supervised Learning
Keywords: Dynamic Prediction Sets
Submission Number: 904