ONLINE RANKING WITH UNFAIR FEEDBACK AND HUMAN VERIFICATION: HIERARCHICAL ELIMINATION AND REGRET BOUNDS
Keywords: Online learning, Ranking, Queueing System
Abstract: Online platforms rely heavily on user feedback to drive ranking systems such as restaurant ratings and e-commerce listings. However, these systems are vulnerable to unfair feedback, including merchant-induced and malicious feedback. Platforms have therefore adopted human verification to make rankings more reliable. Verification can reliably identify genuine feedback, but it introduces high latency into real-time updates, which leads to non-static queueing dynamics and creates challenges for online learning. We model this setting as a continuous-time online learning problem, establish a lower bound on regret, and propose two algorithms: Hierarchical Elimination (HE) for the single-verifier case and Deficit Hierarchical Elimination (DHE) for the multiple-verifier case. We further prove regret upper bounds for both algorithms and demonstrate their effectiveness through numerical experiments.
Primary Area: learning theory
Submission Number: 22513