A probabilistic perspective on nearest neighbor for implicit recommendation

Domokos M. Kelen, Andras A. Benczur

Published: 07 Dec 2021, Last Modified: 30 Sept 2024 | ICDM | Everyone | CC BY 4.0

Abstract: Over the past years, the recommender systems community has invented several novel approaches that reached increasingly high prediction accuracy. Remarkably, however, the classic k nearest neighbor algorithm appears to remain competitive, even when compared to much more sophisticated methods. In this paper, we attempt to explain the inner workings of the nearest neighbor method using probabilistic tools, treating similarity as conditional probability and presenting a novel model for explaining and removing popularity bias. First, we provide a probabilistic formulation of similarity and the classic prediction formula. Second, by modeling user behavior as a combination of personal preference and global influence, we are able to explain the presence of popularity bias in the predictions. Finally, we utilize Bayesian inference to construct a theoretically grounded variant of the widely used inverse frequency scaling.
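
To make the idea of similarity as conditional probability concrete, the following is a minimal sketch, not the formulation from the paper. It assumes binary implicit feedback in a user-by-item matrix and estimates P(j | i) as the fraction of users who interacted with item i that also interacted with item j. The function names, the toy matrix, and the `alpha` exponent are hypothetical; `alpha` stands in for a generic heuristic inverse frequency scaling and is not the Bayesian variant the abstract refers to.

```python
import numpy as np

def conditional_similarity(R, alpha=0.0):
    """Item-item similarity sim[i, j] ~ P(j | i) from a binary user-item matrix.

    alpha is a hypothetical knob: sim[i, j] is additionally divided by
    count(j) ** alpha, a generic popularity discount (assumption, not the
    paper's method).
    """
    R = np.asarray(R, dtype=float)
    co = R.T @ R                       # co[i, j] = users who interacted with both i and j
    counts = np.diag(co).copy()        # counts[i] = users who interacted with i
    safe = np.maximum(counts, 1.0)     # avoid division by zero for unseen items
    sim = co / safe[:, None]           # empirical P(j | i)
    sim = sim / safe[None, :] ** alpha # optional discount for popular target items
    np.fill_diagonal(sim, 0.0)         # an item should not recommend itself
    return sim

def score_items(R, sim):
    """Sum the similarities from each user's items; mask already seen items."""
    R = np.asarray(R, dtype=float)
    scores = R @ sim
    scores[R > 0] = -np.inf
    return scores

# Toy data: 4 users x 4 items; item 0 is the most popular.
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
])

sim = conditional_similarity(R)             # plain conditional probabilities
sim_scaled = conditional_similarity(R, 1.0) # with the popularity discount
scores = score_items(R, sim)
```

In this toy example the unscaled similarity is asymmetric: every user of item 1 also used the popular item 0, while only half of the users of item 0 used item 1. This illustrates how popular items can dominate the predictions when no correction is applied.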