Abstract: Collaborative prediction involves filling in missing entries of a user-item matrix to predict preferences of users based on their observed preferences. Most of existing models assume that the data is missing at random (MAR), which is often violated in recommender systems in practice. Incorrect assumption on missing data ignores the missing data mechanism, leading to biased inferences and prediction. In this paper we present a Bayesian binomial mixture model for collaborative prediction, where the generative process for data and missing data mechanism are jointly modeled to handle non-random missing data. Missing data mechanism is modeled by three factors, each of which is related to users, items, and rating values. Each factor is modeled by Bernoulli random variable, and the observation of rating value is determined by the Boolean OR operation of three binary variables. We develop computationally-efficient variational inference algorithms, where variational parameters have closed-form update rules and the computational complexity depends on the number of observed ratings, instead of the size of the rating data matrix. We also discuss implementation issues on hyperparameter tuning and estimation based on empirical Bayes. Experiments on Yahoo! Music and MovieLens datasets confirm the useful behavior of our model by demonstrating that: (1) it outperforms state-of-the-art methods in yielding higher predictive performance; (2) it finds meaningful solutions instead of undesirable boundary solutions; (3) it provides rating trend analysis on why ratings are observed.
0 Replies
Loading