- Keywords: hashing, collaborative filtering, information retrieval, supervised learning
- TL;DR: We propose a new variational hashing-based collaborative filtering approach optimized for a novel self-mask variant of the Hamming distance, which outperforms state-of-the-art by up to 12% on NDCG.
- Abstract: Hashing-based collaborative filtering learns binary vector representations (hash codes) of users and items, such that recommendations can be computed very efficiently using the Hamming distance, which is simply the sum of differing bits between two hash codes. A problem with hashing-based collaborative filtering using the Hamming distance, is that each bit is equally weighted in the distance computation, but in practice some bits might encode more important properties than other bits, where the importance depends on the user. To this end, we propose an end-to-end trainable variational hashing-based collaborative filtering approach that uses the novel concept of self-masking: the user hash code acts as a mask on the items (using the Boolean AND operation), such that it learns to encode which bits are important to the user, rather than the user's preference towards the underlying item property that the bits represent. This allows a binary user-level importance weighting of each item without the need to store additional weights for each user. We experimentally evaluate our approach against state-of-the-art baselines on 4 datasets, and obtain significant gains of up to 12% in NDCG. We also make available an efficient implementation of self-masking, which experimentally yields <4% runtime overhead compared to the standard Hamming distance.
- Original Pdf: pdf