Variational learning to rank (VL2R)

Keld T. Lundgaard

Published: 01 Jan 2018, Last Modified: 13 May 2023, RecSys 2018, Readers: Everyone

Abstract: We present Variational Learning to Rank (VL2R), a combination of variational inference and learning to rank. The combination provides a natural way to balance exploration and exploitation by shuffling product search and category listings according to the model's relevance uncertainty for each product. Simply put, we perturb (newer) products whose relevance is more uncertain more than (older) products whose relevance is less uncertain. Our formalism makes it possible to train an end-to-end model that optimizes for both ranking and shuffling, in contrast to known state-of-the-art systems, where ranking and shuffling are treated as separate problems. VL2R provides an integrated way of doing propensity scoring during the offline learning phase, thus reducing selection bias. The system is simple, yet powerful and flexible. We have implemented it within the Salesforce Commerce Cloud, a platform with which 500 million unique online shoppers interact each month across 2,750 websites in 53+ countries as of FY18. In this talk, we will go into the details of our variational learning to rank system and share our early experiences with optimizing VL2R and running it in production. We hope that sharing VL2R with the recommendation systems community will foster more research in this direction and lead to systems that learn user preferences faster for changing catalogs.
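
The uncertainty-driven shuffling described in the abstract can be illustrated with a minimal sketch. It is not the VL2R implementation: the abstract does not state the form of the relevance posterior, so the sketch assumes an independent Gaussian per product, and the function name `shuffled_ranking` and all numbers are hypothetical. Each listing request draws one relevance score per product and sorts by the draws, so products with wide posteriors move around more than products with narrow ones.

```python
import numpy as np

def shuffled_ranking(mean, std, rng):
    """Order products by one sampled relevance score each.

    Assumption: relevance uncertainty is an independent Gaussian per
    product (mean, std). This is an illustrative choice, not a detail
    taken from the abstract.
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    sampled = rng.normal(loc=mean, scale=std)
    # Highest sampled relevance first; stable sort keeps ties in input order.
    return np.argsort(-sampled, kind="stable")

rng = np.random.default_rng(0)

# Hypothetical catalog: three established products with narrow posteriors
# and one new product (index 3) with a low mean but a wide posterior.
mean = np.array([2.0, 1.5, 1.0, 0.5])
std = np.array([0.05, 0.05, 0.05, 1.5])

rankings = np.array([shuffled_ranking(mean, std, rng) for _ in range(1000)])

# argsort of a permutation is its inverse, so positions[i, p] is the
# position of product p in listing i.
positions = np.argsort(rankings, axis=1)
position_spread = positions.std(axis=0)
```

In this sketch `position_spread` is largest for the new product, which is the exploration behaviour the abstract describes: the uncertain item is occasionally shown near the top, while the ordering among well-known items stays stable.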
