Abstract: Recent advances in Reinforcement Learning (RL) have generated tremendous interest in applying RL to recommender systems. However, directly training and evaluating a new RL-based recommendation algorithm requires collecting users’ real-time feedback in the live system, which is time- and effort-consuming and can negatively impact users’ experience. This calls for a user simulator that mimics real users’ behaviors, so that new recommendation algorithms can be pre-trained and evaluated against it. Simulating users’ behaviors in a dynamic system faces two major challenges: (i) the underlying item distribution is complex, and (ii) historical logs for each user are limited. In this paper, we develop a user simulator based on a Generative Adversarial Network (GAN). Specifically, the generator captures the underlying distribution of users’ historical logs and generates realistic logs that can be treated as augmentations of the real logs, while the discriminator not only distinguishes real logs from fake ones but also predicts users’ behaviors. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed simulator.
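The dual role of the discriminator described above can be illustrated with a small sketch. The abstract does not give the loss function, so everything here is an assumption made for illustration: the function name `discriminator_loss`, the two-headed output (one real/fake logit plus one logit per behavior class), the sum of two cross-entropy terms, and the weight `alpha` are all hypothetical and should not be read as the paper's formulation.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(real_fake_logit, behavior_logits, is_real,
                       behavior_label, alpha=1.0):
    """Loss for one log under an assumed two-headed discriminator.

    real_fake_logit: score that the log is real (hypothetical head 1).
    behavior_logits: one score per behavior class, e.g. skip/click
                     (hypothetical head 2).
    is_real:         True if the log comes from real data, False if generated.
    behavior_label:  index of the behavior recorded in the log.
    alpha:           assumed weight on the behavior-prediction term.
    """
    eps = 1e-12  # guards against log(0)

    # Term 1: binary cross-entropy for the real-vs-fake decision.
    p_real = sigmoid(real_fake_logit)
    target = 1.0 if is_real else 0.0
    adversarial = -(target * math.log(p_real + eps)
                    + (1.0 - target) * math.log(1.0 - p_real + eps))

    # Term 2: cross-entropy for predicting the user's behavior.
    probs = softmax(behavior_logits)
    behavior = -math.log(probs[behavior_label] + eps)

    return adversarial + alpha * behavior


# A confident, correct discriminator incurs a lower loss than a
# confident, wrong one on the same real log.
good = discriminator_loss(5.0, [5.0, 0.0], is_real=True, behavior_label=0)
bad = discriminator_loss(-5.0, [0.0, 5.0], is_real=True, behavior_label=0)
```

In this sketch a single scalar loss drives both objectives, so the behavior-prediction head is trained on generated logs as well as real ones, which is one way generated logs could serve as data augmentation.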