Factual and Tailored Recommendation Endorsements using Language Models and Reinforcement Learning

Published: 10 Jul 2024, Last Modified: 26 Aug 2024 · COLM · CC BY 4.0
Research Area: Alignment, LMs on diverse modalities and novel applications
Keywords: Large language model, reinforcement learning, conversational recommender systems, recommender systems
TL;DR: Develop a conversational recommender system that uses natural language and reinforcement learning to make precise, appealing, and personalized item recommendations based on user preferences, showing promising results on major datasets.
Abstract: Recommender systems (RSs) play a central role in matching candidate items to users based on their preferences. While traditional RSs rely on user feedback signals, conversational RSs interact with users in natural language. In this work, we develop P4LM, an _aPpealing, Precise, Preference-comprehensive and Prioritized_ language model that endorses recommended items by emphasizing specific item characteristics and their coverage of a user’s preferences. P4LM uses an _embedding_ representation of a user’s preferences to generate responses that are appealing, factually grounded, and tailored to the user’s preferences. P4LM employs a joint reward function measuring precision, appeal, preference coverage, and prioritization of preferences, which serves as AI-based feedback in a reinforcement learning-based language model framework. On the MovieLens 25M and Amazon Product Review datasets, P4LM delivers more appealing and tailored endorsements to users, as determined by auto-critic and rater evaluations.
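The abstract describes a joint reward combining four signals (precision, appeal, preference coverage, prioritization) into the scalar feedback used for RL. The paper's actual reward implementation is not given here, so the sketch below is only an illustration of one plausible form, a weighted sum of the four component scores; the function name, score ranges, and equal default weights are all assumptions.

```python
def joint_reward(precision, appeal, coverage, prioritization,
                 weights=(0.25, 0.25, 0.25, 0.25)):
    """Hypothetical joint reward: a weighted sum of four AI-feedback
    scores, each assumed to lie in [0, 1]. Illustrative only -- the
    paper does not specify this exact formulation."""
    scores = (precision, appeal, coverage, prioritization)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("component scores must lie in [0, 1]")
    # Combine the per-aspect critic scores into one scalar RL reward.
    return sum(w * s for w, s in zip(weights, scores))


# Example: equal weighting of the four aspects.
r = joint_reward(precision=0.9, appeal=0.8, coverage=0.7, prioritization=1.0)
# r == (0.9 + 0.8 + 0.7 + 1.0) / 4 == 0.85
```

In practice the weights would themselves be tuned (or the components combined nonlinearly) to trade off factual precision against appeal; the weighted sum is just the simplest starting point.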
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 405