Language Models as Recommender Systems: Evaluations and Limitations

Published: 18 Oct 2021, Last Modified: 05 May 2023 · ICBINB@NeurIPS 2021 Poster
Keywords: language model, prompt, recommender system
TL;DR: We use prompts to reformulate the session-based recommendation task as a multi-token cloze task and evaluate the proposed method on a movie recommendation dataset in zero-shot and fine-tuned settings.
Abstract: Pre-trained language models (PLMs) such as BERT and GPT learn general text representations and encode extensive world knowledge; thus, they can efficiently and accurately adapt to various downstream tasks. In this work, we propose to leverage these powerful PLMs as recommender systems and use prompts to reformulate the session-based recommendation task as a multi-token cloze task. We evaluate the proposed method on a movie recommendation dataset in zero-shot and fine-tuned settings, where no or limited training data are available. In the zero-shot setting, we find that PLMs outperform the random recommendation baseline by a large margin; at the same time, we observe strong linguistic bias when using PLMs as recommenders. In the fine-tuned setting, such bias is reduced with available training data; however, PLMs tend to underperform traditional recommender system baselines such as GRU4Rec. Our observations demonstrate potential opportunities as well as current challenges in this novel direction.
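
The abstract describes scoring candidate items by casting a viewing session as a cloze prompt whose mask slots a PLM fills in. Below is a minimal sketch, not the authors' code, of how such a multi-token cloze scorer might look, assuming a Hugging Face BERT masked LM; the prompt template ("A user watched ..."), the candidate pool, and the independent per-mask scoring are all illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch of a prompt-based multi-token cloze recommender.
# Assumes `torch` and `transformers` are installed; template and
# candidates below are hypothetical placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def score_candidate(session_movies, candidate):
    """Insert as many [MASK] tokens as the candidate has word pieces,
    then average the candidate's masked-token log-probabilities."""
    cand_ids = tokenizer(candidate, add_special_tokens=False)["input_ids"]
    masks = " ".join([tokenizer.mask_token] * len(cand_ids))
    prompt = f"A user watched {', '.join(session_movies)}. The user will also watch {masks}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    # Mean log-probability of the candidate's word pieces at the mask slots.
    return sum(log_probs[0, pos, tok].item()
               for pos, tok in zip(mask_positions, cand_ids)) / len(cand_ids)

session = ["Toy Story", "Finding Nemo", "Up"]
candidates = ["Cars", "The Godfather", "Frozen"]  # illustrative candidate pool
ranked = sorted(candidates, key=lambda c: score_candidate(session, c), reverse=True)
print(ranked)  # zero-shot ranking; scores reflect the linguistic bias noted above
```

Ranking the entire catalog this way is what makes the task zero-shot: no recommendation-specific training is needed, which is also why the PLM's linguistic biases (e.g., favoring titles that read naturally in the prompt) surface directly in the scores.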
Category: Negative result: I would like to share my insights and negative results on this topic with the community. Stuck paper: I hope to get ideas in this workshop that help me get unstuck and improve this paper.