Transformers for Sequential Recommendation

Published: 01 Jan 2024 · Last Modified: 19 Jun 2024 · ECIR (5) 2024 · CC BY-SA 4.0
Abstract: Sequential recommendation is the task of predicting the next item in a sequence of user-item interactions. Sequential recommendation resembles language modelling in that both learn sequence structure; consequently, variants of the Transformer architecture, which has become mainstream in language modelling, have also achieved state-of-the-art performance in sequential recommendation. Despite the similarities, however, training Transformers for recommendation can be tricky: most recommendation datasets have their own unique item sets, so the pre-training/fine-tuning approach that is so successful for language models has limited applicability to recommendation. Moreover, a typical recommender system has to work with millions of items, far more than the vocabulary size of a language model. In this tutorial, we cover adaptations of Transformers for sequential recommendation and techniques that help to mitigate these training challenges. The half-day (3 h + a break) tutorial consists of two sessions. The first session provides background on the Transformer architecture and its adaptations to recommendation scenarios. It covers classic Transformer-based models, such as SASRec and BERT4Rec, including their architectures, training tasks, and loss functions. In this session, we also discuss the specifics of training these models on large datasets, covering negative sampling and mitigation of the overconfidence problem it causes. We further discuss the problem of the large item embedding tensor and approaches that make training feasible even with very large item catalogues. The second session focuses on modern generative Transformer-based models for sequential recommendation. We discuss the specifics of generative models, such as item ID representation and recommendation list generation strategies. We also cover adaptations of large language models (LLMs) to recommender systems, with concrete examples such as the P5 model. We conclude the session with our vision for the future development of the recommender systems field in the era of Large Language Models.
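
The abstract mentions negative sampling and the overconfidence it can cause when a handful of sampled negatives stand in for a million-item catalogue. The sketch below illustrates one standard mitigation from the retrieval literature, sampled softmax with a logQ correction; it assumes PyTorch, and every name, shape, and hyperparameter is illustrative rather than taken from the tutorial itself.

```python
# Minimal sketch (assumption: PyTorch; all names illustrative) of sampled-softmax
# training for next-item prediction. Scoring the full catalogue is infeasible for
# millions of items, so each positive is contrasted with a few sampled negatives;
# the logQ term corrects the negatives' logits for their sampling probability,
# one standard way to counter the bias that sampling introduces.
import torch
import torch.nn.functional as F

num_items = 1_000_000          # catalogue size
dim = 64                       # embedding dimension
num_negatives = 128            # negatives sampled per positive

item_emb = torch.nn.Embedding(num_items, dim)

def sampled_softmax_loss(seq_repr, pos_items):
    """seq_repr: (B, dim) sequence representations from any backbone (e.g. SASRec-style).
    pos_items: (B,) ground-truth next items."""
    B = seq_repr.size(0)
    # Uniform negative sampling, so q(i) = 1 / num_items for every item.
    neg_items = torch.randint(0, num_items, (B, num_negatives))
    log_q = -torch.log(torch.tensor(float(num_items)))

    pos_logit = (seq_repr * item_emb(pos_items)).sum(-1, keepdim=True)      # (B, 1)
    neg_logit = torch.einsum('bd,bnd->bn', seq_repr, item_emb(neg_items))   # (B, N)

    # logQ correction: subtract the log sampling probability from sampled logits.
    neg_logit = neg_logit - log_q
    logits = torch.cat([pos_logit, neg_logit], dim=1)     # positive sits in column 0
    labels = torch.zeros(B, dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Usage with a stand-in backbone output:
loss = sampled_softmax_loss(torch.randn(32, dim), torch.randint(0, num_items, (32,)))
loss.backward()
```

Note that with uniform sampling the correction is the same constant for every negative, so the sketch mainly shows where it enters the computation; the term becomes substantive when negatives are drawn with non-uniform (e.g., popularity-proportional) probabilities q(i).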
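The abstract also raises the problem of the large item embedding tensor. One family of mitigations represents item IDs compositionally so the parameter count grows sub-linearly in the catalogue size; the sketch below, again assuming PyTorch with illustrative names, uses a quotient/remainder decomposition of the ID. Summing the two partial embeddings is one common combiner; an element-wise product is another.

```python
# Minimal sketch (assumption: PyTorch; hyperparameters illustrative) of a
# compositional quotient/remainder item embedding. Each item ID is decomposed as
# id = quotient * base + remainder, and its embedding is built from two small
# tables of roughly sqrt(num_items) rows each, instead of one num_items-row table.
import math
import torch

class QRItemEmbedding(torch.nn.Module):
    def __init__(self, num_items: int, dim: int):
        super().__init__()
        self.base = math.ceil(math.sqrt(num_items))
        self.quotient = torch.nn.Embedding(math.ceil(num_items / self.base), dim)
        self.remainder = torch.nn.Embedding(self.base, dim)

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        # Two lookups into small tables replace one lookup into a huge table;
        # combining them keeps distinct items distinguishable in most cases.
        return self.quotient(item_ids // self.base) + self.remainder(item_ids % self.base)

emb = QRItemEmbedding(num_items=10_000_000, dim=64)   # ~2 * 3163 rows instead of 10M
vectors = emb(torch.randint(0, 10_000_000, (32,)))
print(vectors.shape)  # torch.Size([32, 64])
```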