Keywords: music recommendation, sequential modeling, personalization, recommendation systems
TL;DR: Through offline analyses, we show that transformer decoders lead to better music recommendations than neural item-item models
Abstract: Music recommendation systems face the dual challenge of capturing both immediate context and long-term preferences in users' listening patterns. We adapt a generalized sequential model architecture for music recommendation, introducing modifications that acknowledge how music preferences combine temporal patterns and stable tastes. By removing causal masking constraints typically used in sequential models, we better capture users' overall preferences rather than strictly sequential patterns. This technique achieves an approximately 28% improvement in F1 score compared to a neural item-item baseline. Through ablation studies, we show that using positional encoding and removing the causal mask during training results in the best personalized recommendations. Our findings demonstrate that transformer-based architectures can effectively model music preferences while being computationally efficient for large-scale deployment.
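The central modification described in the abstract, dropping the causal mask so that each position attends to the whole listening history rather than only the past, can be sketched with plain scaled dot-product attention. This is a minimal NumPy illustration under assumed shapes and names, not the authors' implementation:

```python
import numpy as np

def attention(q, k, v, causal=False):
    """Scaled dot-product attention over a sequence of item embeddings.

    With causal=True, position i attends only to positions <= i (the
    standard decoder constraint); with causal=False, every position
    attends to the full sequence, modeling overall taste rather than
    strictly sequential patterns.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)  # (seq, seq) attention logits
    if causal:
        # Mask out future positions with -inf before the softmax.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy listening history: 4 tracks with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))

_, w_causal = attention(x, x, x, causal=True)
_, w_full = attention(x, x, x, causal=False)

# Causal: the first track can only attend to itself.
assert np.allclose(w_causal[0], [1.0, 0.0, 0.0, 0.0])
# Unmasked: the first track attends to the entire sequence.
assert (w_full[0] > 0).all()
```

In a full model this attention would sit inside transformer layers with learned positional encodings, which the ablations identify as important alongside removing the mask.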
Track: Paper Track
Confirmation: Paper Track: I confirm that I have followed the formatting guideline and anonymized my submission.
(Optional) Short Video Recording Link: https://drive.google.com/file/d/1jqzhVCWTYJomM7vgOZ86PqbPVGo4PkaK/view?usp=drive_link
Submission Number: 59