Abstract: LLM-based generative recommendation has attracted significant attention. However, in contrast to standard NLP tasks that inherently operate on human vocabulary, current generative recommendation approaches struggle to effectively encode items within the text-to-text framework. Due to this issue, the true potential of LLM-based generative recommendation remains largely unexplored. To better align LLMs with recommendation needs, we propose IDGenRec, which represents each item as a unique, concise, semantically rich, platform-agnostic textual ID composed of human language tokens. This is achieved by training a textual ID generator alongside the LLM-based recommender, enabling seamless integration of personalized recommendations into natural language generation. Notably, as user history is expressed in natural language and decoupled from the original dataset, our approach suggests the potential for a foundational generative recommendation model. Experiments show that our framework consistently surpasses existing models in sequential recommendation under the standard experimental setting. We then train a foundation recommendation model on a collected fusion dataset and test its recommendation performance on six unseen datasets across different platforms under a completely zero-shot setting. The results show that the zero-shot performance of the pre-trained model is comparable to, or even better than, some traditional recommendation models based on supervised training, demonstrating the potential of the IDGenRec paradigm to serve as a foundation model for generative recommendation. Code and data are open-sourced at https://github.com/agiresearch/IDGenRec.
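To make the core idea concrete, the sketch below illustrates how a textual ID might represent an item and how a user history expressed in such IDs could be fed to a text-to-text recommender. This is a minimal illustration under our own assumptions: `generate_textual_id` here is a crude keyword heuristic standing in for the paper's trained ID generator, and `build_prompt` is a hypothetical prompt format, not the exact template used by IDGenRec.

```python
# Minimal sketch of the IDGenRec idea (illustrative assumptions only):
# represent each item as a short, human-readable textual ID, then express
# the user's history purely in natural language so an LLM can generate
# the next item's ID as ordinary text.

from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "of", "and", "with", "for", "in", "on"}

def generate_textual_id(metadata: str, max_tokens: int = 4) -> str:
    """Stand-in for the trained ID generator: distill item metadata into
    a concise token sequence. (The real generator is a language model
    trained jointly with the recommender; this heuristic is our assumption.)"""
    tokens = [t for t in re.findall(r"[a-z0-9]+", metadata.lower())
              if t not in STOPWORDS]
    # Keep the most frequent tokens as a crude proxy for "semantically rich".
    ranked = [t for t, _ in Counter(tokens).most_common(max_tokens)]
    return " ".join(ranked)

def build_prompt(history_ids: list[str]) -> str:
    """Hypothetical prompt: the history is plain text, decoupled from any
    platform-specific item indexing, hence platform-agnostic."""
    return ("A user has interacted with the following items: "
            + "; ".join(history_ids)
            + ". Predict the next item the user will interact with:")

# Toy usage with two items from a hypothetical movie catalog.
history = [
    generate_textual_id("The Lord of the Rings fantasy adventure epic"),
    generate_textual_id("The Hobbit fantasy adventure journey"),
]
print(build_prompt(history))
# The LLM-based recommender would then generate the next item's textual ID
# as free text, which is matched back to a catalog item.
```

Because the IDs are ordinary language tokens rather than dataset-specific indices, the same prompt format can in principle be applied to items from any platform, which is what enables the zero-shot, foundation-model evaluation described above.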