Large Language Models Are Not Stable Recommender Systems: A Position Bias Perspective

Published: 2025 · Last Modified: 21 Jan 2026 · KSEM (1) 2025 · CC BY-SA 4.0
Abstract: Recommender Systems (RS) face challenges in capturing complex user preferences that require extensive knowledge, and integrating Large Language Models (LLMs) has emerged as a promising paradigm. However, the shift from traditional RS, which rank items directly, to a paradigm that serializes items into sequential prompts for LLMs presents a significant challenge: LLMs are sensitive to the order of items in the prompt, which introduces unstable performance due to position bias in LLM-based Recommender Systems (LLMRS). In this study, we first delve into the position bias problem in LLMRS, identifying distinct patterns and observations. To mitigate this bias, we propose a two-stage debiasing framework named STELLA. In the probing stage, STELLA estimates position bias by analyzing the performance of LLMRS on a probing set. In the recommendation stage, insights from probing are applied to debias subsequent samples through confidence estimation and iterative updating. Experiments on three real-world datasets demonstrate that STELLA achieves superior recommendation performance while significantly improving stability across multiple domains and LLMs compared to existing baselines.
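To make the two-stage idea concrete, the following is a minimal sketch of a probe-then-debias loop. The abstract does not specify STELLA's exact confidence estimation or iterative updating rules, so the inverse-bias weighting below is an illustrative assumption, and `llm_rank` is a hypothetical stand-in for a prompted LLM call that returns the index of the item it recommends.

```python
from collections import defaultdict
from typing import Callable, Sequence


def probe_position_bias(
    llm_rank: Callable[[Sequence[str]], int],
    probing_set: Sequence[Sequence[str]],
) -> list[float]:
    """Probing stage (sketch): estimate how often the LLM picks each
    prompt position across a probing set of candidate lists. A uniform
    rate of 1/n per position would indicate no position bias."""
    n = len(probing_set[0])
    picks = [0] * n
    for candidates in probing_set:
        picks[llm_rank(candidates)] += 1
    total = sum(picks)
    return [p / total for p in picks]


def debias_recommend(
    llm_rank: Callable[[Sequence[str]], int],
    candidates: Sequence[str],
    bias: list[float],
    num_orders: int = 4,
) -> str:
    """Recommendation stage (sketch): query the LLM under several cyclic
    rotations of the candidate list, weight each answer by the inverse of
    the estimated bias at the position it came from, and return the item
    with the highest accumulated confidence."""
    n = len(candidates)
    scores: dict[str, float] = defaultdict(float)
    for shift in range(num_orders):
        order = [candidates[(i + shift) % n] for i in range(n)]
        pos = llm_rank(order)
        # Confidence estimation (assumed form): an answer coming from an
        # over-picked position carries less evidence, so down-weight it.
        scores[order[pos]] += 1.0 / (bias[pos] * n + 1e-9)
    return max(scores, key=scores.get)
```

As a usage illustration, `probe_position_bias` would be run once over held-out probing prompts to obtain `bias`, after which `debias_recommend` is applied to each new candidate list; rotating the list ensures every item is observed at multiple positions before votes are aggregated.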