Is Your LLM-as-a-Recommender Agent Trustworthy? LLMs' Recommendations Are Easily Hacked by Biases (Preferences)
Keywords: LLM-as-a-Recommender, Autonomous Agents, Cognitive Bias Robustness
Abstract: Large Language Models (LLMs) are increasingly deployed in high-value agentic workflows such as Deep Research, e-commerce recommendation, job recruitment, and AI Scientist systems. In these applications, an LLM must select optimal options from massive candidate pools, a setting we term the \textit{LLM-as-a-Recommender} paradigm. However, the reliability of LLM agents in this recommendation role remains underexplored. In this work, we introduce the \textbf{Bias} \textbf{Rec}ommendation \textbf{Bench}mark (\textbf{BiasRecBench}) to expose the critical vulnerability of such agents to biases in high-stakes real-world tasks. The benchmark covers three practical domains: paper review, e-commerce, and job recruitment. To measure the effect of option quality, we construct a \textsc{Bias Synthesis Pipeline with Calibrated Quality Margins}, which synthesizes evaluation data while strictly controlling the quality gap between optimal and sub-optimal options. To strengthen bias intensity, we propose contextual biases that are logically consistent with, and tailored to, each option's context. Extensive experiments on state-of-the-art models (Gemini-2.5-pro, GPT-4o, DeepSeek-R1) reveal that agents frequently succumb to injected biases despite having sufficient reasoning capability to identify the ground truth. These findings expose a significant reliability bottleneck in current agentic workflows and call for specialized alignment strategies for LLM-as-a-Recommender.
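To make the margin-calibration idea concrete, a minimal sketch follows: it builds one evaluation instance by selecting an optimal option, a sub-optimal option whose quality trails it by roughly a target margin, and then injecting a contextual bias into the sub-optimal option only. The `Candidate` class, the scalar quality scores, and the bias template are hypothetical stand-ins for illustration, not the paper's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    quality: float  # hypothetical scalar quality score in [0, 1]

def make_instance(pool: list[Candidate], margin: float, bias_template: str) -> dict:
    """Build one evaluation instance with a calibrated quality margin.

    The sub-optimal option is the one whose quality gap to the optimal
    option is closest to `margin`, so the gap is controlled rather than
    arbitrary. A contextual bias is then injected into its description.
    """
    ranked = sorted(pool, key=lambda c: c.quality, reverse=True)
    optimal = ranked[0]
    # Choose the candidate whose gap to the optimal option best matches the target margin.
    suboptimal = min(ranked[1:], key=lambda c: abs((optimal.quality - c.quality) - margin))
    # Inject a persuasive but quality-irrelevant claim into the sub-optimal option only.
    biased_text = bias_template.format(option=suboptimal.text)
    return {
        "optimal": optimal.text,
        "suboptimal": biased_text,
        "true_margin": optimal.quality - suboptimal.quality,
    }

# Usage example with hypothetical paper-review candidates and bias text.
pool = [Candidate("Paper A", 0.92), Candidate("Paper B", 0.78), Candidate("Paper C", 0.55)]
instance = make_instance(
    pool, margin=0.15,
    bias_template="{option} (strongly endorsed by a renowned senior reviewer)",
)
print(instance)
```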
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: LLM agents, Autonomous agents, safety and alignment for agents, model bias evaluation
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: English
Submission Number: 5573