Keywords: Recommender Systems, LLM Reasoning, Reinforcement Learning
TL;DR: We propose RecPIE, a framework that jointly optimizes recommendations and explanations, showing that the two tasks can reinforce each other when trained together, unlike prior approaches that treat them separately.
Abstract: Explanations in recommender systems have long been valued in business applications for enabling consumers to make informed decisions and providing firms with actionable insights. However, existing approaches either explain the data, which is not directly tied to the model, or explain a trained model in an ad hoc manner. Neither demonstrates that explanations can improve recommendation performance. Recent advances in large language models (LLMs) show that reasoning can improve LLM performance, but LLM-based recommenders still \emph{underperform} deep neural networks (DNNs). We propose \textbf{RecPIE}, \emph{Recommendation with Prediction-Informed Explanations}, a framework that jointly optimizes recommendations and explanations by learning which LLM-generated explanations are most useful for recommendations. For \textbf{prediction-informed explanations}, the recommendation task guides the learning of consumer embeddings, which serve as soft prompts to fine-tune LLMs to generate contrastive explanations (why a consumer may or may not like a product). For \textbf{explanation-informed predictions}, these learned explanations are then fed back into the recommendation component to improve predictive accuracy. The two tasks are trained in an alternating fashion, with the LLM continuously fine-tuned via proximal policy optimization (PPO). Extensive experiments on multiple industrial datasets show that RecPIE significantly outperforms strong baselines, achieving 3–34\% gains in predictive performance. Further analysis reveals that these gains come mainly from LLMs' \emph{reasoning} capabilities, rather than their external knowledge or summarization skills.
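To make the alternating training scheme concrete, below is a minimal, purely illustrative Python sketch of the loop the abstract describes. Every name here (`Recommender`, `ExplainerLLM`, `ppo_update`, and so on) is a hypothetical stand-in rather than RecPIE's actual implementation, and the PPO and gradient steps are stubbed out; it shows only the control flow of prediction-informed explanations feeding explanation-informed predictions.

```python
# Illustrative sketch of the alternating loop described in the abstract.
# All classes and methods are hypothetical stand-ins, not the paper's code.
import random


class Recommender:
    """Toy recommendation model that scores (consumer, product) pairs."""

    def __init__(self):
        self.embeddings = {}

    def consumer_embedding(self, consumer):
        # Learned embedding that later serves as a soft prompt for the LLM.
        return self.embeddings.setdefault(
            consumer, [random.random() for _ in range(4)]
        )

    def predict(self, consumer, product, explanation=None):
        # Explanation-informed prediction: the explanation is an extra input.
        base = sum(self.consumer_embedding(consumer)) / 4
        bonus = 0.1 if explanation else 0.0
        return base + bonus

    def train_step(self, batch, explanations):
        # Placeholder gradient step on the recommendation loss.
        pass


class ExplainerLLM:
    """Toy stand-in for the soft-prompted LLM explainer."""

    def explain(self, soft_prompt, consumer, product):
        # Contrastive explanation: why the consumer may or may not like it.
        return f"{consumer} may like {product} because ...; may not because ..."

    def ppo_update(self, explanation, reward):
        # Placeholder PPO step: reinforce explanations that helped prediction.
        pass


def alternating_train(rec, llm, data, epochs=3):
    for _ in range(epochs):
        for consumer, product, label in data:
            # 1) Prediction-informed explanation: condition the LLM on the
            #    consumer embedding learned by the recommendation task.
            prompt = rec.consumer_embedding(consumer)
            expl = llm.explain(prompt, consumer, product)
            # 2) Explanation-informed prediction: feed the explanation back.
            score = rec.predict(consumer, product, explanation=expl)
            reward = -abs(label - score)  # higher when prediction improves
            llm.ppo_update(expl, reward)
            rec.train_step([(consumer, product, label)], [expl])


if __name__ == "__main__":
    data = [("alice", "camera", 1.0), ("bob", "tripod", 0.0)]
    alternating_train(Recommender(), ExplainerLLM(), data)
```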
Submission Number: 26