Stop Playing the Guessing Game! Evaluating Conversational Recommender Systems via Target-free User Simulation

ACL ARR 2025 May Submission7872 Authors

20 May 2025 (modified: 03 Jul 2025) | ACL ARR 2025 May Submission | CC BY 4.0

Abstract: Recent developments in Conversational Recommender Systems (CRSs) have focused on simulating real-world interactions between users and CRSs to create more realistic evaluation environments. Despite considerable advancements, reliably assessing the capability of CRSs in eliciting user preferences remains a significant challenge. We observe that user-CRS interactions in existing evaluation protocols resemble a guessing game, as they construct target-biased simulators pre-encoded with target item knowledge, thereby allowing the CRS to shortcut the elicitation process. Moreover, we reveal that current evaluation metrics, which predominantly emphasize single-turn recall of target items, suffer from target ambiguity in multi-turn settings and overlook the intermediate process of preference elicitation. To address these issues, we introduce PEPPER, a novel CRS evaluation protocol with target-free user simulators that enable users to gradually discover their preferences through enriched interactions, along with detailed measures for comprehensively assessing the preference elicitation capabilities of CRSs. Through extensive experiments, we validate PEPPER as a reliable simulation environment and offer a thorough analysis of how effectively current CRSs perform in preference elicitation and recommendation.

Paper Type: Long

Research Area: Dialogue and Interactive Systems

Research Area Keywords: conversational modeling, conversational recommender systems, user simulator

Contribution Types: Model analysis & interpretability

Languages Studied: English

Submission Number: 7872
