SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing Agents

ACL ARR 2024 December Submission179 Authors

10 Dec 2024 (modified: 05 Feb 2025)ACL ARR 2024 December SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Abstract: The advanced role-playing capabilities of Large Language Models (LLMs) have paved the way for developing Role-Playing Agents (RPAs). However, existing benchmarks in social interaction such as HPD and SocialBench have not investigated hallucination and face limitations like poor generalizability and implicit judgments for character fidelity. To address these issues, we propose a generalizable, explicit and effective paradigm to unlock the interactive patterns in diverse worldviews. Specifically, we define the interactive hallucination based on stance transfer and construct a benchmark, SHARP, by extracting relations from a general commonsense knowledge graph and leveraging the inherent hallucination properties of RPAs to simulate interactions across roles. Extensive experiments validate the effectiveness and stability of our paradigm. Our findings further explore the factors influencing these metrics and discuss the trade-off between blind loyalty to roles and adherence to facts in RPAs.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Hallucination, Dialogue interaction, Role-play Agents
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English, Chinese
Submission Number: 179
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview