Abstract: This paper presents a modular framework for character-coherent, emotion-aware role-playing dialogue with large language models (LLMs), centered on a novel Verifiable Emotion Reward (VER) objective. We introduce VER as a reinforcement-style signal derived from frozen emotion classifiers that provides both turn-level and dialogue-level alignment, effectively mitigating emotional drift across long interactions. To amplify VER's benefits, we construct Character-Coherent Dialogues (CHARCO), a large-scale multi-turn dataset of over 230,000 dialogues, richly annotated with persona profiles, semantic contexts, and ten emotion labels. Our experiments show that fine-tuning LLMs on CHARCO significantly enhances VER's impact, driving marked improvements in emotional consistency, role fidelity, and dialogue coherence. Through an evaluation that integrates lexical diversity metrics, automatic scoring with GPT-4, and human assessments, we demonstrate that the combination of a purpose-built multi-turn dataset and the VER objective yields significant advances in persona-aligned conversational agents.
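The abstract describes VER as a reinforcement-style signal derived from frozen emotion classifiers, applied at both the turn and dialogue level. A minimal sketch of one plausible formulation (hypothetical; the paper's exact reward definition is not given in the abstract) is to take the frozen classifier's probability for the turn's target emotion as the turn-level reward, and the mean over turns as the dialogue-level reward:

```python
# Hypothetical sketch of a Verifiable Emotion Reward (VER) signal.
# Assumptions (not from the abstract): the turn-level reward is the frozen
# classifier's probability for the target emotion, and the dialogue-level
# reward is the average of turn-level rewards.

from typing import Callable, Dict, List

def turn_reward(classify: Callable[[str], Dict[str, float]],
                utterance: str, target_emotion: str) -> float:
    """Probability the frozen emotion classifier assigns to the target emotion."""
    probs = classify(utterance)
    return probs.get(target_emotion, 0.0)

def dialogue_reward(classify: Callable[[str], Dict[str, float]],
                    turns: List[str], targets: List[str]) -> float:
    """Dialogue-level VER: mean of the turn-level rewards."""
    rewards = [turn_reward(classify, u, t) for u, t in zip(turns, targets)]
    return sum(rewards) / len(rewards) if rewards else 0.0

# Toy stand-in for a real frozen emotion classifier, for illustration only.
def toy_classifier(text: str) -> Dict[str, float]:
    if "hooray" in text.lower():
        return {"joy": 0.9, "sadness": 0.1}
    return {"joy": 0.2, "sadness": 0.8}
```

A dialogue whose turns consistently match their target emotions then receives a reward near 1, while emotional drift lowers the dialogue-level score.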
External IDs: doi:10.3390/info16090738