Keywords: multimodal dialogue systems, affective computing, knowledge grounding, marketing dialogue, emotion recognition, reinforcement learning, persuasive AI
Abstract: Despite recent progress in large language models (LLMs), most dialogue systems remain reactive and perform inadequately in emotionally nuanced, goal-oriented domains such as marketing conversations. We present AffectMind, a multimodal affective dialogue agent that enables proactive reasoning and dynamic knowledge grounding to sustain emotionally aligned and persuasive interactions. AffectMind integrates three components: a Proactive Knowledge Grounding Network that continuously updates factual and affective context from textual, visual, and prosodic signals; an Emotion-Intent Alignment Model that jointly infers user emotion and purchase intent to adapt persuasion strategies; and a Reinforced Discourse Loop that optimizes emotional coherence and long-term engagement via reinforcement learning from user feedback. Evaluations on two newly curated multimodal marketing dialogue benchmarks, MM-ConvMarket and AffectPromo, demonstrate that AffectMind significantly outperforms strong LLM-based baselines, achieving improvements of 26% in emotional consistency, 19% in persuasive success rate, and 23% in sustained user engagement. These results underscore emotion-grounded proactivity as a critical capability for next-generation commercial dialogue agents.
Paper Type: New Full Paper
Submission Number: 4