Workflow-Guided Response Generation for Task-Oriented Dialogue

Anonymous

16 Feb 2024 | ACL ARR 2024 February Blind Submission | Readers: Everyone

Abstract: Task-oriented dialogue (TOD) systems aim to achieve specific goals through interactive dialogue. Such tasks usually involve following specific workflows, i.e., executing a sequence of actions in a particular order. While prior work has focused on supervised learning methods that condition on past actions, these methods do not explicitly optimize for compliance with a desired workflow. In this paper, we propose a novel framework based on reinforcement learning (RL) to generate dialogue responses that are aligned with a given workflow. Our framework consists of ComplianceReward, a metric designed to evaluate how well a generated response executes the specified action, combined with an RL optimization process that utilizes an interactive sampling technique. We evaluate our approach on two TOD datasets, the Action-Based Conversations Dataset (ABCD) (Chen et al., 2021) and MultiWOZ 2.2 (Zang et al., 2020), on a range of automated and human evaluation metrics. Our findings indicate that our RL-based framework outperforms baselines and is effective at generating responses that both comply with the intended workflows and are expressed in a natural and fluent manner.

Paper Type: long

Research Area: Dialogue and Interactive Systems

Contribution Types: NLP engineering experiment

Languages Studied: English
