Distributed Structured Actor-Critic Reinforcement Learning for Universal Dialogue Management

Zhi Chen, Lu Chen, Xiaoyuan Liu, Kai Yu

IEEE ACM Trans. Audio Speech Lang. Process. 2020 (modified: 08 Nov 2021)

Abstract: Traditionally, a dialogue policy must be trained independently for each dialogue task. In this work, we aim to solve a collection of independent dialogue tasks using a single unified dialogue agent. The unified policy is trained in parallel on conversation data from all of the distributed dialogue tasks. However, there are two key challenges: (1) designing a unified dialogue model that adapts to different dialogue tasks; and (2) finding a robust reinforcement learning method that keeps the training process efficient and stable. We propose a novel structured actor-critic approach to structured deep reinforcement learning (DRL) that not only learns in parallel from the data of different dialogue tasks but also achieves stable and sample-efficient learning. We demonstrate the effectiveness of the proposed approach on 18 tasks of the PyDial benchmark. The results show that our method achieves state-of-the-art performance.
