Abstract: A traditional dialogue policy must be trained independently for each dialogue task. In this work, we aim to solve a collection of independent dialogue tasks using a unified dialogue agent. The unified policy is trained in parallel on conversation data from all of the distributed dialogue tasks. This raises two key challenges: (1) designing a unified dialogue model that adapts to different dialogue tasks; (2) finding a robust reinforcement learning method that keeps the training process efficient and stable. Here we propose a novel structured actor-critic approach to implement structured deep reinforcement learning (DRL), which not only learns in parallel from the data of different dialogue tasks but also achieves stable and sample-efficient learning. We demonstrate the effectiveness of the proposed approach on 18 tasks of the PyDial benchmark. The results show that our method achieves state-of-the-art performance.