Complex-Target-Guided Open-Domain Conversation based on offline reinforcement learning


22 Sept 2022, 12:33 (modified: 26 Oct 2022, 14:03)
ICLR 2023 Conference Blind Submission
Readers: Everyone
Keywords: target-guided dialogue, offline RL
Abstract: Previous target-guided open-domain dialogue systems mostly take a single keyword as the target, which is highly limiting and cannot characterize the dialogue target well. In this paper, we introduce a new target representation model that uses a verb-noun pair to represent a complex target. To this end, we implement a new dialogue guidance procedure consisting of verb graph and noun graph construction, a dialogue encoder, a verb-noun selection model, and a response generator. Both automatic metrics and human evaluation show that our model outperforms previous target-guided dialogue systems. In addition, unlike previous target-guided dialogue systems, which use online reinforcement learning to make decisions, we integrate an offline reinforcement learning method that reduces training time while maintaining high performance.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning