Expert-based reward function training: the novel method to train sequence generators

Joji Toyama; Yusuke Iwasawa; Kotaro Nakayama; Yutaka Matsuo

12 Feb 2018 (modified: 05 May 2023); ICLR 2018 Workshop Submission; Readers: Everyone

Abstract: Methods that train sequence generators with a combination of GANs and policy gradient have shown good performance. In this paper, we propose expert-based reward function training, a novel method for training sequence generators. Unlike previous studies of sequence generation, expert-based reward function training does not use the GAN framework. Still, our model outperforms SeqGAN and a strong baseline, RankGAN.

TL;DR: This paper aims to learn a better metric for unsupervised learning, such as text generation, and shows a significant improvement over SeqGAN.

Keywords: sequence generation, reinforcement learning, unsupervised learning, RNN
