Feb 12, 2018 (modified: Feb 13, 2018) | ICLR 2018 Workshop Submission | readers: everyone
Abstract: Training methods for sequence generators that combine GANs with policy gradients have shown good performance.
In this paper, we propose expert-based reward function training, a novel method for training sequence generators.
Unlike previous studies of sequence generation, expert-based reward function training does not use the GAN framework.
Still, our model outperforms SeqGAN and a strong baseline, RankGAN.
TL;DR: This paper aims to learn a better metric for unsupervised learning, such as text generation, and shows a significant improvement over SeqGAN.