Learn to Encode Text as Comprehensible Summary by Generative Adversarial Network


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission
  • Abstract: Auto-encoders compress input data into a latent-space representation and reconstruct the original data from that representation. The latent representation is not easily interpreted by humans. In this paper, we propose a new approach to training an auto-encoder that encodes input text into comprehensible sentences. The auto-encoder is composed of a generator and a reconstructor. The generator encodes the input text into a shorter word sequence, and the reconstructor recovers the generator's input from the generator's output. To make the generator output comprehensible to humans, a discriminator restricts the output of the generator to resemble human-written summaries. By taking the generator output as the summary of the input text, abstractive summarization can be achieved without document-summary pairs as training data. Promising results were obtained on both English and Chinese corpora.
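
The interplay of the three components named in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss functions or the model architectures, so everything below is an assumption made for illustration: the token-mismatch reconstruction loss, the `-log` adversarial term, the `adv_weight` parameter, and the three toy stand-ins (`toy_generator`, `toy_reconstructor`, `toy_discriminator`) are hypothetical and are not the authors' method.

```python
import math
from typing import Callable, List

Tokens = List[str]


def reconstruction_loss(source: Tokens, rebuilt: Tokens) -> float:
    """Fraction of positions where the rebuilt text differs from the source.

    A toy stand-in (assumption) for a sequence cross-entropy loss.
    """
    n = max(len(source), len(rebuilt))
    if n == 0:
        return 0.0
    mismatches = sum(
        1
        for i in range(n)
        if i >= len(source) or i >= len(rebuilt) or source[i] != rebuilt[i]
    )
    return mismatches / n


def generator_objective(
    source: Tokens,
    generator: Callable[[Tokens], Tokens],
    reconstructor: Callable[[Tokens, int], Tokens],
    discriminator: Callable[[Tokens], float],
    adv_weight: float = 1.0,  # hypothetical trade-off weight
) -> float:
    """Combine the two pressures on the generator described in the abstract."""
    summary = generator(source)                    # shorter word sequence
    rebuilt = reconstructor(summary, len(source))  # recover the input
    rec = reconstruction_loss(source, rebuilt)
    realism = discriminator(summary)               # score in (0, 1]
    adv = -math.log(max(realism, 1e-8))            # low when summary looks human-written
    return rec + adv_weight * adv


# Hypothetical toy components, used only to make the sketch executable.
def toy_generator(source: Tokens) -> Tokens:
    return source[:3]  # keep the first three tokens as the "summary"


def toy_reconstructor(summary: Tokens, target_len: int) -> Tokens:
    return summary + ["<unk>"] * max(0, target_len - len(summary))


def toy_discriminator(summary: Tokens) -> float:
    return 0.9 if len(summary) <= 4 else 0.1  # favors short outputs


source_text = "the cat sat on the mat".split()
loss = generator_objective(
    source_text, toy_generator, toy_reconstructor, toy_discriminator
)
```

In this sketch the reconstruction term rewards summaries that preserve enough information to rebuild the input, while the adversarial term rewards summaries the discriminator scores as human-like. No paired document-summary data appears anywhere in the objective, which mirrors the unpaired setting the abstract describes.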