Learning Document Embeddings With CNNs

Shunan Zhao, Chundi Lui, Maksims Volkovs

Feb 15, 2018 (modified: Feb 15, 2018) ICLR 2018 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: This paper proposes a new model for document embedding. Existing approaches either require complex inference or use recurrent neural networks that are difficult to parallelize. We take a different route and use recent advances in language modeling to develop a convolutional neural network embedding model. This allows us to train deeper architectures that are fully parallelizable. Stacking layers together increases the receptive filed allowing each successive layer to model increasingly longer range semantic dependences within the document. Empirically we demonstrate superior results on two publicly available benchmarks. Full code will be released with the final version of this paper.
  • TL;DR: Convolutional neural network model for unsupervised document embedding.
  • Keywords: unsupervised embedding, convolutional neural network