Autoencoding Documents for Topic Modeling with L-2 Sparsity Regularization

Anonymous

Oct 22, 2018 NIPS 2018 Workshop IRASL Blind Submission readers: everyone
  • Abstract: We propose a novel yet simple neural network architecture for topic modelling. The method trains an autoencoder in which the bottleneck represents the space of topic distributions and the decoder outputs represent the space of word distributions over the topics. An auxiliary decoder is used to prevent mode collapse. An effective topic modelling method needs sparse topic and word distributions, and there is a trade-off between the sparsity levels of the two. Our model implements this property through L-2 regularization, with the model hyperparameters controlling the trade-off. Experiments on the “New York Times” and “20 Newsgroups” datasets show that our model achieves competitive results compared to state-of-the-art deep models for topic modelling, despite its simple architecture and training procedure.
  • TL;DR: A deep model for topic modelling
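
The architecture described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give layer sizes, the form of the auxiliary decoder, or where exactly the L-2 penalty is applied. The sketch assumes a single linear encoder and decoder, a softmax bottleneck as the topic distribution, a softmax over each decoder row as the per-topic word distribution, and a plain squared-norm penalty on both weight matrices. The names `topic_autoencoder_loss`, `W_enc`, `W_dec`, `lam_topic`, and `lam_word` are hypothetical, and the auxiliary decoder is omitted.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def topic_autoencoder_loss(x, W_enc, W_dec, lam_topic=0.1, lam_word=0.1):
    """Forward pass and loss for a toy topic autoencoder (illustrative only).

    x     : (batch, vocab) bag-of-words counts
    W_enc : (vocab, n_topics) encoder weights (assumed linear)
    W_dec : (n_topics, vocab) decoder weights (assumed linear)
    """
    # Normalize counts so that document length does not dominate the encoder.
    x_norm = x / np.maximum(x.sum(axis=1, keepdims=True), 1.0)
    # Bottleneck: one distribution over topics per document.
    theta = softmax(x_norm @ W_enc, axis=1)
    # Decoder: one distribution over words per topic.
    beta = softmax(W_dec, axis=1)
    # Reconstruction: each document is a mixture of topic-word distributions.
    x_hat = theta @ beta
    recon = -(x * np.log(x_hat + 1e-12)).sum(axis=1).mean()
    # Assumed form of the L-2 term: squared norms of both weight matrices,
    # with separate coefficients standing in for the trade-off hyperparameters.
    penalty = lam_topic * (W_enc ** 2).sum() + lam_word * (W_dec ** 2).sum()
    return theta, beta, recon + penalty
```

In this sketch the two coefficients play the role of the hyperparameters that the abstract says control the trade-off between topic and word sparsity; how the actual model maps the penalty onto sparsity is not stated in the abstract.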