- TL;DR: We introduce a novel larger-context language model to simultaneously captures syntax and semantics, making it capable of generating highly interpretable sentences and paragraphs
- Abstract: To simultaneously capture syntax and semantics from a text corpus, we propose a new larger-context language model that extracts recurrent hierarchical semantic structure via a dynamic deep topic model to guide natural language generation. Moving beyond a conventional language model that ignores long-range word dependencies and sentence order, the proposed model captures not only intra-sentence word dependencies, but also temporal transitions between sentences and inter-sentence topic dependences. For inference, we develop a hybrid of stochastic-gradient MCMC and recurrent autoencoding variational Bayes. Experimental results on a variety of real-world text corpora demonstrate that the proposed model not only outperforms state-of-the-art larger-context language models, but also learns interpretable recurrent multilayer topics and generates diverse sentences and paragraphs that are syntactically correct and semantically coherent.
- Code: https://drop.me/BbR8pr
- Keywords: Bayesian deep learning, recurrent gamma belief net, larger-context language model, variational inference, sentence generation, paragraph generation
- Original Pdf: pdf