Improving Text Models with Latent Feature Vector Representations

Huaijin Peng, Jing Wang, Qiwei Shen

Published: 2019, Last Modified: 13 Nov 2024, Venue: ICSC 2019, License: CC BY-SA 4.0

Abstract: Probabilistic topic models are widely used to discover latent topics in a collection of documents, and latent feature vector representations have achieved high performance in many NLP tasks. In this paper, we first construct document topic vector representations by combining LDA and Topic2Vec, and then build document representations from these topic vectors together with the document vectors obtained through Doc2Vec training. Experimental results show that the new model yields significant improvements on topic consistency and document classification tasks.
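The abstract does not say how the topic vectors and the Doc2Vec document vectors are merged into one representation. As a purely illustrative sketch, the following assumes one plausible scheme: average the topic vectors weighted by the document's topic proportions, then concatenate the result with the document vector. The function names (`topic_weighted_vector`, `combine`) and the toy values are hypothetical; in practice the topic proportions would come from LDA, the topic vectors from Topic2Vec, and the document vector from Doc2Vec.

```python
from typing import List, Sequence


def topic_weighted_vector(theta: Sequence[float],
                          topic_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the topic vectors, weighted by the document's topic proportions."""
    if len(theta) != len(topic_vectors):
        raise ValueError("need exactly one topic vector per topic proportion")
    dim = len(topic_vectors[0])
    mixed = [0.0] * dim
    for weight, vec in zip(theta, topic_vectors):
        for i, value in enumerate(vec):
            mixed[i] += weight * value
    return mixed


def combine(doc_vector: Sequence[float],
            theta: Sequence[float],
            topic_vectors: Sequence[Sequence[float]]) -> List[float]:
    """ASSUMPTION: concatenate the document vector with the topic-weighted vector.

    The source does not specify the combination; concatenation is only one option.
    """
    return list(doc_vector) + topic_weighted_vector(theta, topic_vectors)


# Toy inputs standing in for trained model outputs (hypothetical values).
theta = [0.5, 0.5]                         # document-topic proportions (LDA)
topic_vectors = [[1.0, 0.0], [0.0, 1.0]]   # one vector per topic (Topic2Vec)
doc_vector = [0.25, 0.75, 0.5]             # document embedding (Doc2Vec)

representation = combine(doc_vector, theta, topic_vectors)
print(representation)
```

The combined vector could then be passed to any standard classifier for the document classification task; other merging schemes, such as summation in a shared space, are equally consistent with the abstract.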