Abstract: Rank-integrated topic models, which incorporate link structures into topic modeling through topical ranking, have shown promising performance compared with other link-combined topic models. However, existing work on rank-integrated topic modeling treats ranking as a distribution over documents for each topic, and therefore cannot integrate topical ranking with the LDA model, one of the most popular topic models. In this paper, we introduce a new method for integrating topical ranking with topic modeling and propose a general framework for topic modeling of documents with link structures. By interpreting the normalized topical ranking score vectors as topic distributions for documents, we fuse ranking into topic modeling within this framework. Under the framework, we construct two rank-integrated PLSA models and two rank-integrated LDA models, and present the corresponding learning algorithms. We apply our models to four real datasets and compare them with baseline topic models and state-of-the-art link-combined topic models in terms of generalization performance, document classification, document clustering, and topic interpretability. Experiments show that all rank-integrated topic models perform better than the baseline models, and that the rank-integrated LDA models outperform all compared models.