Abstract: Topic models have become prevalent because topics capture the hidden semantics of documents, and the words in a document reflect the topics it contains. However, current topic models fail to provide an explicit representation of the relationships between topics. We address this problem by proposing a new topic model, Multi-Dimension Latent Dirichlet Allocation (MDLDA). Our algorithm models topics together with their mutual relationships, representing each relationship between topics as a mixture of words, in the same way that topics themselves are represented in traditional topic models. The topics extracted by our model better represent documents in several classic tasks such as document classification. We conduct a case study to illustrate the model. Quantitative experiments on different datasets show improvements in modeling ability and in performance on tasks such as clustering and classification.