Abstract: In this paper, we propose an adaptive Dirichlet Multinomial Mixture (DMM) model for clustering short texts across time slices. A hyperparameter adjustment algorithm captures the temporal dynamics automatically, and a collapsed Gibbs sampling algorithm for the extended DMM model (the e-GSDMM algorithm) infers how the topic and word distributions change across time slices. Extensive experiments on three datasets show that the proposed model is efficient and outperforms the existing GSDMM approach for short text clustering on streaming data.