PGeoTopic: A Distributed Solution for Mining Geographical Topic Models

IEEE Trans. Knowl. Data Eng., 2022 (modified: 28 Jan 2023)
Abstract: Geographical topic models have been used to mine geo-tagged documents for topical regions and geographical topics, with applications in recommendation, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring efficiency. However, training geographical topic models is very expensive: it may take days to train even a small-scale model on a collection of documents with millions of word tokens. In this paper, we propose the first distributed solution, called PGeoTopic, for training geographical topic models. The proposed solution comprises several novel technical components that increase parallelism, reduce memory requirements, and reduce communication cost. Experiments show that our approach for mining geographical topic models scales with both model size and data size on distributed systems.