Microblog Sentiment Topic Model

2016 (modified: 14 Oct 2021), ICDM Workshops 2016
Abstract: With the prevalence of social media such as Twitter, short texts like microblogs have become an important form of text on the Internet. In contrast to other media, such as newspapers, social media posts usually contain fewer words and concentrate on a much narrower selection of topics. For these reasons, traditional LDA-based sentiment and topic modeling techniques generally do not work well on social media data. Another characteristic feature of this data is the use of special meta tokens, such as hashtags, which carry semantic meanings that ordinary words do not capture. In recent years, many topic modeling techniques have been proposed for social media data, but most of this work ignores the special nature of tokens such as hashtags and treats them as ordinary words. In this paper, we propose probabilistic graphical models to address the problem of discovering latent topics and their sentiment from social media data, mainly microblogs such as Twitter posts. We first propose MTM (Microblog Topic Model), a generative model that assumes each social media post is generated from a single topic and that models words and hashtags separately. We then propose MSTM (Microblog Sentiment Topic Model), an extension of MTM that also captures the sentiment associated with the topics. We evaluated our models on a Twitter dataset, and the experimental results show that our models outperform existing techniques.