Abstract: Event detection from social media is an active research problem. In this work, we investigate and compare the performance of two methods for event detection on Twitter, using Apache Storm as the stream processing infrastructure. The first method identifies uncommonly common words within blocks of tweets, and the second clusters tweets and treats a cluster as an event. Each method has its own characteristics. The uncommonly common word based method relies on bursts of words and hence is not affected by concurrency problems in a distributed environment. The clustering based method, on the other hand, performs a finer-grained analysis but is sensitive to concurrent processing. We investigate the effect of the stream processing and concurrency handling support provided by Apache Storm on event detection by these methods.