Abstract: With the rapid growth of geo-tagged social media data, it has become feasible to explore topics across different areas through text mining and geographical visualization. However, the visual elements of social media data often overlap in the map view, which obscures both the semantic features and their geographical distribution. It is therefore important to reduce the visual clutter of large-scale social media data and to enhance the visibility of semantic features across local areas. In this paper, we use a doc2vec model to transform geo-tagged social media data into high-dimensional vectors, so that semantic correlation can be characterized in a dimensionality-reduced space. To reduce visual clutter in the geographical visualization, we propose a dual-objective blue noise sampling model that selects a subset of the social media data while retaining both the semantic correlation and the spatial distribution of the full dataset. A rich set of visual designs, such as a heatmap, a word cloud, and a text stream, enables users to evaluate the sampled results from multiple perspectives and to explore how semantic features change across areas. The effectiveness and validity of the proposed visualization system are further demonstrated through case studies and expert reviews.