Semi-supervised Semantic Visualization for Networked Documents

2021 (modified: 13 Jan 2022), ECML/PKDD (3) 2021, Readers: Everyone
Abstract: Semantic interpretability and visual expressivity are important objectives in exploratory analysis of text. On the one hand, while some documents may have explicit categories, we could develop a better understanding of a corpus by studying its finer-grained structures, which may be latent. By inferring latent topics and discovering keywords associated with each topic, one obtains a semantic interpretation of the corpus. On the other hand, by visualizing documents, latent topics, and category labels on the same plot, one gains a bird’s eye view of the relationships among documents, topics, and various categories. Semantic visualization is a class of methods that unify both topic modeling and visualization. In this paper, we propose a novel semantic visualization model for networked documents that incorporates partial labels. We introduce coordinate-based label distribution and label-dependent topic distribution to visualize documents, topics, and labels in a semi-supervised way. We further derive three variants for singly-labeled, multi-labeled, and hierarchically-labeled documents. The focus on semi-supervision that employs variants of labeling structures is particularly novel. Experiments verify the efficacy of our model against baselines.