Hierarchical Topic Models for Expanding Category Hierarchies

2019 (modified: 05 Nov 2021), BigComp 2019
Abstract: Category hierarchies often help users efficiently find information they need. When newly arrived documents convey new concepts that are not well covered in a hierarchy, it may be appropriate to split an existing category or insert a new category, to which related documents are relocated, in the hierarchy. However, it is often difficult to decide whether to create a new category or not (which we call the category-expansion problem). To address this problem, we propose a novel hierarchical topic model, which we call Generalized SSHLDA (G-SSHLDA). This model can insert a latent subtree at any level in an observed hierarchy. In the latent subtree, its root node can be an arbitrary observed node, and the other nodes are latent nodes. One of the nodes in the latent subtree is linked up with a deeper-level observed node. On the basis of these ideas, G-SSHLDA can automatically expand a category hierarchy that is associated with the target data collection. We demonstrate through experiments with two real-world datasets that G-SSHLDA effectively addresses the category-expansion problem.