Bridging spherical mixture distributions and word semantic knowledge for Neural Topic Modeling

Published: 01 Jan 2024 · Last Modified: 17 Apr 2025 · Expert Syst. Appl. 2024 · CC BY-SA 4.0
Abstract: Neural Topic Modeling has attracted significant attention from the Natural Language Processing community due to its black-box inference property, and has made notable progress. However, existing approaches suffer from three main drawbacks: (1) it is hard to integrate external semantic knowledge to improve topic quality; (2) the posterior collapse issue of the Variational Auto-Encoder (VAE) harms topic diversity; and (3) they cannot capture the semantic scopes of extracted topics. To address these limitations jointly, we model topics with mixtures of von Mises-Fisher distributions to incorporate word semantic knowledge, and propose the Spherical Embedded Topic (SET) model built on the Wasserstein Auto-Encoder. Experiments on three widely used text corpora show that SET produces more coherent and diverse topics, outperforming state-of-the-art approaches. Moreover, our experiments also validate SET's ability to capture the semantic scopes of topics.
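To make the core modeling idea concrete, below is a minimal sketch, not the authors' implementation, of representing a topic as a mixture of von Mises-Fisher (vMF) distributions over unit-normalized word embeddings. The function names `vmf_log_pdf` and `mixture_log_pdf` are hypothetical; the concentration parameter κ of each component can be read as the topic's semantic scope (larger κ means a tighter, more focused topic).

```python
import numpy as np
from scipy.special import ive, logsumexp  # ive: exponentially scaled Bessel I_v


def vmf_log_pdf(x, mu, kappa):
    """Log-density of a vMF distribution on the unit sphere S^{p-1}.

    x  : (N, p) array of unit-norm word embeddings
    mu : (p,) unit-norm mean direction, kappa > 0 concentration
    """
    p = mu.shape[-1]
    nu = p / 2.0 - 1.0
    # log normalizer: (p/2 - 1) log kappa - (p/2) log(2 pi) - log I_nu(kappa),
    # computed via I_nu(kappa) = ive(nu, kappa) * exp(kappa) for stability
    log_c = nu * np.log(kappa) - (p / 2.0) * np.log(2 * np.pi) \
        - (np.log(ive(nu, kappa)) + kappa)
    return log_c + kappa * (x @ mu)


def mixture_log_pdf(x, mus, kappas, weights):
    """Log-density of a K-component vMF mixture representing one topic."""
    comps = np.stack([vmf_log_pdf(x, mu, k) for mu, k in zip(mus, kappas)])  # (K, N)
    return logsumexp(comps + np.log(weights)[:, None], axis=0)  # (N,)


# Usage sketch: score a word embedding under a two-component topic.
rng = np.random.default_rng(0)
p = 50
mus = [v / np.linalg.norm(v) for v in rng.normal(size=(2, p))]
word = rng.normal(size=p)
word /= np.linalg.norm(word)
print(mixture_log_pdf(word[None, :], mus, kappas=[20.0, 50.0], weights=[0.6, 0.4]))
```

Under this view, a word belongs to a topic in proportion to its mixture density, and a topic's overall spread on the sphere, governed by its κ values, quantifies the semantic scope the abstract refers to.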