Abstract: Recent unsupervised topic modelling approaches that apply clustering techniques to word, token or document embeddings can extract coherent topics. However, a common limitation of such approaches is that they reveal nothing about inter-topic relationships, which are essential in many real-world application domains. We present an unsupervised topic modelling method that harnesses Topological Data Analysis (TDA) to extract a topological skeleton of the manifold upon which contextualised word embeddings lie. We demonstrate that our approach, which performs on par with a recent baseline, is able to construct a network of coherent topics together with meaningful relationships between them.
Paper Type: long