Improving Neural Topic Models by Contrastive Learning with BERT

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: We present a general plug-and-play contrastive learning framework that improves existing neural topic models (NTMs) by incorporating knowledge distilled from pre-trained language models. Recent NTMs have been applied to many tasks and have shown promising improvements in text analysis. However, they mainly focus on word co-occurrences and are typically optimized with a likelihood-based objective, which can lead to suboptimal topic coherence and document representations. To overcome this bottleneck, we introduce an additional contrastive loss that pushes the topical representation of a document learned by an NTM close to the semantic representation of the same document obtained from a pre-trained language model. In this way, the prior knowledge of the pre-trained language model enriches the contextual information of the target corpus for NTMs. Comprehensive experiments show that the proposed framework achieves state-of-the-art performance. Importantly, our framework is a general approach that can improve most existing NTMs.
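
For concreteness, the sketch below shows one way such a contrastive alignment term could be added on top of an existing NTM objective. It is a minimal sketch only: the projection of the document-topic vector and the PLM embedding into a common space, the InfoNCE-style use of in-batch negatives, and names such as contrastive_alignment_loss and temperature are illustrative assumptions, not necessarily the exact formulation used in the paper.

import torch
import torch.nn.functional as F

def contrastive_alignment_loss(z_topic, z_plm, temperature=0.07):
    # z_topic: (B, d) document-topic representations from the NTM, already
    #          projected to a shared d-dimensional space (assumed upstream).
    # z_plm:   (B, d) document embeddings from a pre-trained language model
    #          (e.g. BERT), projected to the same space.
    z_topic = F.normalize(z_topic, dim=-1)
    z_plm = F.normalize(z_plm, dim=-1)
    # Cosine-similarity matrix between every topic view and every PLM view.
    logits = z_topic @ z_plm.t() / temperature            # (B, B)
    targets = torch.arange(z_topic.size(0), device=z_topic.device)
    # Each document is the positive for its own PLM embedding; the other
    # documents in the batch act as negatives. Symmetrize over both views.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

In a plug-and-play setting, this term would simply be weighted and added to the NTM's original likelihood-based loss, e.g. total_loss = elbo + lam * contrastive_alignment_loss(proj_topic(theta), proj_plm(bert_emb)), where proj_topic, proj_plm, and the weight lam are again hypothetical names for illustration.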