ADC: Advanced document clustering using contextualized representations

Published: 2019, Last Modified: 01 Oct 2024 | Expert Syst. Appl. 2019 | CC BY-SA 4.0
Abstract: Highlights
• A document clustering framework that leverages contextualized vectors is proposed.
• Informative representations for documents are extracted from pre-trained models.
• A partial optimization and centroid update is proposed in the clustering module.
• The proposed method outperforms the baselines on several clustering datasets.
• The effects of the clustering method and the embeddings are explored in various experiments.