Abstract: The hierarchical structure inherent in many real-world datasets makes the modeling of such hierarchies a crucial objective in both unsupervised and supervised machine learning. While recent advancements have introduced deep architectures specifically designed for hierarchical clustering, we adopt a critical perspective on this line of research. Our findings reveal that these methods face significant limitations in scalability and performance when applied to realistic datasets. Given these findings, we present an alternative approach and introduce a lightweight method that builds on pre-trained non-hierarchical clustering models. Remarkably, our approach outperforms specialized deep models for hierarchical clustering, and it is broadly applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning. To highlight the generality of our approach, we extend its application to a supervised setting, demonstrating its ability to recover meaningful hierarchies from a pre-trained ImageNet classifier. Our results establish a practical and effective alternative to existing deep hierarchical clustering methods, with significant advantages in efficiency, scalability and performance.
Lay Summary: Many real-world problems involve organizing data into hierarchies. Consider, for instance, categorizing natural species at different taxonomic levels, such as species, genus, and family. Because this problem is so widespread, AI models designed specifically for categorizing unlabeled data in a hierarchy have been proposed. However, they currently do not perform well on large or complex datasets.
To solve this problem, we took a step back and explored an alternative strategy. There are highly effective AI models that learn to discover meaningful groupings (or clusters) in unlabeled data, though they are not able to model a hierarchy of these clusters. Notably, these models can perform very well even on large or complex datasets. Hence, we designed an efficient algorithm that takes the outputs of these models and organizes the discovered clusters into a meaningful hierarchy.
Surprisingly, this simple approach not only runs faster and scales better, but also consistently outperforms the more complex, specialized methods. It thereby makes it possible to discover meaningful hierarchies in datasets that are larger and more complex than those that existing methods for this task could handle. Finally, we show that our approach can also be valuable in settings where labels for the data are available.
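To make the idea of organizing a model's clusters into a hierarchy concrete, the following is a minimal illustrative sketch and not the authors' algorithm, which this summary does not specify. It assumes the pre-trained model supplies one row of logits per sample, and it uses a hypothetical merging rule chosen only for illustration: clusters whose soft-assignment profiles are most similar (by cosine similarity) are merged first. The names `build_hierarchy`, `cosine`, and `toy_logits` are invented for this example.

```python
import math

def softmax(row):
    """Turn one row of logits into soft cluster-assignment probabilities."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity between two membership profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)

def build_hierarchy(logits):
    """Greedily merge the two most similar groups until one root remains.

    Assumption (not from the paper): similarity is the cosine between the
    per-sample soft-membership vectors of two groups.
    Returns a list of (child_a, child_b, new_group_id) merges.
    """
    probs = [softmax(row) for row in logits]
    k = len(probs[0])
    # Each leaf group is one cluster: its profile is that cluster's
    # probability across all samples.
    groups = {c: [p[c] for p in probs] for c in range(k)}
    merges, next_id = [], k
    while len(groups) > 1:
        ids = sorted(groups)
        pairs = [(a, b) for i, a in enumerate(ids) for b in ids[i + 1:]]
        a, b = max(pairs, key=lambda ab: cosine(groups[ab[0]], groups[ab[1]]))
        # The merged group's membership is the sum of its children's.
        groups[next_id] = [x + y for x, y in zip(groups.pop(a), groups.pop(b))]
        merges.append((a, b, next_id))
        next_id += 1
    return merges

# Toy input: clusters 0/1 are often confused, as are clusters 2/3.
toy_logits = [[4, 3, 0, 0], [3, 4, 0, 0], [0, 0, 4, 2], [0, 0, 2, 4]]
merges = build_hierarchy(toy_logits)
```

On the toy input, the clusters that the model tends to confuse with each other are grouped before the two resulting groups are joined at the root. No fine-tuning of the underlying model is involved, since only its outputs are used.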
Primary Area: General Machine Learning->Clustering
Keywords: Hierarchical Clustering, Clustering, Interpretability, Unsupervised Learning
Submission Number: 11839