An Information-theoretic Perspective of Hierarchical Clustering on Graphs

Published: 07 May 2025, Last Modified: 13 Jun 2025UAI 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: hierarchical clustering, structural entropy, approximation algorithm
TL;DR: We investigate the graph hierarchical clustering problem on graphs from an information-theoretic perspective.
Abstract: The seminal work of \citep{dasgupta2016cost} has introduced a combinatorial cost function for hierarchical graph clustering that has inspired numerous follow-up studies adopting similar combinatorial approaches. In this paper, we investigate this problem from the \emph{information-theoretic} perspective. We formulate a new cost function that is fully explainable and establish the relationship between combinatorial and information-theoretic perspectives. We present two algorithms for expander-like and well-clustered cardinality weighted graphs, respectively, and show that both of them achieve $O(1)$-approximation for our new cost function. Addressing practical needs, we consider non-binary hierarchical clustering problem, and propose a hyperparameter-free framework HCSE that recursively stratifies cluster trees through sparsity-aware partitioning, automatically determining the optimal hierarchy depth via an interpretable mechanism. Extensive experimental results demonstrate the superiority of our cost function and algorithms in binary clustering performance, hierarchy level identification, and reconstruction accuracy compared to existing approaches.
Supplementary Material: zip
Latex Source Code: zip
Code Link: https://github.com/Hardict/HCSE
Signed PMLR Licence Agreement: pdf
Readers: auai.org/UAI/2025/Conference, auai.org/UAI/2025/Conference/Area_Chairs, auai.org/UAI/2025/Conference/Reviewers, auai.org/UAI/2025/Conference/Submission398/Authors, auai.org/UAI/2025/Conference/Submission398/Reproducibility_Reviewers
Submission Number: 398
Loading