Assigning Confidence: K-partition Ensembles

17 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0
Keywords: ensemble learning, unsupervised learning, clustering
TL;DR: We propose CAKE, a method that estimates per-point confidence in clustering using ensemble agreement and geometric consistency, enabling robust, interpretable, and unsupervised assessment of cluster reliability.
Abstract: Clustering is widely used for unsupervised structure discovery, but it offers no clear measure of how reliable each individual assignment is. While convergence or objective scores may reflect global quality, they do not indicate whether specific points are stably assigned, particularly in algorithms like $k$-means, which are sensitive to initialization and noise. This assignment-level instability can undermine clustering accuracy and robustness. Ensemble methods improve global consistency by aggregating multiple runs, but they typically lack tools for quantifying pointwise confidence. We introduce CAKE (Confidence in Assignments via K-partition Ensembles), a unified framework that evaluates each point using two complementary statistics, assignment stability and consistency of local geometric fit, measured across a clustering ensemble. These are combined into a single interpretable confidence score in $[0,1]$. Theoretical analysis shows that CAKE scores are robust to noise and reliably distinguish stable from unstable points. Empirical results on synthetic and real-world datasets demonstrate that CAKE identifies both high- and low-confidence assignments, enabling targeted filtering or prioritization that improves clustering quality.
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 9546
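
The abstract describes a per-point confidence score in $[0,1]$ built from two ensemble statistics: assignment stability and consistency of local geometric fit. The abstract does not give the formulas, so the following is only a minimal sketch of that general idea, not the authors' CAKE definition. Everything specific here is an assumption made for illustration: the function names `kmeans` and `confidence_scores`, the co-association-based stability measure, the centroid-margin fit measure, and the choice to combine the two by multiplication.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means. Returns labels and the (n, k) point-to-centroid distances."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # fancy indexing copies
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(axis=0)
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1), d

def confidence_scores(X, k, n_runs=20, seed=0):
    """Illustrative per-point confidence from a k-means ensemble (requires k >= 2).

    ASSUMPTION: these statistics and their product are stand-ins chosen for
    this sketch; they are not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(X)
    coassoc = np.zeros((n, n))  # fraction of runs in which i and j share a cluster
    fit = np.zeros(n)           # mean relative margin between the two nearest centroids
    for _ in range(n_runs):
        labels, d = kmeans(X, k, rng)
        coassoc += labels[:, None] == labels[None, :]
        d_sorted = np.sort(d, axis=1)
        a, b = d_sorted[:, 0], d_sorted[:, 1]  # own centroid, nearest other centroid
        fit += (b - a) / np.maximum(b, 1e-12)  # in [0, 1] because a <= b
    coassoc /= n_runs
    fit /= n_runs
    # Stability: how decisive point i's co-association with every other point is.
    # |2p - 1| is 1 when a pair is always together or always apart, 0 at p = 0.5.
    stability = np.abs(2.0 * coassoc - 1.0).mean(axis=1)
    return stability * fit, stability, fit

# Usage: two well-separated blobs plus one ambiguous point halfway between them.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(30, 2))
blob_b = rng.normal(loc=(10.0, 0.0), scale=0.5, size=(30, 2))
X_demo = np.vstack([blob_a, blob_b, [[5.0, 0.0]]])
scores, stability, fit = confidence_scores(X_demo, k=2)
print("midpoint score:", scores[-1], "lowest blob score:", scores[:-1].min())
```

The co-association matrix is used here because it is invariant to label permutations across runs, so no label alignment step is needed. Multiplying the two statistics means a point scores high only when it is both stably assigned and clearly closer to one centroid than to any other; other combinations would be equally plausible for a sketch like this.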