Balanced Clustering in Reduced Dimensions

20 Sept 2025 (modified: 02 Feb 2026), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Clustering, Manifold Learning, Dimensionality Reduction
Abstract: High-dimensional data often require embedding into lower-dimensional spaces to preserve essential features and structures, which is critical for large-scale analysis. Existing approaches typically treat embedding and clustering as joint optimization tasks but fail to integrate them within a unified framework, limiting clustering performance. Moreover, the interplay between labels and manifold structure is frequently overlooked. To address these challenges, we propose a low-dimensional manifold clustering method that integrates K-means with manifold learning. To mitigate inaccuracies in initial cluster labels, we introduce neighborhood constraints that promote intra-class compactness and inter-class separation, thereby improving label reliability. These refined labels are then used to construct a manifold representation, which in turn enhances clustering in a self-supervised loop that enforces consistency between structure and labels. Notably, we show that maximizing the Schatten-p norm naturally preserves class balance, and we provide a theoretical justification for this property. Extensive experiments on multiple datasets demonstrate the effectiveness and robustness of our approach.
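The class-balance claim can be illustrated with a small numerical sketch. The abstract does not say which matrix the Schatten-p norm is applied to or which range of p is used, so the following is an assumption: the norm is taken over a one-hot cluster indicator matrix, with 0 < p < 2. Under that assumption the singular values are the square roots of the cluster sizes, and the norm is largest when the sizes are equal. The helper names `schatten_p_norm` and `one_hot` are hypothetical and are not taken from the paper.

```python
import numpy as np


def schatten_p_norm(matrix, p):
    """Schatten-p norm: the l_p norm of the singular values of `matrix`."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sum(singular_values ** p) ** (1.0 / p))


def one_hot(labels, num_clusters):
    """Build an n x c indicator matrix with a single 1 per row (assumed label format)."""
    labels = np.asarray(labels)
    indicator = np.zeros((len(labels), num_clusters))
    indicator[np.arange(len(labels)), labels] = 1.0
    return indicator


# Two partitions of 12 points into 3 clusters.
balanced = one_hot(np.repeat([0, 1, 2], 4), 3)          # cluster sizes 4, 4, 4
skewed = one_hot(np.repeat([0, 1, 2], [10, 1, 1]), 3)   # cluster sizes 10, 1, 1

# For a one-hot matrix Y, Y^T Y = diag(n_1, ..., n_c), so the singular values
# are sqrt(n_k). With p = 1 the norm is sum_k sqrt(n_k), which is concave in
# the sizes and therefore favors the balanced partition.
norm_balanced = schatten_p_norm(balanced, p=1.0)
norm_skewed = schatten_p_norm(skewed, p=1.0)
print(norm_balanced > norm_skewed)

# With p = 2 the norm reduces to the Frobenius norm sqrt(n), which is the
# same for every partition, so it carries no balance preference.
print(np.isclose(schatten_p_norm(balanced, 2.0), schatten_p_norm(skewed, 2.0)))
```

In this toy setting the preference for balance comes from the concavity of n_k^(p/2) when p < 2. Whether the paper's theoretical justification follows the same argument is not stated in the abstract.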
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 23298