Efficient Testing for Correlation Clustering: Improved Algorithms and Optimal Bounds

ICLR 2026 Conference Submission21003 Authors

19 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | Readers: Everyone | CC BY 4.0

Keywords: Correlation Clustering, Structural Balance, Property Testing

Abstract: Correlation clustering is an important unsupervised learning problem with broad applications. In this problem, we are given a labeled complete graph $G=(V,E^+ \cup E^-)$, and the optimal clustering is defined as a partition of the vertices that minimizes the number of $+$ edges between clusters plus the number of $-$ edges within clusters. We investigate efficient algorithms to test the \emph{cost} of correlation clustering: we want to decide whether the graph can be perfectly clustered (with $0$ cost) or is far from admitting any perfect clustering. The problem has attracted significant attention, motivated by modern large-scale applications, and the state-of-the-art results use $\widetilde{O}({1}/{\varepsilon^7})$ queries and time (up to log factors) to decide whether a graph is perfectly clusterable or requires flipping the labels of $\varepsilon {\binom n 2}$ edges to become clusterable. In this paper, we improve this bound significantly by designing an algorithm that uses ${O}({1}/{\varepsilon^2})$ queries and time. Furthermore, we derive the first algorithm that tests the cost in the special setting of correlation clustering with $k$ clusters, using ${O}(1/{\varepsilon^4})$ queries and time for constant $k$. Finally, for the special case of $k=2$, which corresponds to the strong structural balance problem in social networks, we obtain tight bounds of $\Theta({1}/{\varepsilon})$ queries, the first \emph{tight} bounds for these problems. We conduct experiments on simulated and real-world datasets, and the empirical results demonstrate the advantages of our algorithms.
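To make the cost definition concrete, the following is a minimal Python sketch, not the algorithm of this submission. It counts the disagreements of a given partition and runs a naive sampling check based on a standard fact: a complete signed graph has a zero-cost clustering exactly when it contains no triangle with two $+$ edges and one $-$ edge. All names (`planted_labels`, `clustering_cost`, `sample_bad_triangle`) and the dictionary representation of the graph are assumptions made for illustration; the number of samples is a free parameter and no query-complexity guarantee is claimed for it.

```python
import itertools
import random


def planted_labels(clusters):
    """Build labels consistent with `clusters` (hypothetical helper):
    '+' (+1) inside a cluster, '-' (-1) across clusters."""
    return {
        frozenset((u, v)): (1 if clusters[u] == clusters[v] else -1)
        for u, v in itertools.combinations(sorted(clusters), 2)
    }


def clustering_cost(labels, clusters):
    """Count '+' edges between clusters plus '-' edges within clusters."""
    cost = 0
    for pair, sign in labels.items():
        u, v = tuple(pair)
        same = clusters[u] == clusters[v]
        if (sign > 0 and not same) or (sign < 0 and same):
            cost += 1
    return cost


def is_bad_triangle(labels, u, v, w):
    """True if the triangle has exactly two '+' edges and one '-' edge."""
    signs = [labels[frozenset(p)] for p in ((u, v), (v, w), (u, w))]
    return signs.count(1) == 2


def sample_bad_triangle(vertices, labels, num_samples, rng):
    """Naive check: sample random triples and return the first bad triangle.

    Returning None does not prove clusterability; it only means that no
    witness was found among the sampled triples.
    """
    for _ in range(num_samples):
        u, v, w = rng.sample(vertices, 3)
        if is_bad_triangle(labels, u, v, w):
            return (u, v, w)
    return None
```

For example, with two planted clusters `{0, 1, 2}` and `{3, 4, 5}`, `clustering_cost` of the planted partition is 0 and no bad triangle exists. Flipping the label of the edge between vertices 0 and 1 raises the cost of that partition to 1 and makes the triangle on vertices 0, 1, 2 a bad triangle.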

Primary Area: learning theory

Submission Number: 21003
