Gibbs Sampling with Simulated Annealing K-Means for Mixture Regression

17 Sept 2025 (modified: 02 Dec 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Mixture model, Multivariate linear regression, K-means, Gibbs sampling, Simulated annealing.
TL;DR: To overcome the challenge of local optima in mixture regression, we propose a simulated annealing K-means algorithm and provide the first theoretical guarantees of its convergence to the global minimum, without requiring specific initializations.
Abstract: Fitting the Mixture of Multivariate Linear Regressions model (MMLR) is a fundamental task in the analysis of heterogeneous data, yet standard methods such as the EM and K-means algorithms are hindered by convergence to local optima and the NP-hard nature of the underlying optimization problem. To address this challenge, we propose a Gibbs sampling algorithm with simulated annealing built on K-means clustering. By combining the K-means framework with Gibbs sampling and a simulated annealing schedule, the approach is provably robust to initialization and avoids poor local minima. The primary contribution of this work is a comprehensive set of theoretical guarantees. First, we provide the first non-asymptotic guarantees on the algorithm's convergence to the global minimum of the Within-Cluster Sum of Squares (WCSS) objective, with explicit bounds on its rate and probability of convergence. Second, building on this global optimum, we establish a rigorous upper bound on the estimation error of the regression coefficients and an asymptotic lower bound on classification accuracy. Numerical experiments validate the superior performance of our method. This work presents a theoretically grounded and computationally practical framework for estimation and clustering in mixture regression models.
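Since the page carries no pseudocode, the following is a minimal Python sketch of the kind of procedure the abstract describes, not the authors' implementation: Gibbs-style resampling of cluster labels from a Boltzmann distribution over squared regression residuals under a cooling schedule, alternated with per-cluster least-squares refits. All names and hyperparameters (`K`, `T0`, `cooling`, `n_iters`) are illustrative assumptions, and the response is taken as scalar for brevity; the multivariate case would replace the squared residual with a residual norm.

```python
# Sketch of simulated-annealing Gibbs sampling for K-means-style
# mixture-of-linear-regressions clustering. Illustrative only.
import numpy as np

def sa_gibbs_mixture_regression(X, y, K, n_iters=200, T0=1.0, cooling=0.97, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(0, K, size=n)   # arbitrary initialization
    betas = np.zeros((K, d))
    T = T0
    for _ in range(n_iters):
        # Refit step: least-squares regression within each cluster.
        for k in range(K):
            idx = labels == k
            if idx.sum() >= d:            # need enough points to fit
                betas[k], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        # Gibbs step: resample each label from a Boltzmann distribution
        # over squared residuals at the current temperature.
        resid2 = (y[:, None] - X @ betas.T) ** 2      # (n, K) costs
        logits = -resid2 / max(T, 1e-8)
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        labels = np.array([rng.choice(K, p=p) for p in probs])
        T *= cooling                      # anneal toward hard assignments
    # WCSS-style objective on regression residuals at the final state.
    wcss = sum(((y[labels == k] - X[labels == k] @ betas[k]) ** 2).sum()
               for k in range(K))
    return labels, betas, wcss
```

As the temperature decays toward zero, the Boltzmann assignment collapses to the hard arg-min rule, so the sketch interpolates between stochastic exploration early on and standard K-means-style updates at the end, which is the mechanism the abstract credits for escaping poor local minima.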
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 9549