On Learning Parallel Pancakes with Mostly Uniform Weights

Published: 01 May 2025, Last Modified: 18 Jun 2025. ICML 2025 Spotlight Poster. License: CC BY 4.0
TL;DR: We study the complexity of learning k-mixtures of Gaussians with shared covariance and non-trivial weights, establishing a lower bound matching existing algorithms and extending testing algorithms to handle more general weight distributions.
Abstract: We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb R^d$. This task is known to have complexity $d^{\Omega(k)}$ in full generality. To circumvent this lower bound, which is exponential in the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponentially small and that the components have the same unknown covariance. Recent work gave a $d^{O(\log(1/w_{\min}))}$-time algorithm for this class of GMMs, where $w_{\min}$ is the minimum weight. Our first main result is a Statistical Query (SQ) lower bound showing that this quasi-polynomial upper bound is essentially best possible, even for the special case of uniform weights. Specifically, we show that it is SQ-hard to distinguish between such a mixture and the standard Gaussian. We further explore how the distribution of weights affects the complexity of this task. Our second main result is a quasi-polynomial upper bound for the aforementioned testing task when most of the weights are uniform while a small fraction of the weights are potentially arbitrary.
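To make the testing task concrete, the following is a minimal Python sketch of a "parallel pancakes"-style instance: a $k$-component mixture with uniform weights and a shared covariance that is thin along one hidden direction, to be distinguished from the standard Gaussian $N(0, I_d)$. This is an illustrative assumption-laden toy, not the paper's actual hard construction; in particular, the genuinely hard instances also match the low-degree moments of $N(0,1)$ along the hidden direction, which this simple sketch does not attempt. All parameter choices (d, k, spacing, sigma_v) are hypothetical.

```python
# Illustrative sketch only: a uniform-weight "parallel pancakes"-style k-GMM
# versus the standard Gaussian N(0, I_d). Not the paper's moment-matched
# hard instance; parameters below are arbitrary choices for demonstration.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 50, 4, 2000            # dimension, number of components, sample size

# Hidden direction along which the component means are placed.
v = rng.standard_normal(d)
v /= np.linalg.norm(v)

# Means spaced along v; shared covariance that is thin ("pancake") along v
# and identity on the orthogonal complement.
spacing = 1.0
means = np.array([(i - (k - 1) / 2) * spacing for i in range(k)])
sigma_v = 0.1                    # small shared variance along v

def sample_pancakes(n):
    """Draw n points from the uniform-weight parallel-pancakes mixture."""
    labels = rng.integers(k, size=n)                  # uniform component weights
    along_v = means[labels] + np.sqrt(sigma_v) * rng.standard_normal(n)
    iso = rng.standard_normal((n, d))
    orth = iso - np.outer(iso @ v, v)                 # isotropic part orthogonal to v
    return orth + np.outer(along_v, v)

def sample_null(n):
    """Draw n points from the standard Gaussian N(0, I_d)."""
    return rng.standard_normal((n, d))

X_mix, X_null = sample_pancakes(n), sample_null(n)

# A random one-dimensional projection of the mixture looks nearly Gaussian,
# while projecting onto the (unknown) direction v reveals the k clusters --
# the intuition behind why naive low-dimensional tests fail.
u = rng.standard_normal(d); u /= np.linalg.norm(u)
print("random direction, mixture 4th moment:", np.mean((X_mix @ u) ** 4))
print("hidden direction, mixture 4th moment:", np.mean((X_mix @ v) ** 4))
print("hidden direction, null    4th moment:", np.mean((X_null @ v) ** 4))
```

In this toy version the fourth moment along $v$ already deviates from the Gaussian value of 3; the SQ lower bound concerns instances engineered so that no such low-degree statistic along any fixed direction helps, forcing the quasi-polynomial complexity in $1/w_{\min}$.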
Lay Summary: Gaussian mixtures are often used to describe data that comes from several overlapping groups. These models are widely used in fields like image processing and genetics to make sense of complex data. However, learning the underlying structure, especially when the number of groups grows, quickly becomes very hard, often taking an impractical amount of computation. To make progress, past work has studied simpler versions of the problem. For example, it might be assumed that all groups are shaped similarly and that no group is too small in size. Recent work showed that, under these assumptions, the problem can be solved faster, though still not efficiently enough for very large datasets. A natural question is whether this can be improved further. Our research shows that even when all the groups are equally sized, the problem remains hard, and the above complexity is unavoidable. We then investigate what happens when most of the groups are equally sized but a small number are allowed to be much smaller. In this case, we show that there is an algorithm whose complexity lies between the two extremes: the case where all groups are equally sized and the case where group sizes are completely arbitrary.
Primary Area: Theory->Learning Theory
Keywords: gaussian mixture models, parallel pancakes, statistical query, high-dimensional statistics
Submission Number: 14052