Gradient-based training of Gaussian Mixture Models in High-Dimensional SpacesDownload PDF

25 Sep 2019 (modified: 24 Dec 2019)ICLR 2020 Conference Blind SubmissionReaders: Everyone
  • Original Pdf: pdf
  • TL;DR: We present a stochastic gradient descent algorithm for training GMMs in high-dimensional spaces, which performs similarly to the traditional EM procedure but which is much more memory-efficient.
  • Abstract: We present an approach for efficiently training Gaussian Mixture Models (GMMs) with Stochastic Gradient Descent (SGD) on large amounts of high-dimensional data (e.g., images). In such a scenario, SGD is strongly superior in terms of execution time and memory usage, although it is conceptually more complex than the traditional Expectation-Maximization (EM) algorithm. For enabling SGD training, we propose three novel ideas: First, we show that minimizing an upper bound to the GMM log likelihood instead of the full one is feasible and numerically much more stable way in high-dimensional spaces. Secondly, we propose a new regularizer that prevents SGD from converging to pathological local minima. And lastly, we present a simple method for enforcing the constraints inherent to GMM training when using SGD. We also propose an SGD-compatible simplification to the full GMM model based on local principal directions, which avoids excessive memory use in high-dimensional spaces due to quadratic growth of covariance matrices. Experiments on several standard image datasets show the validity of our approach, and we provide a publicly available TensorFlow implementation.
  • Code:
  • Keywords: GMM, SGD
8 Replies