Gradient-based training of Gaussian Mixture Models for High-Dimensional Streaming Data

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: Gaussian Mixture Models, Stochastic Gradient Descent, Unsupervised Representation Learning, Continual Learning
Abstract: We present an approach for efficiently training Gaussian Mixture Models (GMMs) by SGD on non-stationary, high-dimensional streaming data. Our training scheme does not require data-driven parameter initialization (e.g., by k-means) and can process high-dimensional samples without numerical problems. Furthermore, the approach allows mini-batch sizes as low as 1, which are typical for streaming-data settings, and can react and adapt to changes in data statistics (concept drift/shift) without catastrophic forgetting. Major problems in such streaming-data settings are undesirable local optima during early training phases and numerical instabilities due to high data dimensionalities. We introduce an adaptive annealing procedure to address the first problem, whereas numerical instabilities are eliminated by using an exponential-free approximation to the standard GMM log-likelihood. Experiments on a variety of visual and non-visual benchmarks show that our SGD approach can be trained entirely without, for instance, k-means-based centroid initialization, and compares favorably to sEM, an online variant of EM.
One-sentence Summary: We present a method to train Gaussian Mixture Models by SGD that, unlike EM, requires no prior k-means initialization and is thus feasible for streaming data.
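To make the approach described in the abstract more concrete, below is a minimal, hypothetical sketch (not the authors' code) of SGD training of a diagonal-covariance GMM on single-sample streams, using a max-component approximation as one possible way to obtain an exponential-free objective; whether this matches the paper's exact formulation is an assumption. It assumes PyTorch, and the component count, data dimensionality, learning rate, and the omission of the adaptive annealing procedure are illustrative simplifications.

```python
# Illustrative sketch only: SGD-trained diagonal GMM with a max-component
# (exponential-free over components) log-likelihood approximation.
# Hyperparameters and the missing annealing schedule are assumptions,
# not values from the paper.
import torch


class MaxComponentGMM(torch.nn.Module):
    def __init__(self, n_components: int, dim: int):
        super().__init__()
        self.log_pi = torch.nn.Parameter(torch.zeros(n_components))          # mixture logits
        self.mu = torch.nn.Parameter(torch.randn(n_components, dim) * 0.1)   # centroids
        self.log_sigma = torch.nn.Parameter(torch.zeros(n_components, dim))  # log std-devs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-component diagonal-Gaussian log-densities (up to an additive constant),
        # shape (batch, K).
        sigma = self.log_sigma.exp()
        z = (x.unsqueeze(1) - self.mu) / sigma                    # (batch, K, dim)
        log_n = -0.5 * (z ** 2).sum(-1) - self.log_sigma.sum(-1)  # (batch, K)
        log_w = torch.log_softmax(self.log_pi, dim=0)
        # Max-component approximation: take max_k instead of log-sum-exp over k,
        # so no exponentiation of large negative log-densities is required.
        return (log_w + log_n).max(dim=1).values                  # (batch,)


gmm = MaxComponentGMM(n_components=8, dim=784)
opt = torch.optim.SGD(gmm.parameters(), lr=0.01)

for x in torch.randn(1000, 1, 784):   # simulated stream, mini-batch size 1
    loss = -gmm(x).mean()             # maximize the approximate log-likelihood
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The design intuition behind such an approximation is that replacing the log-sum-exp over components with a max keeps the whole objective in log-space, which is where numerical problems for high-dimensional inputs typically arise when densities are exponentiated.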
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=Me_LVQeIh