Improved Algorithms for Properly Learning Mixture of Gaussians

Published: 01 Jan 2018, Last Modified: 12 May 2023, NCTCS 2018
Abstract: We study the problem of learning a Gaussian Mixture Model (GMM) in one dimension. Given sample access to a mixture f of k Gaussians and an accuracy parameter $$\epsilon > 0$$, our algorithm takes $$\tilde{O}(\frac{k}{\epsilon^5})$$ samples, runs in polynomial time, and outputs a mixture g of at most $$\tilde{O}(\min\{\frac{k^2}{\epsilon^2}, \frac{k}{\epsilon^3}\})$$ Gaussians such that the total variation distance between f and g is at most $$\epsilon$$. This improves the previous result of [4], which uses $$O(\frac{k^2}{\epsilon^6})$$ samples and outputs a mixture of $$O(\frac{k}{\epsilon^3})$$ Gaussians. Our algorithm uses an LP rounding technique to find a sparse solution of a linear program. Our main technical contribution is a non-trivial inequality for Gaussians, which may be of independent interest. We also consider the problem of properly learning a mixture of two Gaussians. We show how to reduce this learning task to the closest pair problem in the $$L_{\infty}$$-norm. The resulting algorithm takes $$\tilde{O}(\frac{1}{\epsilon^2})$$ samples and runs in time $$\tilde{O}(\frac{1}{\epsilon^{4.001}})$$, improving the previous result of [7], which uses $$\tilde{O}(\frac{1}{\epsilon^2})$$ samples and runs in time $$\tilde{O}(\frac{1}{\epsilon^5})$$.
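To make the LP-based step of the abstract concrete, below is a minimal sketch (not the paper's actual algorithm) of the general pipeline it describes: approximate the empirical CDF of one-dimensional samples by a convex combination of candidate Gaussians chosen from a grid, with the weights found by a linear program. The candidate (mean, std) grid, the evaluation points, and the final top-weight truncation are hypothetical stand-ins; the paper's LP rounding step, not shown here, is what actually guarantees a sparse mixture with the stated size bound.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Unknown target: a 1-D mixture of Gaussians (known here only to generate data).
true_weights = np.array([0.4, 0.6])
true_means = np.array([-2.0, 1.5])
true_stds = np.array([0.7, 1.0])
n_samples = 5000
comp = rng.choice(len(true_weights), size=n_samples, p=true_weights)
samples = rng.normal(true_means[comp], true_stds[comp])

# Hypothetical candidate set: Gaussians on a coarse (mean, std) grid.
cand_means = np.linspace(samples.min(), samples.max(), 25)
cand_stds = np.geomspace(0.3, 3.0, 8)
candidates = [(m, s) for m in cand_means for s in cand_stds]

# Evaluation points where the mixture CDF should match the empirical CDF.
eval_pts = np.quantile(samples, np.linspace(0.01, 0.99, 60))
sorted_samples = np.sort(samples)
ecdf = np.searchsorted(sorted_samples, eval_pts, side="right") / n_samples

# A[i, j] = CDF of candidate j at evaluation point i.
A = np.column_stack([norm.cdf(eval_pts, loc=m, scale=s) for (m, s) in candidates])
n_cand = A.shape[1]

# LP: minimize t  subject to  |A w - ecdf| <= t,  sum(w) = 1,  w >= 0.
c = np.zeros(n_cand + 1)
c[-1] = 1.0
ones = np.ones((len(eval_pts), 1))
A_ub = np.vstack([np.hstack([A, -ones]), np.hstack([-A, -ones])])
b_ub = np.concatenate([ecdf, -ecdf])
A_eq = np.hstack([np.ones((1, n_cand)), np.zeros((1, 1))])
b_eq = np.array([1.0])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (n_cand + 1), method="highs")
w = res.x[:n_cand]

# Crude sparsification: keep the heaviest components and renormalize.
# (This truncation is only an illustration of the goal of the rounding step.)
keep = np.argsort(w)[::-1][:5]
sparse_w = w[keep] / w[keep].sum()
for wt, idx in zip(sparse_w, keep):
    m, s = candidates[idx]
    print(f"weight {wt:.3f}  mean {m:+.2f}  std {s:.2f}")
```

The L-infinity objective in this sketch mirrors the norm mentioned for the two-Gaussian reduction, but the choice of distance, discretization, and rounding in the paper itself is not specified by the abstract.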