TL;DR: You can learn Gaussians with strictly fewer samples than the worst case requires, provided you start with a good enough guess for the parameters.
Abstract: We revisit the problem of distribution learning within the framework of learning-augmented algorithms.
In this setting, we explore the scenario where a probability distribution is provided as potentially inaccurate advice on the true, unknown distribution. Our objective is to develop learning algorithms whose sample complexity decreases as the quality of the advice improves, thereby surpassing standard learning lower bounds when the advice is sufficiently accurate. Specifically, we demonstrate that this outcome is achievable for the problem of learning a multivariate Gaussian distribution $N(\mu, \Sigma)$ in the PAC learning setting. Classically, in the advice-free setting, $\widetilde{\Theta}(d^2/\varepsilon^2)$ samples are sufficient, and in the worst case necessary, to learn $d$-dimensional Gaussians up to TV distance $\varepsilon$ with constant probability. When we are additionally given a parameter $\widetilde{\Sigma}$ as advice, we show that $\widetilde{\mathcal{O}}(d^{2-\beta}/\varepsilon^2)$ samples suffice whenever $|| \widetilde{\Sigma}^{-1/2} \Sigma \widetilde{\Sigma}^{-1/2} - I_d ||_1 \leq \varepsilon d^{1-\beta}$ (where $||\cdot||_1$ denotes the entrywise $\ell_1$ norm) for any $\beta > 0$, yielding a polynomial improvement over the advice-free setting.
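To make the advice-quality condition concrete, the following is a minimal sketch (not the authors' implementation) that evaluates $|| \widetilde{\Sigma}^{-1/2} \Sigma \widetilde{\Sigma}^{-1/2} - I_d ||_1$ for a known pair of matrices and compares it with $\varepsilon d^{1-\beta}$. The function names `advice_quality` and `advice_is_good` are hypothetical, and the sketch assumes access to the true $\Sigma$, which a learner does not have; it only illustrates what the condition measures. Constants and logarithmic factors hidden by $\widetilde{\mathcal{O}}$ are ignored.

```python
import numpy as np

def advice_quality(sigma, sigma_tilde):
    """Entrywise l1 norm of sigma_tilde^{-1/2} @ sigma @ sigma_tilde^{-1/2} - I.

    Hypothetical helper: assumes both inputs are symmetric positive definite.
    """
    # Inverse square root of the advice via its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(sigma_tilde)
    inv_sqrt = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    d = sigma.shape[0]
    residual = inv_sqrt @ sigma @ inv_sqrt - np.eye(d)
    # Entrywise l1 norm: sum of absolute values of all entries.
    return float(np.abs(residual).sum())

def advice_is_good(quality, d, eps, beta):
    """Check the stated condition: quality <= eps * d**(1 - beta)."""
    return quality <= eps * d ** (1.0 - beta)

if __name__ == "__main__":
    d, eps, beta = 4, 0.1, 0.5
    sigma_tilde = np.eye(d)                  # advice
    sigma = np.diag([1.1, 1.0, 1.0, 1.0])    # "true" covariance (illustrative)
    q = advice_quality(sigma, sigma_tilde)
    print(q, advice_is_good(q, d, eps, beta))
```

In this toy instance the advice differs from the truth in a single diagonal entry, so the entrywise norm is small and the condition holds for $\beta = 0.5$.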
Lay Summary: Estimating the mean and covariance of a multivariate Gaussian distribution is a well-known problem in machine learning. In the worst case, it requires a number of samples that grows quadratically with the number of variates/features. We study a new setting where, in addition to data samples, we are given imperfect advice in the form of predictions/guesses for the mean and covariance. These predictions may come from prior models or expert knowledge, but we have no guarantees about their accuracy.
We design an algorithm that first tests whether the advice is reliable. If it is, we use it to reduce the number of samples needed, applying tools from convex optimization. If it isn’t, we default to standard estimators. Our method is always correct and provably uses fewer samples when the advice is good. We also show that the trade-off between advice quality and sample efficiency is close to the best possible.
Link To Code: https://github.com/philips-george/gaussian-learning-with-advice
Primary Area: Theory->Probabilistic Methods
Keywords: learning-augmented algorithms, multivariate gaussian learning, sample complexity
Submission Number: 550