Learning a Single Index Model from Anisotropic Data with Vanilla Stochastic Gradient Descent

Published: 22 Jan 2025, Last Modified: 10 Mar 2025 · AISTATS 2025 Poster · CC BY 4.0
Abstract: We investigate the problem of learning a Single Index Model (SIM) from anisotropic Gaussian inputs by training a neuron with vanilla Stochastic Gradient Descent (SGD). Our analysis shows that, unlike spherical SGD (commonly used in theoretical analyses, and requiring an estimate of the covariance matrix $Q \in \mathbb{R}^{d \times d}$), vanilla SGD naturally adapts to the covariance structure of the data without additional modifications. Our key theoretical contribution is a dimension-free upper bound on the sample complexity, which depends on $Q$, its alignment with the single index $w^*$, and the information exponent $k^*$. We complement this upper bound with a Correlated Statistical Query (CSQ) lower bound that matches the upper bound on average over $w^*$, although it is suboptimal in $k^*$. Finally, we validate and extend our theoretical findings through numerical simulations, demonstrating the practical effectiveness of vanilla SGD in this context.
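The setup described in the abstract can be illustrated with a minimal simulation. The sketch below is not the paper's experimental code; it assumes a particular choice of link function (ReLU, whose information exponent is $k^* = 1$), a diagonal anisotropic covariance $Q$, and a plain squared-loss SGD update, all chosen here for illustration. A single neuron $f(x) = \phi(w \cdot x)$ is trained with vanilla online SGD on inputs $x \sim \mathcal{N}(0, Q)$, with no whitening or covariance estimation, and the alignment of the learned direction with $w^*$ is measured at the end.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20

# Anisotropic covariance Q (diagonal here for simplicity; illustrative choice).
eigenvalues = np.linspace(0.5, 2.0, d)
Q_sqrt = np.diag(np.sqrt(eigenvalues))

# Hidden unit-norm index w* and ReLU link (information exponent k* = 1).
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
phi = lambda t: np.maximum(t, 0.0)

# Vanilla online SGD on the squared loss, one fresh sample per step,
# with no preprocessing of the anisotropic inputs.
w = rng.standard_normal(d) / np.sqrt(d)
lr = 0.01
for _ in range(50_000):
    x = Q_sqrt @ rng.standard_normal(d)   # x ~ N(0, Q)
    y = phi(w_star @ x)                   # noiseless SIM label
    pre = w @ x
    # (sub)gradient of 0.5 * (phi(pre) - y)^2 w.r.t. w; phi'(pre) = 1[pre > 0]
    grad = (phi(pre) - y) * (pre > 0) * x
    w -= lr * grad

# Cosine alignment between the learned direction and w*.
alignment = (w @ w_star) / np.linalg.norm(w)
print(f"alignment with w*: {alignment:.3f}")
```

Under this setup the alignment typically approaches 1, illustrating the abstract's claim that vanilla SGD adapts to the covariance structure without the covariance-matrix estimation that spherical SGD would require.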
Submission Number: 426