Abstract: In this paper, we present new techniques for building and analyzing robust stochastic optimization algorithms. To solve a given $d$-dimensional optimization problem, our technique generates a sequence of random $k$-dimensional subproblems, where $k<d$, and solves these instead. Unlike traditional optimization analyses, which exploit structural assumptions on the loss function such as convexity, Lipschitz continuity, or the Polyak-Łojasiewicz condition to obtain convergence rates, our analysis uses only the geometric structure of the randomness employed by the algorithm. This gives it wider applicability than traditional methods; indeed, it applies to all smooth loss functions. Moreover, our analysis identifies an important parameter of the minimizers of the loss function, which we call the gap parameter, and this parameter dictates the convergence rate of our algorithm. We experimentally study the algorithm on linear regression, logistic regression, SVMs, and neural networks. Using these experiments, we argue that the gap parameter of a minimizer also controls its robustness to noise in the training data (popularly referred to as data poisoning). We also present a modified algorithm that can control the effect of such noise on its output. Finally, we discuss how the choice of $k$ affects the convergence and robustness of our algorithm.
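To make the subproblem construction concrete, here is a minimal sketch of a generic random-subspace step: restrict the loss to a random $k$-dimensional subspace through the current iterate, approximately solve the restricted problem, and lift the solution back to $\mathbb{R}^d$. This is an illustration under our own assumptions, not the paper's exact algorithm; the function names (`random_subspace_step`, `loss`, `grad`) and the choice of inner solver are hypothetical.

```python
# Sketch of one outer iteration of a random-subspace method (illustrative only).
import numpy as np

def random_subspace_step(x, grad, k, inner_steps=50, lr=0.1, rng=None):
    """Sample a random k-dim subspace through x and approximately solve the
    restricted subproblem with plain gradient descent (an assumed inner solver)."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    # Random d x k matrix with orthonormal columns spanning the subspace.
    P, _ = np.linalg.qr(rng.standard_normal((d, k)))
    alpha = np.zeros(k)                          # subproblem variable
    for _ in range(inner_steps):
        # Chain rule: gradient of alpha -> loss(x + P alpha) is P^T grad(x + P alpha).
        alpha -= lr * P.T @ grad(x + P @ alpha)
    return x + P @ alpha                         # lift the solution back to R^d

# Toy usage on a quadratic loss in d = 100 dimensions (hypothetical example).
if __name__ == "__main__":
    d, k = 100, 5
    A = np.diag(np.linspace(1.0, 10.0, d))
    loss = lambda x: 0.5 * x @ A @ x
    grad = lambda x: A @ x
    x = np.ones(d)
    for _ in range(200):
        x = random_subspace_step(x, grad, k)
    print("final loss:", loss(x))
```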
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Eduard_Gorbunov1
Submission Number: 3076