Accelerated Gradient-Free Method for Heavily Constrained Nonconvex Optimization

Wanli Shi; Hongchang Gao; Bin Gu

29 Sept 2021 (modified: 13 Feb 2023), ICLR 2022 Conference Withdrawn Submission, Readers: Everyone

Keywords: Constrained optimization, nonconvex, zeroth-order

Abstract: Zeroth-order (ZO) methods have been shown to be powerful for solving optimization problems in which an explicit expression of the gradient is difficult or infeasible to obtain. Recently, owing to the practical value of constrained problems, many ZO Frank-Wolfe and projected ZO methods have been proposed. However, many applications involve a very large number of nonconvex white/black-box constraints, which makes existing zeroth-order methods extremely inefficient (or even inapplicable), since they need to query the function values of all the constraints and project the solution onto a complicated feasible set. In this paper, to solve nonconvex problems with a large number of white/black-box constraints, we propose a doubly stochastic zeroth-order gradient method (DSZOG). Specifically, we reformulate the problem using the penalty method with a probability distribution, and sample a mini-batch of constraints to compute the stochastic zeroth/first-order gradient of the penalty function, which is used to update the parameters and the distribution alternately. To further speed up our method, we propose an accelerated doubly stochastic zeroth-order gradient method (ADSZOG) that uses an exponential moving average and an adaptive stepsize. Theoretically, we prove that DSZOG and ADSZOG converge to an $\epsilon$-stationary point of the constrained problem. We also compare the performance of our methods with several ZO methods in two applications, and the experimental results demonstrate the superiority of our methods in terms of training time and accuracy.
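The alternating scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' DSZOG implementation. It assumes a squared-hinge penalty, a distribution `p` over the constraints, uniform mini-batch sampling, a two-point Gaussian-smoothing gradient estimator, and an exponentiated-gradient step to keep `p` on the simplex. All function names, parameters, and default values below are hypothetical.

```python
import numpy as np

def zo_gradient(f, x, rng, mu=1e-4, num_dirs=20):
    """Two-point Gaussian-smoothing estimate of grad f(x) using only function values."""
    fx = f(x)
    g = np.zeros(x.size)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.size)           # random search direction
        g += (f(x + mu * u) - fx) / mu * u        # finite difference along u
    return g / num_dirs

def penalty_zo_descent(objective, constraints, x0, beta=10.0, batch=2,
                       lr_x=0.02, lr_p=0.01, steps=300, seed=0):
    """Alternate a ZO descent step on x with a first-order ascent step on p.

    Assumed penalty (illustrative, not taken from the paper):
        F(x, p) = objective(x) + beta * sum_j p[j] * max(0, c_j(x))**2
    Only a sampled mini-batch of constraints is evaluated per iteration.
    """
    rng = np.random.default_rng(seed)
    m = len(constraints)
    x = np.asarray(x0, dtype=float).copy()
    p = np.full(m, 1.0 / m)                       # start from the uniform distribution
    scale = m / batch                             # rescale the sampled sum to all m terms
    for _ in range(steps):
        idx = rng.choice(m, size=batch, replace=False)

        def sampled_penalty(z):
            viol = np.array([max(0.0, constraints[j](z)) ** 2 for j in idx])
            return objective(z) + beta * scale * np.dot(p[idx], viol)

        # Zeroth-order step on the parameters (function values only).
        x = x - lr_x * zo_gradient(sampled_penalty, x, rng)

        # First-order step on the distribution: dF/dp[j] = beta * max(0, c_j(x))**2.
        grad_p = np.zeros(m)
        grad_p[idx] = beta * scale * np.array(
            [max(0.0, constraints[j](x)) ** 2 for j in idx])
        p = p * np.exp(lr_p * grad_p)             # exponentiated-gradient ascent
        p /= p.sum()                              # renormalize onto the simplex
    return x, p
```

For example, with the objective `lambda z: float(np.sum((z - 2.0) ** 2))` and constraints `z[j] - 1.0 <= 0`, the iterate is pulled from an infeasible start toward the boundary of the feasible set while `p` shifts weight toward the constraints that are violated most. The exponentiated-gradient step is a convenience chosen here to avoid an explicit simplex projection.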

One-sentence Summary: We propose two zeroth-order methods to solve problems with a large number of nonconvex/convex white/black-box constraints.

Supplementary Material: zip
