Keywords: discrete optimization, derivative-free optimization, black-box function, regret bounds, subspace learning, neural networks, surrogate modeling
Abstract: Applications such as engineering design and hyperparameter tuning often require us to optimize a black-box function, i.e., a system whose inner processing is not analytically known and whose gradients are not available. Practitioners often have a fixed budget for the number of function evaluations and the performance of an optimization algorithm is measured by its simple regret. In this paper, we study the class of ``Optimistic Optimization'' algorithms for black-box optimization that use a partitioning scheme for the domain. We develop algorithms that learn a good partitioning scheme and use flexible surrogate models such as neural networks in the optimization procedure. For multi-index functions on an $m$-dimensional subspace within $d$ dimensions, our algorithm attains $\tilde{O}(n^{-\beta / d})$ regret, where $\beta = 1 + \frac{d-m}{2m-1}$, as opposed to $\tilde{O}(n^{-1/d})$ for SequOOL, a state-of-the-art optimistic optimization algorithm. In numerical experiments on benchmark functions, our algorithm converged using $21\%$ to $36\%$ fewer evaluations compared to SequOOL. Our approach improves the quality of activation aware quantization of the OPT-1.3B large language model.
Submission Number: 102
Loading