Optimistic Optimization of Gaussian Process Samples
Abstract: Bayesian optimization is a popular formalism for global optimization, but its computational costs limit it to expensive-to-evaluate functions. A competing, computationally more efficient global optimization framework is optimistic optimization, which exploits prior knowledge about the geometry of the search space in the form of a dissimilarity function. We investigate to which degree the conceptual advantages of Bayesian optimization can be combined with the computational efficiency of optimistic optimization. By mapping the kernel to a dissimilarity, we obtain an optimistic optimization algorithm for the Bayesian optimization setting with a run-time of up to $O(N \log N)$. As a high-level takeaway we find that, when using stationary kernels on objectives of low evaluation cost, optimistic optimization can be preferable over Bayesian optimization, while for strongly coupled and parametric models, Bayesian optimization can perform much better, even at low evaluation cost. As a conceptual takeaway, our results demonstrate that balancing exploration and exploitation under Gaussian process assumptions does not require computing a posterior.
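The kernel-to-dissimilarity mapping the abstract refers to can be illustrated with the canonical Gaussian process pseudo-metric, $d(x, y) = \sqrt{\mathbb{E}[(f(x) - f(y))^2]} = \sqrt{k(x,x) - 2k(x,y) + k(y,y)}$, which is well defined for any positive-definite kernel. The sketch below (function names are ours, not from the paper) computes this dissimilarity for a stationary RBF kernel; it is a minimal illustration of the general idea, not the paper's implementation.

```python
import numpy as np

def rbf_kernel(x, y, lengthscale=1.0, variance=1.0):
    """Stationary squared-exponential (RBF) kernel k(x, y)."""
    r2 = np.sum((np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2)
    return variance * np.exp(-0.5 * r2 / lengthscale ** 2)

def kernel_dissimilarity(x, y, kernel=rbf_kernel):
    """Canonical GP pseudo-metric induced by a kernel:
    d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y).
    The max(., 0) guards against tiny negative values from rounding."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return np.sqrt(max(d2, 0.0))
```

For a stationary kernel this reduces to $d(x, y) = \sqrt{2\,(k(0) - k(x - y))}$, so the dissimilarity grows monotonically with the Euclidean distance between inputs, which is the geometric structure an optimistic optimizer can exploit without ever forming a posterior.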
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
* added DiRect as a baseline (Figure 3, Figure 6, etc., Appendix A.3)
* corrected a mistake regarding the asymptotic runtime in high dimensions (which made it better)
* same budget for all scalable optimization methods in plots
* more readable colors for plots
* appendix: failed experiments with another baseline (AdaBkb)
* minor edits: typos etc., as pointed out in the reviews
Supplementary Material: zip
Assigned Action Editor: ~Cedric_Archambeau1
Submission Number: 1181