Provably Faster Algorithms for Bilevel Optimization

Junjie Yang; Kaiyi Ji; Yingbin Liang

Published: 09 Nov 2021, Last Modified: 26 May 2025 | NeurIPS 2021 Spotlight | Readers: Everyone

Keywords: Bilevel Optimization, Momentum, Recursive Gradient Estimator, Hessian Vector Computation

TL;DR: This paper proposes two bilevel optimizers whose computational complexity provably improves on that of all existing algorithms by an order of magnitude.

Abstract: Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning. Recently, several momentum-based algorithms have been proposed to solve bilevel optimization problems faster. However, those momentum-based algorithms do not achieve provably better computational complexity than the $\widetilde{\mathcal{O}}(\epsilon^{-2})$ of the SGD-based algorithm. In this paper, we propose two new algorithms for bilevel optimization: the first adopts momentum-based recursive iterations, and the second adopts recursive gradient estimations in nested loops to decrease the variance. We show that both algorithms achieve a complexity of $\widetilde{\mathcal{O}}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude. Our experiments validate our theoretical results and demonstrate the superior empirical performance of our algorithms in hyperparameter optimization applications.
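
As a rough illustration of what a momentum-based recursive gradient estimator can look like, the sketch below applies a STORM-style update to a toy single-level objective $f(x) = \tfrac{1}{2}x^2$. It is not the authors' algorithm and not a bilevel solver: the function names, the additive-noise gradient model, and the values of `lr`, `a`, and `sigma` are all assumptions made for illustration. The one idea it tries to show is that the same random sample is evaluated at both the current and the previous iterate, so the correction term cancels much of the noise.

```python
import random

def noisy_grad(x, noise):
    # Stochastic gradient of the toy objective f(x) = 0.5 * x**2.
    # `noise` stands in for the randomness of one data sample (assumption).
    return x + noise

def recursive_momentum_descent(x0, steps=200, lr=0.1, a=0.1, sigma=0.5, seed=0):
    """Hypothetical helper: gradient descent driven by a recursive momentum estimator."""
    rng = random.Random(seed)
    x_prev = float(x0)
    # Initialise the estimator with a plain stochastic gradient at x0.
    d = noisy_grad(x_prev, rng.gauss(0.0, sigma))
    x = x_prev - lr * d
    for _ in range(steps):
        noise = rng.gauss(0.0, sigma)  # one fresh sample per iteration
        # Recursive momentum update: the SAME sample is used at the current
        # iterate x and at the previous iterate x_prev.
        d = noisy_grad(x, noise) + (1.0 - a) * (d - noisy_grad(x_prev, noise))
        x_prev, x = x, x - lr * d
    return x

x_final = recursive_momentum_descent(5.0)
```

Setting `a = 1.0` in this sketch recovers plain SGD, since the correction term vanishes; smaller values of `a` reuse more of the previous estimate. How the paper's algorithms choose step sizes and extend this idea to the upper- and lower-level variables is described in the paper itself and is not reproduced here.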

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Supplementary Material: pdf

Code: https://github.com/JunjieYang97/MRVRBO

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/provably-faster-algorithms-for-bilevel/code) (CatalyzeX)
