DTZO: Distributed Trilevel Zeroth Order Learning with Provable Non-Asymptotic Convergence

Published: 06 Jun 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Trilevel learning (TLL) with zeroth order constraints is a fundamental problem in machine learning, arising in scenarios where gradient information is inaccessible due to data privacy or model opacity, such as federated learning, healthcare, and financial systems. These problems are notoriously difficult to solve because of their inherent complexity and the lack of first order information. Moreover, in many practical settings the data are distributed across multiple nodes, so trilevel learning problems must be solved without centralizing data on a server in order to preserve data privacy. To this end, this work proposes DTZO, an effective distributed trilevel zeroth order learning framework that addresses trilevel learning problems with level-wise zeroth order constraints in a distributed manner. DTZO is versatile and can be adapted to a wide range of (grey-box) trilevel learning problems with partial zeroth order constraints. In DTZO, a cascaded polynomial approximation is constructed without relying on gradients or sub-gradients by leveraging a novel type of cut, the zeroth order cut. Furthermore, we provide a non-asymptotic convergence rate analysis showing that DTZO attains an $\epsilon$-stationary point. Extensive experiments demonstrate the superior performance of the proposed DTZO.
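For context, a trilevel learning problem nests three optimization levels, each parameterized by the solutions of the levels below it. The display below is the generic TLL form, not necessarily the paper's exact formulation; $x$, $y$, $z$ denote the first-, second-, and third-level variables and $f_1$, $f_2$, $f_3$ the corresponding objectives, all introduced here purely for illustration.

```latex
% Generic trilevel learning (TLL) problem: three nested minimizations.
\begin{aligned}
\min_{x} \quad & f_1\!\left(x,\, y^{*}(x),\, z^{*}\!\left(x, y^{*}(x)\right)\right) \\
\text{s.t.} \quad & y^{*}(x) \in \operatorname*{arg\,min}_{y} \; f_2\!\left(x, y, z^{*}(x, y)\right) \\
& z^{*}(x, y) \in \operatorname*{arg\,min}_{z} \; f_3(x, y, z)
\end{aligned}
```

In the setting considered here, some or all of the level objectives are available only through function evaluations (a zeroth order oracle), which is what rules out standard gradient- or hypergradient-based approaches.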
Lay Summary: (1) Nested optimization has attracted significant attention in machine learning, with applications in areas such as meta-learning, adversarial learning, hyperparameter optimization, and continual learning. Solving nested optimization problems without relying on gradient information has become increasingly important, especially with the rise of LLMs, where commercial LLM APIs often do not expose gradients. (2) Tackling nested optimization problems without gradient information is highly challenging. We propose the first framework, DTZO, to address three-level nested optimization problems in a zeroth-order manner, and we provide theoretical guarantees for the proposed trilevel zeroth-order algorithm. (3) This work bridges nested optimization and zeroth-order methods, making trilevel learning more widely applicable and filling an important theoretical gap.
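To make the zeroth-order ingredient concrete, the sketch below shows the standard two-point Gaussian-smoothing gradient estimator, which uses only function evaluations. It is a generic illustration of zeroth order estimation, not DTZO itself; the objective, smoothing radius `mu`, and sample count are placeholder choices.

```python
import numpy as np


def zeroth_order_grad(f, x, mu=1e-3, num_samples=20, rng=None):
    """Two-point Gaussian-smoothing gradient estimator.

    Approximates grad f(x) using only function evaluations, the standard
    building block when first-order information is unavailable
    (black-box constraints, closed APIs, etc.).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    grad_est = np.zeros(d)
    for _ in range(num_samples):
        u = rng.standard_normal(d)                 # random direction
        finite_diff = (f(x + mu * u) - f(x)) / mu  # directional difference
        grad_est += finite_diff * u
    return grad_est / num_samples


if __name__ == "__main__":
    # Usage: estimate the gradient of a "black-box" quadratic at a point.
    A = np.diag([1.0, 2.0, 3.0])
    f = lambda v: 0.5 * v @ A @ v                  # pretend f is a black box
    x0 = np.array([1.0, -1.0, 0.5])
    # Noisy estimate of the true gradient A @ x0 = [1.0, -2.0, 1.5].
    print(zeroth_order_grad(f, x0, num_samples=500))
```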
Primary Area: General Machine Learning->Scalable Algorithms
Keywords: Distributed Algorithm, Trilevel Optimization, Zeroth-Order Optimization
Submission Number: 8431