Sharper Characterization of the Global Maximizers in Bilinear Programming with Applications to Asynchronous Gradient Descent
Keywords: Bilinear Programming, Optimization, Gradient Descent, Randomized Heuristics
Abstract: We study the bilinear program that arises when tuning the stepsizes in asynchronous gradient descent (AGD). Notably, we prove a necessity theorem: every global maximizer lies at an extreme point of the feasible region, strengthening the classical guarantee for linear objectives on compact sets, which only ensures that some maximizer is an extreme point. Exploiting this structure, we recast the continuous problem as a discrete search over the vertices of the hypercube and design a solver that performs a biased random walk among them. Across all tested benchmarks, including the Cyclic Staircase benchmark, our solver reaches global optimality up to $1000\times$ faster than Gurobi 11 while using orders of magnitude fewer evaluations.
This structural result allows us to derive a near-optimal stepsize scheme for the recently proposed Ringmaster AGD algorithm, together with a provable factor-$2$ approximation guarantee on the error of finding an $\varepsilon$-stationary point. Together, our results provide both a sharper theoretical characterization and a practical solver for nonconvex bilinear programs arising in distributed learning.
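To make the vertex-search idea in the abstract concrete, here is a minimal, hedged sketch of a biased random walk over hypercube vertices for a box-constrained bilinear objective $x^\top A y$. It is not the paper's solver: the Metropolis-style acceptance rule, the temperature `temp`, the step budget, and the $\{0,1\}$ vertex encoding are illustrative assumptions made only for this example.

```python
# Illustrative sketch (not the paper's solver): biased random walk over the
# vertices of the unit hypercube to maximize x^T A y with x in {0,1}^n, y in {0,1}^m.
# The acceptance rule, temperature, and step budget are hypothetical choices.
import numpy as np

def biased_random_walk(A, steps=10_000, temp=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n, m = A.shape
    x = rng.integers(0, 2, size=n)   # current vertex for the x block
    y = rng.integers(0, 2, size=m)   # current vertex for the y block
    best_val = x @ A @ y
    best = (x.copy(), y.copy())
    for _ in range(steps):
        if rng.random() < 0.5:
            # Propose flipping one coordinate of x; delta is the change in x^T A y.
            i = rng.integers(n)
            delta = (1 - 2 * x[i]) * (A[i] @ y)
            if delta > 0 or rng.random() < np.exp(delta / temp):
                x[i] ^= 1
        else:
            # Propose flipping one coordinate of y.
            j = rng.integers(m)
            delta = (1 - 2 * y[j]) * (x @ A[:, j])
            if delta > 0 or rng.random() < np.exp(delta / temp):
                y[j] ^= 1
        val = x @ A @ y
        if val > best_val:
            best_val, best = val, (x.copy(), y.copy())
    return best_val, best

# Example usage on a small random instance.
A = np.random.default_rng(1).standard_normal((8, 8))
val, (x_star, y_star) = biased_random_walk(A)
print(val, x_star, y_star)
```

The walk accepts every improving flip and occasionally accepts non-improving ones (with probability decaying in the loss), which is one simple way to realize a "biased" walk among vertices; the paper's actual bias mechanism may differ.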
Primary Area: optimization
Submission Number: 19268