Sharper Characterization of the Global Maximizers in Bilinear Programming with Applications to Asynchronous Gradient Descent
Keywords: Bilinear Programming, Optimization, Gradient Descent, Randomized Heuristics
Abstract: We study the bilinear program that arises when tuning the stepsizes in asynchronous gradient descent (AGD). Notably, we prove a necessity theorem: every global maximizer lies at an extreme point of the feasible region, strengthening the classical guarantee for linear objectives on compact sets, which only ensures that some maximizer is an extreme point. Exploiting this structure, we recast the continuous problem as a discrete search over the vertices of the hypercube and design a solver that performs a biased random walk among them. Across all tested benchmarks, including the Cyclic Staircase benchmark, our solver reaches global optimality up to $1000\times$ faster than Gurobi 11 while using orders of magnitude fewer evaluations.
This structural result allows us to derive a near-optimal stepsize scheme for the recently proposed Ringmaster AGD algorithm, together with a provable factor-$2$ approximation guarantee on the error of finding an $\varepsilon$-stationary point. Together, our results provide both a sharper theoretical characterization and a practical solver for nonconvex bilinear programs arising in distributed learning.
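To make the vertex-search idea in the abstract concrete, here is a minimal, hedged sketch of a biased random walk over hypercube vertices for a box-constrained bilinear objective $x^\top A y$. It is not the paper's solver: the Metropolis-style acceptance rule, the temperature `temp`, the step budget, and the $\{0,1\}$ vertex encoding are illustrative assumptions made only for this example.

```python
# Illustrative sketch (not the paper's solver): biased random walk over the
# vertices of the unit hypercube to maximize x^T A y with x in {0,1}^n, y in {0,1}^m.
# The acceptance rule, temperature, and step budget are hypothetical choices.
import numpy as np

def biased_random_walk(A, steps=10_000, temp=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n, m = A.shape
    x = rng.integers(0, 2, size=n)   # current vertex for the x block
    y = rng.integers(0, 2, size=m)   # current vertex for the y block
    best_val = x @ A @ y
    best = (x.copy(), y.copy())
    for _ in range(steps):
        if rng.random() < 0.5:
            # Propose flipping one coordinate of x; delta is the change in x^T A y.
            i = rng.integers(n)
            delta = (1 - 2 * x[i]) * (A[i] @ y)
            if delta > 0 or rng.random() < np.exp(delta / temp):
                x[i] ^= 1
        else:
            # Propose flipping one coordinate of y.
            j = rng.integers(m)
            delta = (1 - 2 * y[j]) * (x @ A[:, j])
            if delta > 0 or rng.random() < np.exp(delta / temp):
                y[j] ^= 1
        val = x @ A @ y
        if val > best_val:
            best_val, best = val, (x.copy(), y.copy())
    return best_val, best

# Example usage on a small random instance.
A = np.random.default_rng(1).standard_normal((8, 8))
val, (x_star, y_star) = biased_random_walk(A)
print(val, x_star, y_star)
```

The walk accepts every improving flip and occasionally accepts non-improving ones (with probability decaying in the loss), which is one simple way to realize a "biased" walk among vertices; the paper's actual bias mechanism may differ.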
Primary Area: optimization
Submission Number: 19268