Keywords: bilevel optimization, optimization layer, discrete optimization, hybrid architectures, proximal gradient descent, mirror descent
TL;DR: We propose Lagrangian Proximal Gradient Descent, a flexible framework for learning general convex optimization models even when informative gradient information is not available.
Abstract: We propose Lagrangian Proximal Gradient Descent (LPGD), a flexible framework for learning convex optimization models.
Like traditional proximal gradient methods, LPGD can be interpreted as optimizing a smoothed envelope of the possibly non-differentiable loss. This smoothing makes it possible to train models that do not provide informative gradients, such as discrete optimization models.
We show that the LPGD update can be computed efficiently by rerunning the forward solver on a perturbed input.
Moreover, we prove that the LPGD update converges to the gradient as the smoothing parameter approaches zero.
Finally, we experimentally investigate the potential benefits of applying LPGD even in a fully differentiable setting.
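As a rough illustration of the "rerun the forward solver on a perturbed input" idea, the sketch below uses a toy problem whose solver has a closed form. Everything in it is an assumption made for illustration and is not taken from the submission: the names `solve` and `perturbed_update`, the toy objective, the symbol `tau` for the smoothing parameter, and the finite-difference form of the update. It is not the exact LPGD definition.

```python
import numpy as np


def solve(w):
    """Toy forward solver (illustrative assumption, not from the source).

    Returns argmin_x 0.5 * ||x||^2 - <w, x> subject to x >= 0,
    whose closed-form solution is the elementwise max(w, 0).
    """
    return np.maximum(w, 0.0)


def perturbed_update(w, grad_x, tau):
    """Finite-difference update built from two solver calls.

    grad_x is the gradient of the downstream loss with respect to the
    solver output; tau plays the role of a smoothing parameter.
    The second solver call uses the input shifted against grad_x.
    """
    x_star = solve(w)
    x_tau = solve(w - tau * grad_x)
    return (x_star - x_tau) / tau


w = np.array([1.0, -0.1])
target = np.array([2.0, 1.0])

# Gradient of the loss 0.5 * ||x - target||^2 evaluated at x = solve(w).
grad_x = solve(w) - target

# Exact gradient with respect to w: the solver's Jacobian is diag(w > 0).
true_grad = (w > 0).astype(float) * grad_x

small_tau_update = perturbed_update(w, grad_x, tau=1e-3)
large_tau_update = perturbed_update(w, grad_x, tau=1.0)

print("true gradient   :", true_grad)
print("update, tau=1e-3:", small_tau_update)
print("update, tau=1   :", large_tau_update)
```

In this toy setting the small-`tau` update matches the exact gradient, whose second component is zero because the constraint is active there. The larger `tau` yields a nonzero second component, which is the kind of signal a smoothed update can provide where the plain gradient carries none.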
Submission Number: 40