An Adaptive Half-Space Projection Method for Stochastic Optimization Problems with Group Sparse Regularization
Abstract: Optimization problems with group sparse regularization are ubiquitous in various popular downstream applications, such as feature selection and compression for Deep Neural Networks (DNNs). Nonetheless, the existing methods in the literature do not perform particularly well when such regularization is used in combination with a stochastic loss function. In particular, it is challenging to design a computationally efficient algorithm with a convergence guarantee and can compute group-sparse solutions. Recently, a half-space stochastic projected gradient (HSPG) method was proposed that partly addressed these challenges. This paper presents a substantially enhanced version of HSPG that we call AdaHSPG+ that makes two noticeable advances. First, AdaHSPG+ is shown to have a stronger convergence result under significantly looser assumptions than those required by HSPG. This improvement in convergence is achieved by integrating variance reduction techniques with a new adaptive strategy for iteratively predicting the support of a solution. Second, AdaHSPG+ requires significantly less parameter tuning compared to HSPG, thus making it more practical and user-friendly. This advance is achieved by designing automatic and adaptive strategies for choosing the type of step employed at each iteration and for updating key hyperparameters. The numerical effectiveness of our proposed AdaHSPG+ algorithm is demonstrated on both convex and non-convex benchmark problems. The source code is available at https://github.com/tianyic/adahspg.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Supplementary Material: zip
Assigned Action Editor: ~Robert_M._Gower1
Submission Number: 837