Revisiting Sparse Learning Methods: A Comprehensive Comparison of Best Subset Selection and LASSO

TMLR Paper4264 Authors

20 Feb 2025 (modified: 14 Mar 2025) · Under review for TMLR · CC BY 4.0
Abstract: Understanding the comparative performance of $L_0$ and $L_1$ models is crucial for developing accurate and efficient machine learning systems, particularly in noisy, real-world settings. The current understanding in the literature is that $L_1$-penalized linear models outperform $L_0$ models as noise increases. However, prior studies have largely relied on small, synthetic datasets and limited comparisons between optimizers, leaving practical implications for diverse applications underexplored. We fill these gaps by evaluating multiple $L_0$- and $L_1$-based optimizers on a larger variety of real datasets, and demonstrate that performance differences between $L_0$ and $L_1$ models depend significantly on the choice of optimizer and on dataset characteristics. In many cases, changing the optimization algorithm while holding the regularization penalty fixed changes performance more than changing the penalty itself. Additionally, we demonstrate cases where an $L_0$-penalized model can be both sparser and more accurate than its $L_1$-penalized variants. Together, our results show that even convex $L_1$ models can vary significantly in performance according to optimizer implementation, and that $L_0$-penalized models are more viable for many smaller, noisy real-world settings than previously recognized.
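To make the two penalties being compared concrete: an $L_1$ (lasso) model minimizes $\|y - X\beta\|_2^2 + \lambda \|\beta\|_1$, while an $L_0$ (best subset) model penalizes the *count* of nonzero coefficients, $\|y - X\beta\|_2^2 + \lambda \|\beta\|_0$. The sketch below is an illustrative exhaustive-search solver for the $L_0$ problem on a small synthetic example; it is not one of the paper's optimizers, and the data, penalty value, and function names are assumptions for demonstration only.

```python
import itertools
import numpy as np

# Illustrative synthetic problem (not from the paper): 100 samples,
# 8 features, only the first 2 coefficients are truly nonzero.
rng = np.random.default_rng(0)
n, p, k = 100, 8, 2
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = [3.0, -2.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

def best_subset(X, y, lam):
    """Minimize ||y - X b||^2 + lam * ||b||_0 by exhaustive search.

    Feasible only for small p (2^p candidate supports); real L0
    solvers use branch-and-bound or local heuristics instead.
    """
    n, p = X.shape
    best_loss, best_beta = np.sum(y ** 2), np.zeros(p)  # empty model
    for size in range(1, p + 1):
        for S in itertools.combinations(range(p), size):
            b_S, *_ = np.linalg.lstsq(X[:, list(S)], y, rcond=None)
            loss = np.sum((y - X[:, list(S)] @ b_S) ** 2) + lam * size
            if loss < best_loss:
                best_loss = loss
                best_beta = np.zeros(p)
                best_beta[list(S)] = b_S
    return best_beta

beta_l0 = best_subset(X, y, lam=1.0)
support = np.flatnonzero(beta_l0)
print(support)  # on this easy low-noise problem, the true support {0, 1}
```

Because the $L_0$ penalty charges a flat cost per feature rather than shrinking coefficients continuously, the selected coefficients are unbiased least-squares fits on the chosen support; the trade-off, as the abstract notes, is that the non-convex search makes results far more sensitive to the optimizer used.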
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Anastasios_Kyrillidis2
Submission Number: 4264
