Sparse-Input Neural Network using Group Concave Regularization

TMLR Paper 5013 Authors

02 Jun 2025 (modified: 20 Oct 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: Simultaneous feature selection and non-linear function estimation are challenging in modeling, especially in high-dimensional settings where the number of variables exceeds the available sample size. In this article, we investigate the problem of feature selection in neural networks. Although the group least absolute shrinkage and selection operator (LASSO) has been utilized to select variables for learning with neural networks, it tends to admit unimportant variables into the model to compensate for its over-shrinkage. To overcome this limitation, we propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings. The main idea is to apply a proper concave penalty to the $l_2$ norm of the weights from all outgoing connections of each input node, thereby obtaining a neural network that uses only a small subset of the original variables. In addition, we develop an effective algorithm based on backward path-wise optimization to yield stable solution paths and tackle the challenge of complex optimization landscapes. We provide a rigorous theoretical analysis of the proposed framework, establishing finite-sample guarantees for both variable selection consistency and prediction accuracy. These results are supported by extensive simulation studies and real data applications, which demonstrate the finite-sample performance of the estimator in feature selection and prediction across continuous, binary, and time-to-event outcomes.
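For concreteness, below is a minimal sketch of how a group concave penalty can be attached to the input layer of a feed-forward network, as described in the abstract. It assumes a PyTorch-style model and uses the minimax concave penalty (MCP) as the concave penalty; the class and function names (`SparseInputNet`, `mcp_penalty`), the architecture, and the hyperparameter values are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty (MCP) applied elementwise to non-negative t:
    rho(t) = lam*t - t^2/(2*gamma)  if t <= gamma*lam,
             gamma*lam^2 / 2        otherwise.
    """
    return torch.where(
        t <= gamma * lam,
        lam * t - t.pow(2) / (2 * gamma),
        torch.full_like(t, gamma * lam ** 2 / 2),
    )


class SparseInputNet(nn.Module):
    """Feed-forward net whose first-layer weights are group-penalized per input feature."""

    def __init__(self, p, hidden=32):
        super().__init__()
        self.input_layer = nn.Linear(p, hidden)
        self.body = nn.Sequential(nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):
        return self.body(self.input_layer(x))

    def group_penalty(self, lam, gamma=3.0):
        # l2 norm of all outgoing weights of each input node
        # (columns of the first-layer weight matrix), one value per feature.
        group_norms = self.input_layer.weight.norm(dim=0)  # shape: (p,)
        return mcp_penalty(group_norms, lam, gamma).sum()


# Illustrative training objective: data-fitting loss plus the group concave penalty.
# net = SparseInputNet(p=100)
# loss = nn.functional.mse_loss(net(x), y) + net.group_penalty(lam=0.05)
```

Driving a group norm exactly to zero removes that input variable from the fitted network; a concave penalty such as MCP shrinks large, important groups less than the group LASSO does, which is the over-shrinkage issue the abstract refers to.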
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Thank you for the opportunity to resubmit our revised manuscript, "Sparse-Input Neural Network using Group Concave Regularization" (ID: 5013). We are very grateful to the editor and the three expert reviewers for providing detailed and constructive feedback. We have addressed all of the comments and believe the manuscript has been substantially strengthened as a result. The most significant change is the addition of a comprehensive new theoretical section (Section 4), directly addressing a central concern raised by the reviewers. This new section provides a rigorous foundation for our proposed framework, substantially strengthening the paper and making it a more complete contribution. In addition to this major revision, we have also clarified key methodological points and improved the manuscript's presentation throughout. The key revisions include:
• A New Theoretical Section: We have added a comprehensive theoretical framework (Section 4) that establishes non-asymptotic guarantees for our estimator's prediction accuracy, estimation error, and variable selection consistency.
• Clarification of Methodological Contributions: We have revised the manuscript (primarily Section 3) to clarify our contribution regarding the optimization strategy, focusing on the generation of "stable solution paths" and distinguishing it from the properties of the underlying optimizer.
• Expanded Discussion of Literature: The Introduction now includes a detailed discussion of recent theoretical work by Sun et al. (2021) and Lederer (2022), highlighting the unique contributions of our paper.
• Reinforced Explanations and Justifications: We have reinforced the explanation of the critical role of the ridge penalty (α), clarified the rationale for our choice of baseline methods, and explicitly stated the objectives of our real-data examples to better frame their contributions.
We will provide a point-by-point response to each reviewer's comments, detailing these changes. We are confident that the revisions have addressed all concerns and have significantly improved the paper. We hope you will now find the manuscript suitable for publication.
Assigned Action Editor: ~Bryon_Aragam1
Submission Number: 5013