Abstract: Residual neural networks (ResNets) have become widely used as they allow for smooth and efficient training of deep neural network architectures. However, when trained on small, noisy, and high-dimensional data, ResNets may suffer from overfitting due to their large number of parameters. As a solution, a range of regularization methods has been proposed. One promising approach relies on the proximal mapping technique, which is computationally efficient since it can be incorporated directly into the optimization algorithm. However, the performance of ResNets with various convex or non-convex proximal regularizers remains under-explored on high-dimensional data. In this study, we propose an extended stochastic adaptive proximal gradient ResNet method that can handle both convex and non-convex regularizers ranging from $L_0$ to $L_{\infty}$. Moreover, we evaluate its prediction performance in a supervised regression setting on four real high-dimensional genomic datasets from mice, pig, wheat, and loblolly pine. For comparison, we also implement and evaluate traditional sparse linear proximal methods with the same regularizers, as well as LightGBM. Experimental results demonstrate that an 18-layer ResNet with $L_{\frac{1}{2}}$ regularization outperforms other configurations on both the mice and pig datasets. For the wheat and loblolly pine data, the 15-layer ResNet $L_{\frac{1}{2}}$ configuration achieves the lowest test mean squared errors. These findings highlight the effectiveness of the regularized adaptive proximal gradient ResNet method and its potential for prediction tasks on high-dimensional genomic data.
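To make the proximal mapping idea concrete, the following is a minimal sketch (not the authors' implementation) of a single proximal gradient update using the $L_1$ regularizer, whose proximal operator is the well-known soft-thresholding function; the function names and step sizes are illustrative assumptions:

```python
import numpy as np

def prox_l1(v, lam):
    # Soft-thresholding: the closed-form proximal operator of lam * ||x||_1.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def proximal_gradient_step(w, grad, lr, lam):
    # One proximal gradient update: take a gradient step on the smooth loss,
    # then apply the proximal operator of the (possibly non-smooth) regularizer.
    return prox_l1(w - lr * grad, lr * lam)

# Illustrative example with hypothetical weights and gradient.
w = np.array([0.5, -0.2, 1.0])
grad = np.array([0.1, 0.1, -0.2])
w_new = proximal_gradient_step(w, grad, lr=0.1, lam=1.0)
```

Non-convex regularizers such as $L_{\frac{1}{2}}$ or $L_0$ would replace `prox_l1` with their own (half- or hard-thresholding) operators, while the surrounding gradient step stays unchanged.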
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=mYoE23a3bc
Changes Since Last Submission: The paper has been revised following suggestions from the reviewers.
Assigned Action Editor: ~Mathurin_Massias1
Submission Number: 5294