Statistical Inference for Deep Learning via Stochastic Modeling

20 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: zip
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Stochastic neural network, uncertainty quantification, nonlinear variable selection, stochastic gradient MCMC, imputation regularized-optimization
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We develop an innovative framework for performing statistical inference for deep learning models.
Abstract: Deep learning has revolutionized big data analysis in modern data science; however, how to perform statistical inference for deep neural networks remains largely unclear. To this end, we explore a stochastic variant of the deep neural network known as the stochastic neural network (StoNet). First, we show that the StoNet falls into the framework of statistical modeling. It not only enables us to address fundamental issues in deep learning, such as structure interpretability and uncertainty quantification, but also provides us with a platform for transferring the theory and methods developed for linear models to the realm of deep learning. Specifically, we show how the sparse learning theory with the Lasso penalty can be adapted from linear models to deep neural networks (DNNs); establish that the sparse StoNet is consistent in network structure selection; and provide a recursive method to quantify the prediction uncertainty for the StoNet. Furthermore, we extend this result to the DNN via its asymptotic equivalence with the StoNet, showing that consistent sparse deep learning can be obtained by training a DNN with an appropriate Lasso penalty. Additionally, we propose to remodel the last hidden layer output and the target output of a well-trained DNN model using a StoNet on the validation dataset, and then assess the prediction uncertainty of the DNN model via the StoNet. The proposed method has been compared with conformal inference on extensive examples, and the numerical results suggest its superiority.
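The abstract's first result transfers Lasso sparse-learning theory from linear models to DNNs. As background for that transfer, here is a minimal sketch of the linear-model starting point only: Lasso estimation via proximal gradient descent (soft-thresholding), not the paper's StoNet algorithm. The toy data, learning rate, and penalty level are illustrative assumptions.

```python
# Illustrative sketch: Lasso on a toy linear model via proximal
# gradient descent. This is the linear-model baseline whose sparse
# learning theory the paper adapts to DNNs; it is NOT the paper's
# StoNet training procedure.

def soft_threshold(z, t):
    """Proximal operator of t*|.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_prox_gd(X, y, lam, lr=0.01, n_iter=2000):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        # gradient of the smooth squared-error term
        resid = [sum(X[i][j] * w[j] for j in range(p)) - y[i]
                 for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n
                for j in range(p)]
        # gradient step, then L1 proximal (shrinkage) step
        w = [soft_threshold(w[j] - lr * grad[j], lr * lam)
             for j in range(p)]
    return w

# Toy data (assumed): y depends only on the first feature, so the
# Lasso should zero out the irrelevant second coefficient.
X = [[1.0, 0.1], [2.0, -0.2], [3.0, 0.05], [4.0, 0.0]]
y = [2.0, 4.0, 6.0, 8.0]
w = lasso_prox_gd(X, y, lam=0.1)
```

The soft-thresholding step is what produces exact zeros, which is the mechanism behind the structure-selection consistency claimed for the sparse StoNet.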
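The abstract also proposes remodeling the map from a trained DNN's last hidden layer to the target on validation data, then reading prediction uncertainty off that simpler model. The following hedged sketch mimics only the spirit of that step with ordinary least squares and a normal-theory prediction interval in place of a StoNet; the validation features and targets are illustrative assumptions, not the paper's data or method.

```python
# Hedged sketch: fit a simple linear model from a (scalar) last-layer
# feature h to the target y on validation data, then form a
# normal-theory 95% prediction interval for a new point. The paper
# uses a StoNet for this remodeling step; OLS stands in here only to
# illustrate the idea of uncertainty via a refitted simple model.
import math

def ols_prediction_interval(h, y, h_new, z=1.96):
    """Fit y ~ a + b*h; return an approximate 95% prediction
    interval for the response at feature value h_new."""
    n = len(h)
    hbar, ybar = sum(h) / n, sum(y) / n
    sxx = sum((v - hbar) ** 2 for v in h)
    b = sum((h[i] - hbar) * (y[i] - ybar) for i in range(n)) / sxx
    a = ybar - b * hbar
    resid = [y[i] - (a + b * h[i]) for i in range(n)]
    sigma2 = sum(r * r for r in resid) / (n - 2)  # noise variance
    # prediction variance = noise + parameter-estimation uncertainty
    var = sigma2 * (1 + 1 / n + (h_new - hbar) ** 2 / sxx)
    pred = a + b * h_new
    half = z * math.sqrt(var)
    return pred - half, pred + half

# Toy validation set (assumed): last-layer outputs h, targets y.
h = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]
lo, hi = ols_prediction_interval(h, y, h_new=1.8)
```

Unlike conformal inference, which wraps intervals around any black-box predictor, this route obtains uncertainty from an explicit statistical model refitted on held-out data, which is the comparison the abstract reports.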
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2868