Abstract: Deep learning models have substantially improved prediction accuracy in many fields and have gained recognition across numerous disciplines. Yet one aspect of deep learning that remains insufficiently addressed is the assessment of prediction uncertainty. Producing reliable uncertainty estimates can be crucial in practice; for instance, predictions associated with a high degree of uncertainty could be sent for further evaluation. Recent works on uncertainty quantification of deep learning predictions, including Bayesian posterior credible intervals and frequentist confidence-interval estimation, have been shown to yield intervals that are either invalid or overly conservative. Furthermore, there is currently no uncertainty-quantification method that accommodates deep neural networks for survival (time-to-event) data with right-censored outcomes. In this work, we provide a non-parametric bootstrap method that disentangles data uncertainty from the noise inherent in the adopted optimization algorithm. The validity of the proposed approach is demonstrated through an extensive simulation study, which shows that the method is accurate (i.e., valid and not overly conservative) as long as the network is sufficiently deep to ensure that the estimators provided by the deep neural network exhibit minimal bias. Otherwise, undercoverage of up to 8\% is observed. The proposed ad-hoc method can be easily integrated into any deep neural network without interfering with the training process. The utility of the proposed approach is demonstrated through two applications: constructing simultaneous confidence bands for survival curves generated by deep neural networks for right-censored survival data, and constructing confidence intervals for classification probabilities in the context of binary classification. Code for the data analysis and the reported simulations is available on GitHub: \url{https://github.com/Asafba123/Survival_bootstrap}.
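For readers who want a concrete picture of the general idea, the minimal sketch below shows a generic non-parametric (percentile) bootstrap that refits a small neural network on resampled data to obtain point-wise confidence intervals. It is an illustration only, not the authors' exact procedure (which additionally disentangles data uncertainty from optimization noise); the toy data, network settings, and number of replicates are assumptions, and the authoritative implementation is in the linked repository.

```python
# Illustrative sketch: a generic percentile bootstrap for point-wise confidence
# intervals around a neural-network regression estimate. NOT the paper's exact
# algorithm; see https://github.com/Asafba123/Survival_bootstrap for that.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): y = sin(x) + noise.
n = 500
x = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(x).ravel() + 0.3 * rng.normal(size=n)
x_grid = np.linspace(-3, 3, 100).reshape(-1, 1)

def fit_and_predict(x_tr, y_tr, x_eval, seed):
    """Fit a small MLP on one (bootstrap) sample and predict on a grid."""
    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                       random_state=seed)
    net.fit(x_tr, y_tr)
    return net.predict(x_eval)

# Point estimate on the original data.
mu_hat = fit_and_predict(x, y, x_grid, seed=0)

# Non-parametric bootstrap: resample (x, y) pairs with replacement,
# refit the network, and collect predictions on the evaluation grid.
B = 50  # number of bootstrap replicates (kept small for illustration)
boot_preds = np.empty((B, x_grid.shape[0]))
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot_preds[b] = fit_and_predict(x[idx], y[idx], x_grid, seed=b)

# 95% point-wise percentile intervals.
lower = np.percentile(boot_preds, 2.5, axis=0)
upper = np.percentile(boot_preds, 97.5, axis=0)
print(mu_hat[:5], lower[:5], upper[:5])
```

Note that this naive refitting bootstrap mixes sampling variability with the randomness of the optimizer (initialization, stochastic gradients), which is precisely the issue the proposed method is designed to address.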
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Dear Prof. Andreas Kirsch,
We sincerely apologize for not adequately addressing your requests. We hope you will find that the current version incorporates all of the required changes effectively; we remain open and willing to make any additional adjustments if necessary.
Please find attached a revised version of the paper, which includes the following changes:
1) Abstract:
The current wording is "In this work, we provide a non-parametric bootstrap method that disentangles data uncertainty from the noise inherent in the adopted optimization algorithm. The validity of the proposed approach is demonstrated through an extensive simulation study, which shows that the method is accurate (i.e., valid and not overly conservative) as long as the network is sufficiently deep to ensure that the estimators provided by the deep neural network exhibit minimal bias. Otherwise, undercoverage of up to 8% is observed."
2) Section 1.3 Contributions:
Instead of "Unlike existing approaches, our approach offers (1) valid point-wise confidence intervals that are not overly conservatives, as long as the estimators provided by the DNN have only a small bias, (2) ... "
The current text is: "Unlike existing methods, our approach provides: (1) Valid point-wise confidence intervals that are not overly conservative, provided the estimators generated by the DNN exhibit minimal bias. Otherwise, undercoverage may occur. (2) ... "
3) Section 6 Theoretical Aspects:
Instead of "The following discussion is divided into two parts: (i) A justification of the proposed bootstrap method for estimating the uncertainty of $\widehat{\mu}_n$ ..."
The current text is "The following discussion is divided into two parts: (i) The motivation of the proposed bootstrap method for estimating the uncertainty of $\widehat{\mu}_n$ ..."
4) Section 7 Concluding Remarks:
4.A)
Instead of "Our comprehensive simulation study indicates that the proposed method is valid and not overly conservative, as long as the estimators provided by the DNN have only a small bias. Often, ..."
The current text is "Our comprehensive simulation study indicates that the proposed method is valid and not overly conservative, as long as the estimators provided by the DNN have only a small bias. Otherwise, undercoverage of up to 8\% is observed. Often, ..."
4.B)
Instead of "Future work should focus on enhancing the proposed approach to reduce computational burden and conducting a rigorous theoretical analysis to verify the asymptotic accuracy of the coverage probability under various conditions."
The current text is: "Future research should aim to refine the proposed approach to reduce its computational complexity and perform a comprehensive theoretical analysis to rigorously establish the asymptotic accuracy of the coverage probability across diverse conditions."
Code: https://github.com/Asafba123/Survival_bootstrap
Assigned Action Editor: ~Andreas_Kirsch1
Submission Number: 2914