BaSIS-Net: From Point Estimate to Predictive Distribution in Neural Networks - A Bayesian Sequential Importance Sampling Framework

Published: 02 Jul 2024, Last Modified: 10 Jul 2024. Accepted by TMLR.
Abstract: Data-driven Deep Learning (DL) models have revolutionized autonomous systems, but ensuring their safety and reliability necessitates the assessment of predictive confidence or uncertainty. Bayesian DL provides a principled approach to quantify uncertainty via probability density functions defined over model parameters. However, the exact solution is intractable for most DL models, and the approximation methods, often based on heuristics, suffer from scalability issues and stringent distribution assumptions and may lack theoretical guarantees. This work develops a Sequential Importance Sampling framework that approximates the posterior probability density function through weighted samples (or particles), which can be used to find the mean, variance, or higher-order moments of the posterior distribution. We demonstrate that propagating particles, which capture information about the higher-order moments, through the layers of the DL model results in increased robustness to natural and malicious noise (adversarial attacks). The variance computed from these particles effectively quantifies the model’s decision uncertainty, demonstrating well-calibrated and accurate predictive confidence.
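As an illustration of how weighted particles yield the posterior moments mentioned in the abstract, here is a minimal NumPy sketch. It is not the BaSIS-Net implementation: the function names (`normalize_log_weights`, `weighted_moments`, `effective_sample_size`) are hypothetical, and the code assumes the particles and their unnormalized log-weights have already been produced by some importance sampling step.

```python
import numpy as np


def normalize_log_weights(log_w):
    """Turn unnormalized log-weights into weights that sum to 1.

    Subtracting the maximum before exponentiating avoids overflow.
    """
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - np.max(log_w))
    return w / np.sum(w)


def weighted_moments(particles, weights):
    """Weighted mean and variance of particles with shape (N, ...).

    `weights` has shape (N,) and is assumed to be normalized.
    """
    particles = np.asarray(particles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Contract the particle axis against the weights.
    mean = np.tensordot(weights, particles, axes=1)
    var = np.tensordot(weights, (particles - mean) ** 2, axes=1)
    return mean, var


def effective_sample_size(weights):
    """Standard diagnostic for weight degeneracy: 1 / sum(w_i^2)."""
    weights = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(weights ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical example: 100 particles of a 3-dimensional quantity.
    particles = rng.normal(size=(100, 3))
    log_w = rng.normal(size=100)
    w = normalize_log_weights(log_w)
    mean, var = weighted_moments(particles, w)
    print(mean, var, effective_sample_size(w))
```

Higher-order moments follow the same pattern, replacing the squared deviation with a higher power. The effective sample size is included only as a common sanity check on the weights; the source does not say whether the paper uses it.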
Submission Length: Long submission (more than 12 pages of main content)
Supplementary Material: pdf
Changes Since Last Submission: We added the following to the discussion of the selection of \sigma_{\eta}: “... Additionally, we investigated the robustness and uncertainty calibration of the model with varying \sigma_{\eta}. Figure 11 shows test accuracy and predictive variance under Gaussian noise applied to MNIST test data, with several BaSIS-Net models using different \sigma_{\eta} values.” In the Appendix of the revised manuscript, we have included a new figure (Fig. 11) and the following discussion and results: “We evaluated the impact of the transition noise parameter \sigma_{\eta} on the robustness of BaSIS-Net to noisy conditions and assessed the predictive uncertainty of the resulting models. Our findings indicate that smaller \sigma_{\eta} values are associated with higher accuracy on clean noiseless data. However, at higher noise levels, models with larger \sigma_{\eta} exhibit more robust performance. Additionally, for low \sigma_{\eta} values, the uncertainty estimates are not as informative as in models trained with larger values. In Fig. 11, we illustrate how test accuracy and variance information vary with different \sigma_{\eta} selections under noisy conditions applied to the MNIST test data. We observe that models trained with a small \sigma_{\eta} are less robust at low SNR, and their variance is unable to distinguish between correctly and incorrectly classified inputs. This undermines one of the key features of our BaSIS-Net framework, which is the ability to provide meaningful uncertainty estimates that differentiate between correct and incorrect classifications.”
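The noise experiment described above can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' evaluation code: the SNR convention (signal power over noise power, in dB, computed over the whole input array) is assumed because the page does not define it, and the helper names `add_gaussian_noise` and `variance_by_correctness` are hypothetical.

```python
import numpy as np


def add_gaussian_noise(x, snr_db, rng):
    """Add zero-mean Gaussian noise to x at a target SNR in dB.

    Assumed convention: SNR = 10 * log10(signal_power / noise_power),
    with signal power taken as the mean squared value of x.
    """
    x = np.asarray(x, dtype=float)
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=x.shape)
    return x + noise


def variance_by_correctness(pred_var, y_pred, y_true):
    """Mean predictive variance over correct and incorrect predictions.

    Returns (mean_var_correct, mean_var_incorrect); a group with no
    members yields NaN.
    """
    pred_var = np.asarray(pred_var, dtype=float)
    correct = np.asarray(y_pred) == np.asarray(y_true)
    mean_correct = pred_var[correct].mean() if correct.any() else float("nan")
    mean_wrong = pred_var[~correct].mean() if (~correct).any() else float("nan")
    return mean_correct, mean_wrong
```

Under this sketch, a model whose variance is informative would show a clearly larger mean variance for the incorrect group than for the correct group as the SNR decreases, which is the behavior the authors report for larger \sigma_{\eta}.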
Assigned Action Editor: ~Simon_Lacoste-Julien1
Submission Number: 1742