A Stochastic Polynomial Expansion for Uncertainty Propagation through Networks

TMLR Paper4241 Authors

18 Feb 2025 (modified: 20 Feb 2025). Under review for TMLR. License: CC BY 4.0
Abstract: Network-based machine learning constructs are becoming more prevalent in sensing and decision-making systems. As these systems are deployed in safety-critical environments such as pedestrian detection and power management, it is crucial to evaluate confidence in their decisions. At the heart of this problem is the need to understand and characterize how errors at the input of a network are progressively expanded or contracted as signals move through its layers, especially given the non-trivial nonlinearities present throughout modern machine learning architectures. When sampling methods become expensive due to network size or complexity, approximation is needed; popular methods include Jacobian (first-order Taylor) linearization and stochastic linearization. Despite their computational tractability, however, the accuracy of these methods can break down under moderate to high input uncertainty. Here, we present a generalized method for propagating variational multivariate Gaussian distributions through neural networks. We propose a modified Taylor expansion for the nonlinear transformation of Gaussian distributions, with an additional approximation in which the polynomial terms act on independent and identically distributed Gaussian random variables. With these approximated higher-order terms (HOTs), we obtain significantly more accurate estimates of layer-wise distributions. Despite the inclusion of the HOTs, the method propagates a full covariance matrix with complexity $\boldsymbol{O}(n^2)$ (and $\boldsymbol{O}(n)$ if only marginal variances are propagated), comparable to Jacobian linearization; our method thus strikes a balance between efficiency and accuracy. We derive closed-form solutions of this approximate stochastic Taylor expansion for seven commonly used nonlinearities and verify the effectiveness of our method on deep residual neural networks, Bayesian neural networks, and variational autoencoders. The method is general and can be integrated into use cases such as Kalman filtering, adversarial training, and variational learning.
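To make the moment-propagation idea concrete, the sketch below (our illustration, not the authors' code) applies standard second-order Taylor moment matching to $y = \tanh(x)$ with Gaussian input and checks it against Monte Carlo sampling. This is the baseline that the paper's stochastic expansion generalizes; the paper's method instead lets the higher-order terms act on independent, identically distributed Gaussian variables, so its correction terms differ from the ones shown here.

```python
import numpy as np

# Minimal sketch, assuming a scalar input x ~ N(mu, var): standard
# second-order Taylor moment matching for y = tanh(x). NOT the paper's
# exact stochastic expansion -- just the classical baseline it builds on.

def tanh_moments_taylor2(mu, var):
    """Approximate E[tanh(x)] and Var[tanh(x)] for x ~ N(mu, var)."""
    f = np.tanh(mu)
    f1 = 1.0 - f**2          # first derivative tanh'(mu)
    f2 = -2.0 * f * f1       # second derivative tanh''(mu)
    # E[y] ~= f(mu) + 0.5 * f''(mu) * var
    mean = f + 0.5 * f2 * var
    # Var[y] ~= f'(mu)^2 * var + 0.5 * f''(mu)^2 * var^2,
    # using Gaussian central moments E[(x-mu)^3] = 0, E[(x-mu)^4] = 3 var^2
    variance = f1**2 * var + 0.5 * f2**2 * var**2
    return mean, variance

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu, var = 0.5, 0.6**2
    samples = np.tanh(rng.normal(mu, np.sqrt(var), 1_000_000))
    print("Taylor-2   :", tanh_moments_taylor2(mu, var))
    print("Monte Carlo:", (samples.mean(), samples.var()))
```

The same moment-matching template is what the abstract's complexity claim refers to: retaining cross-terms yields full covariance propagation at $\boldsymbol{O}(n^2)$ per layer, while tracking only marginal variances (as above, per unit) costs $\boldsymbol{O}(n)$.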
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=2NHRnF859N
Changes Since Last Submission: Rewrote the introduction with an expanded literature review. Reformulated equations for clarity. Refined the presentation of experimental results by replacing tables with figures. Added two experiments on real-world datasets. Added a dedicated section addressing the differences from cubature Kalman filtering.
Assigned Action Editor: ~Yingzhen_Li1
Submission Number: 4241