TOWARD RELIABLE NEURAL SPECIFICATIONS

Published: 01 Feb 2023, Last Modified: 13 Feb 2023
Submitted to ICLR 2023
Keywords: formal verification, specification, neural network verification, trustworthy AI, interpretability
TL;DR: We propose a new family of specifications based on neural activation patterns and evaluate its effectiveness through both statistical analysis and formal verification.
Abstract: Having reliable specifications is an unavoidable challenge in achieving verifiable correctness, robustness, and interpretability of AI systems. Existing specifications for neural networks follow the paradigm of “data as specification”: the local neighborhood centered on a reference input is considered correct (or robust). However, our empirical study shows that such specifications fail to certify any test data points, making them impractical for real-world applications. We propose a new family of specifications called “neural representation as specification”, which uses the intrinsic information of neural networks — neural activation patterns (NAPs) — rather than input data to specify the correctness and/or robustness of neural network predictions. We present a simple statistical approach to extracting dominant neural activation patterns. Analyzing NAPs from a statistical point of view, we find that a single NAP can cover a large number of training and testing data points, whereas an ad hoc data-as-specification covers only a single training data point and often zero testing data points. To show the effectiveness of the discovered NAPs, we formally verify several important properties, for example, that a particular type of misclassification never happens for a given NAP, and that there is no ambiguity between different NAPs. We show that by using NAPs we can verify predictions over the entire input space while still recalling 84% of the data. We therefore argue that NAPs provide a more reliable and extensible specification for neural network verification.
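To make the extraction step concrete, the sketch below shows one plausible way dominant NAPs could be computed for a fully connected ReLU classifier: record each hidden neuron's on/off state over a class's training inputs, then keep only the states that hold with high frequency. The frequency threshold delta, the function names, and the network representation are illustrative assumptions, not the paper's released implementation.

```python
import numpy as np

def hidden_states(weights, biases, x):
    """Forward pass through a fully connected ReLU classifier;
    returns the binary on/off state of every hidden neuron."""
    h = x
    states = []
    for W, b in zip(weights[:-1], biases[:-1]):  # last layer = logits, no ReLU
        h = np.maximum(W @ h + b, 0.0)
        states.append(h > 0.0)                   # True = neuron is activated
    return np.concatenate(states)

def extract_nap(weights, biases, class_inputs, delta=0.95):
    """Statistically extract a dominant NAP for one class: a neuron is
    constrained to 'on' (1) or 'off' (0) if its state agrees on at least
    a fraction delta of the class's inputs; otherwise it stays
    unconstrained (-1)."""
    S = np.stack([hidden_states(weights, biases, x) for x in class_inputs])
    on_freq = S.mean(axis=0)                     # per-neuron activation frequency
    nap = np.full(S.shape[1], -1, dtype=int)     # -1 = unconstrained
    nap[on_freq >= delta] = 1                    # dominantly activated
    nap[on_freq <= 1.0 - delta] = 0              # dominantly deactivated
    return nap
```

Under this reading, verifying a property such as “this misclassification never happens for a given NAP” amounts to asking a neural network verifier whether any input whose hidden states satisfy the NAP's on/off constraints can produce the offending prediction.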
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)